During S. pombe S-phase, initiation of DNA replication occurs at multiple sites (origins) that are enriched with AT-rich sequences, at various times. Current studies of genome-wide DNA replication profiles have focused on DNA replication timing and origin location. However, the replication and/or firing efficiency of the individual origins on the genomic scale remains unclear.
Using the genome-wide ORF-specific DNA microarray analysis, we show that in S. pombe, individual origins fire with varying efficiencies and at different times during S-phase. The increase in DNA copy number plotted as a function of time is approximated to the near-sigmoidal model, when considering the replication start and end timings at individual loci in cells released from HU-arrest. Replication efficiencies differ from origin to origin, depending on the origin's firing efficiency. We have found that DNA replication is inefficient early in S-phase, due to inefficient firing at origins. Efficient replication occurs later, attributed to efficient but late-firing origins. Furthermore, profiles of replication timing in cds1Δ cells are abnormal, due to the failure in resuming replication at the collapsed forks. The majority of the inefficient origins, but not the efficient ones, are found to fire in cds1Δ cells after HU removal, owing to the firing at the remaining unused (inefficient) origins during HU treatment.
Taken together, our results indicate that efficient DNA replication/firing occurs late in S-phase progression in cells after HU removal, due to efficient late-firing origins. Additionally, checkpoint kinase Cds1p is required for maintaining the efficient replication/firing late in S-phase. We further propose that efficient late-firing origins are essential for ensuring completion of DNA duplication by the end of S-phase.
Citation: Eshaghi M, Karuturi RKM, Li J, Chu Z, Liu ET, Liu J (2007) Global Profiling of DNA Replication Timing and Efficiency Reveals that Efficient Replication/Firing Occurs Late during S-Phase in S. pombe. PLoS ONE 2(8): e722. https://doi.org/10.1371/journal.pone.0000722
Academic Editor: Nick Rhind, University of Massachusetts Medical School, United States of America
Received: February 13, 2007; Accepted: July 9, 2007; Published: August 8, 2007
Copyright: © 2007 Eshaghi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by the Agency for Science, Technology and Research (A-STAR), Singapore.
Competing interests: The authors have declared that no competing interests exist.
DNA replication is a key event in the cell cycle, occurring within a confined period termed S-phase. Replication initiates at various times at multiple sites (origins) in eukaryotic genomes. Microarray analysis of enriched heavy:light nascent DNA in Saccharomyces cerevisiae has revealed a genomic view of DNA replication timing profiles: some regions of the genome are always replicated early in S-phase, some in the middle, and others at the end, due to a strict timing of (efficient) firing at origins. Similar profiles of DNA replication timing have been generated using microarrays that monitor DNA copy number increase without enrichment of nascent DNA in S. cerevisiae and Drosophila melanogaster. Studies in human HeLa cells have shown that more than 60% of the genome is slowly replicated throughout the S-phase, owing to a flexible timing of firing or inefficient firing at origins. It has been shown that DNA replication/firing in S. pombe is less efficient than that in S. cerevisiae. Although inefficient origins are not readily detected by classic 2D-gel electrophoresis, a molecular combing technique that analyzes DNA replication/firing at single origins has been able to demonstrate the existence of infrequent firing events at origins in S. pombe.
In S. pombe, replication checkpoint controls mediated via the ATM-related kinase Rad3p (Mec1p in S. cerevisiae) and the checking DNA synthesis kinase Cds1p (Rad53p in S. cerevisiae) play a pivotal role in restoring stalled replication forks and in the regulation of replication timing. Hydroxyurea (HU) is a commonly used compound that inhibits ribonucleotide reductase (RNR), leading to a depletion of the intracellular pool of dNTPs that induces DNA replication blocks. Patches (∼5 Kb in size) of single-stranded DNA (ssDNA) are found to accumulate at sites of origins upon replication blocks. The replication checkpoint appears to prevent the accumulation of ssDNA patches at some origins, likely the late-firing ones, consistent with its role in regulation of replication timing. Genome-wide profiling of enriched ssDNA induced by HU treatment has revealed several hundreds of potential origins of replication in both S. cerevisiae and S. pombe.
The genome-wide profiling of replication has been successfully applied to identify early-firing origins in the genome, although it has limitations in predicting the late-firing origins that are closely located to early-firing origins and especially the inefficient late-firing origins. The correlation between the fold-enrichment of HU-induced ssDNA and replication efficiency has been noted for some of the origins. Estimation of firing efficiency by the fold-enrichment of ssDNA has been further extended to the whole genome, although a small number of late-firing origins have been found to exhibit a high magnitude of ssDNA formation. Therefore, there remains a need to develop a method to directly determine the firing efficiencies of origins of DNA replication on a genomic scale.
In this study, we have used S. pombe genome-wide ORF-specific DNA microarrays to monitor DNA copy number increase in cells released from HU-block. A near-sigmoid model is applied to determine the rate of DNA copy number-increase as a function of time at individual loci across the genome. This is the first study to assess both replication timing and efficiency at the genomic scale. We show that the rate of DNA copy number-increase in cells released after HU-arrest is generally slow in early S-phase due to inefficient early firing at origins. Efficient replication appears to occur late in S-phase. Furthermore, we show that inefficient origins, but not efficient ones, are more likely to fire in cds1Δ cells, attributable to the origins that remain unused (inefficient) after HU removal.
To simplify, the DNA replication process is broken down into three steps: start (i.e., initiation of replication by firing and/or passive replication at individual loci), progression, and end (i.e., completion of replication). We define replication start timing (i.e., firing timing at origin loci or initial passive replication timing at non-origin loci), average replication timing (or half-completion timing), and (full-) completion timing as T0, T50, and T100, respectively. The time period from the start to the end is defined as the duplication time ΔT ( = T100-T0). As one and only one copy of DNA at all loci would be synthesized during the S-phase, ηr · ΔT = 1, where ηr is the (average) replication efficiency. The replication efficiency ηr of the loci at or near the origin sites is the maximal estimate of the firing efficiency ηf of the origins.
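The definitions above can be written out as a short calculation. The following is a minimal sketch: the function name `replication_summary` and the example timings are illustrative assumptions, and T50 is taken as the midpoint of T0 and T100, as stated later in the text.

```python
def replication_summary(t0, t100):
    """Return (T50, delta_T, eta_r) for one locus.

    t0 and t100 are the replication start and completion times in minutes.
    """
    if t100 <= t0:
        raise ValueError("T100 must be later than T0")
    delta_t = t100 - t0        # duplication time, delta_T = T100 - T0
    t50 = (t0 + t100) / 2.0    # average (half-completion) timing
    eta_r = 1.0 / delta_t      # from eta_r * delta_T = 1, in min^-1
    return t50, delta_t, eta_r


# Hypothetical locus that starts replicating at 30 min and finishes at 50 min.
t50, delta_t, eta_r = replication_summary(30.0, 50.0)
print(t50, delta_t, eta_r)
```

A shorter duplication time gives a larger ηr, so a locus that starts late but finishes on time is necessarily an efficient one.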
Duplication Time Is Extended in rad3Δ and cds1Δ Cells Released after HU-block
Cells bearing a rad3Δ or cds1Δ allele are known to exhibit hypersensitivity to transient treatment with HU, consistent with their replication checkpoint function. To determine whether DNA duplication could resume in rad3Δ or cds1Δ cells after transient HU treatment, we treated log-phase growing cells for 3 hr with HU at a final concentration of 8 mM. A typical 2C-DNA content profile was observed for all strains before HU treatment, owing to a short G1-phase (Figure 1). After 3 hr of HU treatment, almost all rad3Δ, cds1Δ, and wild type cells showed a 1C-DNA content profile, indicating an early S-phase arrest.
DNA content profiles by FACS analysis in HU-challenged wild type (A), rad3Δ (B), and cds1Δ (C) cells. 1C and 2C indicate cells containing 1 and 2 copies of genome, respectively. Minus and plus times indicate cells before and after HU-release, respectively. Regression analysis of DNA content increase in wild type (D) and cds1Δ (E) cells. X- and Y-axis indicate the time after HU-removal and the percent of DNA content (i.e., median) increase, respectively.
DNA synthesis gradually resumed in wild type cells released after HU block, as judged by the increase in DNA content determined by FACS analysis (Figure 1A). Approximately 20% of the genome was synthesized in the first 30 min of the S-phase and the remaining 80% was completed in the second 30 min (Figure 1A and 1D), close to the time required for genome duplication in cells with undisturbed S-phase. The time taken for genome duplication was ∼60 min in wild type cells released after HU-arrest (Figure 1A). In rad3Δ or cds1Δ cells, on the other hand, less than 20% of the genome was duplicated in the first 60 min after HU removal (Figure 1B, 1C, and 1E). Significantly, the time taken for the entire genome duplication was approximately 4 h in both rad3Δ and cds1Δ cells released after HU block. Given that HU-induced stalled forks collapse and fail to resume DNA replication in rad3Δ and cds1Δ cells after HU-release, this result suggests that post-HU initiation of DNA replication was likely to occur at unused origins.
It is worth noting that ∼20% of cells in the rad3Δ strain, but not cds1Δ, exhibited a cut phenotype (cells with premature assembly of division septa) at ∼2 hr after release from HU treatment (data not shown). Consistent with this, an additional population of cells containing incomplete genomes (less than 1C-DNA contents) appeared after HU removal in the rad3Δ strain (Figure 1B, see asterisk). This heterogeneity of DNA content impeded DNA replication profiling in rad3Δ cells. Therefore, genome-wide replication profiling of wild type and cds1Δ cells released after HU-block was performed in this study, but profiling of rad3Δ cells was omitted.
Genome-wide Profiling of DNA Replication Timing and Efficiency
To investigate genome-wide DNA replication timing and efficiency, we applied the S. pombe genome-wide ORF-specific microarray  to determine the increase in DNA copy numbers at individual loci in cells released after HU block. The microarray has an average resolution of one locus/∼2.4 Kb. Each locus (or ORF) was represented by two different 50-mer oligonucleotides whose average ratio was used for profiling. Cell samples were taken at 5-min intervals after HU removal for a period of 60 min. Genomic DNA extracted from cell samples was labeled with cyanine-dye Cy5 and subjected to DNA microarray hybridization. To reduce the dye-bias in the dual-color microarray analysis, we applied a common reference of genomic DNA (Cy3-labeled) extracted from cell samples at the 0-min time point. After the 0-min normalization, profiles would not be affected by the type of the common references used (e.g., G1, G2 or asynchronous cells). To ensure reproducibility, two independent time-course experiments were carried out.
Upon HU treatment for up to 3 hr, very little DNA synthesis was detected in checkpoint-proficient cells. Assuming that a replication origin could fire at ∼30% efficiency and accumulate DNA patches of ∼5 Kb in length, a cell containing ∼500 origins would have synthesized ∼750 Kb ( = 30%×5×500) or ∼5% of the genome during the HU treatment. The vast majority (∼95%) of the remaining genome would be synthesized in cells released after HU block. Therefore, the tiny amount of HU-induced DNA deposits at origins is unlikely to distort the global profiles of replication timing and efficiency after the smoothing process (see Materials and Methods).
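The estimate above can be reproduced with a few lines of arithmetic. This is a sketch only: the three inputs are the values quoted in the text, whereas the genome size of ∼14,000 Kb is an assumption introduced here for illustration and is not stated in this section.

```python
# Values quoted in the text.
firing_efficiency = 0.30   # fraction of cells in which a given origin fires
patch_kb = 5.0             # nascent DNA per fired origin, in Kb
n_origins = 500            # approximate number of origins per genome

# Assumption (not from this section): approximate S. pombe genome size in Kb.
GENOME_KB = 14_000.0

synthesized_kb = firing_efficiency * patch_kb * n_origins
fraction = synthesized_kb / GENOME_KB
print(f"{synthesized_kb:.0f} Kb synthesized, {fraction:.1%} of the genome")
```

A somewhat smaller assumed genome size would raise the fraction slightly, but it stays in the range of a few percent.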
We applied the near-sigmoid model to fit DNA copy number increase (after 0-min normalization) as a function of time at individual loci for estimation of replication start timing T0 and end timing T100 (exemplified in Figure 2). To this end, the T0 and T100 at the vast majority of loci (>96%) were obtained based on the threshold of FDR (false discovery rate) less than 0.01%. Approximately 190 loci (∼4%) of the genome could not be fit to the near-sigmoid model (see Materials and Methods). After smoothing with a moving window of 3 loci, genome-wide profiles of the average replication timing T50 and the duplication time ΔT (or replication efficiency ηr) were revealed (Figures 2 and 3; Table S1).
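To make the fitting step concrete, the sketch below estimates T0 and T100 for a single locus and smooths a per-locus profile with a 3-locus window. It is an illustration under stated assumptions, not the procedure used in this study: the exact near-sigmoid function and the FDR calculation are described in Materials and Methods and are not reproduced here. A clamped linear ramp stands in for the near-sigmoid curve, and a brute-force grid search stands in for the regression; `ramp`, `fit_ramp`, and `smooth` are hypothetical names.

```python
def ramp(t, t0, t100):
    """Relative copy number: 1 before T0, 2 after T100, linear in between."""
    if t <= t0:
        return 1.0
    if t >= t100:
        return 2.0
    return 1.0 + (t - t0) / (t100 - t0)


def fit_ramp(times, copies, step=1.0):
    """Grid-search the pair T0 < T100 that minimises the squared error."""
    lo, hi = min(times), max(times)
    n_steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n_steps + 1)]
    best = None
    for i, t0 in enumerate(grid):
        for t100 in grid[i + 1:]:
            sse = sum((ramp(t, t0, t100) - c) ** 2
                      for t, c in zip(times, copies))
            if best is None or sse < best[0]:
                best = (sse, t0, t100)
    return best[1], best[2]


def smooth(values, window=3):
    """Moving average over neighbouring loci; the window shrinks at the ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Synthetic time course sampled every 5 min for 60 min (as in the text),
# generated from a hypothetical locus with T0 = 20 min and T100 = 45 min.
times = [5.0 * k for k in range(13)]
copies = [ramp(t, 20.0, 45.0) for t in times]
t0_hat, t100_hat = fit_ramp(times, copies)
print(t0_hat, t100_hat)
```

On real, noisy microarray ratios a smooth sigmoid fitted by nonlinear least squares would be a more natural choice; the grid search is used here only to keep the example free of external dependencies.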
(A) Modeling of the near-sigmoid regression for the increase in DNA copy number at indicated individual loci (a–e). (B) Profile assembly of the copy number changes at individual loci of the subchromosomal region. (C) Simplified form of replication profiles displaying average replication timing T50 and duplication time ΔT. (D) Conversion between duplication time ΔT and replication efficiency η. (E) Potential distribution of the T50 versus η at individual loci of the genome.
X- and Y-axis are times after HU removal and physical locations of individual loci of chromosome I (A), II (B), and III (C), respectively. Average replication timing T50 profiles are in thick black lines. Efficiency is indicated by T0 and T100 profiles at individual loci in grey lines. Approximately 39 origins that have been validated by 2D-gel electrophoresis are marked with vertical lines. Predicted origins and AT-rich islands are indicated in blue and purple dots on the top of the profiles, respectively. Horizontal orange and green lines at the bottom of the profiles indicate efficient and inefficient subchromosomal regions, respectively. Solid red circles and triangles indicate centromeres and telomeres. No adjacent telomeres are indicated in chromosome III due to the presence of multiple arrays of large rDNA sequences at the ends.
We defined loci whose replication efficiency ηr was greater than the median ηr-median (or duplication time ΔT was less than the median ΔTmedian) as efficient-replication loci or efficient loci (Figure 2C, loci b, c, and e). On the other hand, loci whose ηr was less than ηr-median were known as inefficient loci (Figure 2C, loci a and d). It is worth noting that inefficient loci would need to start replication early (through firing at origins and/or passive replication) in order to complete DNA duplication prior to the end of S-phase. Thus, the T50 at inefficient loci should have a limited range around half the length of the S-phase (Figure 2D and 2E). On the other hand, the T50 at efficient loci would have a wide range from early to late in S-phase (Figure 2E). A volcano-like shape illustrates the potential distribution of average-replication timings for all loci with various replication efficiencies (Figure 2E).
Average Replication Timing at Efficient Loci Appears to Be Late in S-phase Progression in Cells Released after HU-block
The profiles of average replication timing T50 and efficiency η (ΔT was also included in Figure 3) revealed that the T50 at all loci of the genome ranged from ∼25.8 to ∼52.5 min with a median of ∼39.2 min at various efficiencies (Figure 4A). The average replication timing T50 varied slightly in different chromosomes. The median of average replication timing T50-median was ∼39.2 and ∼40 min in chromosome I and II, respectively. On the other hand, the T50-median in chromosome III was ∼36.7 min, at least 2.5 min earlier than the other two chromosomes (p-value ∼ 2.2×10−16) (Figure 4A). This result is consistent with the observation by Heichinger et al.
Box plot indicates the non-outliers minimum (left end of the line), the lower quartile (left end of the rectangle), median (centre of the rectangle), the upper quartile (right end of the rectangle), and the non-outlier maximum (right end of the line). (A) The box-plots of T50 distribution of individual loci of the genome and individual chromosomes as indicated. The T50-median of the loci of the genome or individual chromosomes is presented in parentheses. (B) The box-plots of T50 distribution of efficient and inefficient loci. Efficient and inefficient loci are those whose η is greater and less than the η-median, respectively. The T50-median of efficient and inefficient loci is indicated in parentheses. (C) Numbers of loci in the binned ΔT series or η series. (D) Plot of efficiency η versus average replication timing T50 of individual loci. Only ∼0.3% (16 out of 4733) loci of the genome exhibited their efficiency η of greater than 1 (or duplicating the DNA in less than 10 min). The arrow indicates the asymmetric distribution of the T50. (E–G) The T50 distribution of loci and the number of loci in the binned η series in chromosome I (E), II (F), and III (G). It is displayed as described in (B) and (C). (H) The plot of T50 versus η or ΔT of efficient and inefficient regions. X- and Y-axis indicate T50 and η or ΔT, respectively. The efficient and inefficient regions are indicated by closed and open circles, respectively.
We next investigated replication efficiency ηr at various loci of the genome. We set the unit of the replication efficiency ηr as 0.1 min−1. Loci whose ηr was greater than the median ηr-median of ∼0.255 (0.1 min−1) were designated as efficient-replication loci or efficient loci. Conversely, inefficient loci were those whose ηr was less than ηr-median. Surprisingly, the median T50-median of the efficient loci (i.e., ∼41.7 min) was substantially later (or greater) than the T50-median of the inefficient loci (i.e., ∼36.7 min), indicating that efficient replication tends to occur late in S-phase (Figure 4B–D; Table 1). The late efficient-replication was a common feature in all three chromosomes (Figure 4E–G). It is probably not surprising that cells would need to synthesize DNA more efficiently in unreplicated loci or gaps late in S-phase. Otherwise a delay in S-phase would be inevitable.
To investigate regulation of replication in subchromosomal regions, we searched for regions (e.g., ∼20 consecutive loci or more) that were enriched in either efficient or inefficient loci (p-value<0.05) in the genome. To this end, 23 and 14 subchromosomal regions were found to be efficiently and inefficiently replicated, respectively (Figure 3, see orange and green bars). Inefficient replication regions (5 out of 14) were found to be over-represented in chromosome III (p-value<0.02) (Figure 3C), consistent with the observation that DNA replication is less efficient on chromosome III compared to the other two chromosomes. On the other hand, subtelomeric regions (3 out of 4) in chromosome I and II (chromosome III telomeres are separated from the arms by the multiple arrays of rDNA) were identified as efficient regions that replicated late (Figure 3), consistent with the report on late replication in subtelomeric regions. The average replication timings T50 of the efficient-replication regions appeared to be remarkably later than the T50 of the inefficient regions (p-value<10−15) (Figure 4H).
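The median-based classification and the comparison of T50 between the two classes can be sketched as follows. The locus names and all numerical values are hypothetical and chosen only to illustrate the procedure; `classify_by_median` is not a function from this study.

```python
from statistics import median


def classify_by_median(eta_by_locus):
    """Split loci into efficient (> median) and inefficient (< median)."""
    eta_median = median(eta_by_locus.values())
    efficient = sorted(k for k, v in eta_by_locus.items() if v > eta_median)
    inefficient = sorted(k for k, v in eta_by_locus.items() if v < eta_median)
    return eta_median, efficient, inefficient


# Hypothetical efficiencies (in units of 0.1 min^-1) and T50 values (min).
eta = {"a": 0.20, "b": 0.40, "c": 0.31, "d": 0.22, "e": 0.50}
t50 = {"a": 36.0, "b": 42.0, "c": 39.0, "d": 37.0, "e": 44.0}

eta_median, efficient, inefficient = classify_by_median(eta)
t50_efficient = median(t50[k] for k in efficient)
t50_inefficient = median(t50[k] for k in inefficient)
print(eta_median, efficient, inefficient, t50_efficient, t50_inefficient)
```

Loci whose efficiency equals the median fall into neither class, which matches the strict "greater than" and "less than" wording of the definition.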
Chromosomal Locations, Firing Timings and Efficiencies of Potential Replication Origins
Peaks identified from the profiles of average replication timing T50 were commonly used to predict the location of replication origins (or origin clusters). Timing of firing at origins is approximated by the average replication timing of the peak loci. It has been noted that this approach has limitations in identifying (efficient) late-firing origins closely located to the other (efficient) early-firing origins. As firing at origins is relatively inefficient in S. pombe, this approach would not only underestimate the number of efficient late-firing origins, but would also fail to identify most, if not all, inefficient late-firing origins. This is because inefficient late-firing origins are unlikely to be self-sufficient in replication of the origin DNA (see Materials and Methods). Nevertheless, this is still the most effective way to predict origins of replication at the genomic scale.
The PeakFinder software was applied to identify the position of peaks or potential origins based on the profiles of average replication timing. To this end, ∼516 peaks were identified that represent potential origins (or origin clusters) of replication in the genome (Figure 3, see blue dots on the top of the replication profiles). Of the 48 origins that have been tested by 2D-gel electrophoresis (see ref in Segurado), 39 (∼81%) were found to overlap with the peaks identified in this study with a window of 12-Kb distance (Table 2), and the remaining 9 origins were found to match with peaks a bit below the threshold, suggesting that the peaks of the T50 profiles derived from the T0 and T100 are good approximations of potential origins in the genome.
We estimated the average firing timings T50 and the (maximum) firing efficiencies ηf at the predicted origins using the average replication timing T50 and replication efficiency ηr of the peak loci (Figure 5A). Origins whose firing efficiency ηf was greater than the median ηf-median of 0.316 (0.1 min−1) were designated as efficient origins (the ηf-median is different from the ηr-median) (Table 1). On the other hand, origins whose ηf was less than the ηf-median were classified as inefficient origins. Notably, most inefficient origins fired near the middle of the S-phase (Figure 5B and 5C). It is intriguing that most efficient origins appeared to fire late in S-phase in cells released after HU block (Figure 5B and 5C), given that the method applied was known to favor early-firing origins.
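The window-based comparison of validated origins with predicted peaks can be illustrated with a short sketch. The positions below are hypothetical, all coordinates are assumed to lie on the same chromosome, and a match is taken to mean a distance of less than 12 Kb; `count_matches` is an illustrative name, not part of PeakFinder.

```python
def count_matches(known_kb, peaks_kb, window_kb=12.0):
    """Count known origins lying closer than window_kb to any predicted peak."""
    return sum(
        1 for pos in known_kb
        if any(abs(pos - peak) < window_kb for peak in peaks_kb)
    )


# Hypothetical positions (in Kb) on one chromosome.
known = [100.0, 250.0, 400.0, 900.0]
peaks = [95.0, 260.0, 500.0, 880.0]
matched = count_matches(known, peaks)
print(matched, len(known))

# Fraction reported in the text: 39 of 48 validated origins.
reported_fraction = 39 / 48
print(f"{reported_fraction:.0%}")
```

For a genome-wide comparison the same function would be applied chromosome by chromosome so that positions on different chromosomes are never compared.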
(A) The box-plots of T50 distribution of predicted origins of the genome and individual chromosomes as indicated. The T50-median of the origins of the genome or individual chromosomes is presented in parentheses. (B) The box-plots of T50 distribution of efficient and inefficient origins. Efficient and inefficient origins are those whose η is greater and less than the η-median, respectively. The T50-median of efficient and inefficient origins is indicated in parentheses. (C) Plot of firing efficiency η versus average firing timing T50 of individual origins. The arrow indicates the asymmetric distribution of the T50. (D) The Venn diagram of predicted peaks and loci containing HU-induced ssDNA deposit. (E) The over-representation of ssDNA in predicted origins of chromosome III. Numbers of the predicted origins/peaks and those overlapped with the loci containing ssDNA deposit are indicated. The asterisk indicates the significant enrichment of ssDNA in chromosome III.
The Rad3p-Cds1p mediated checkpoint is known to restore stalled forks during HU-induced DNA replication block. To validate that the majority of the stalled forks resumed DNA replication in cells released from a HU-block, we first determined the sites of DNA synthesis in cells treated with 8 mM HU for 3 h (equivalent to the cells at the 0-min time point). To achieve this, microarray analysis was performed using Cy5-labeled sample DNA prepared from HU-treated cells (i.e., 0-min samples) against Cy3-labeled reference DNA from G1- or G2-phase cells. Self-hybridizations of genomic DNA from G1- or G2-phase cells were used as control. In either case, four independent microarray hybridizations were carried out for reproducibility. Loci containing extensive stretches of newly synthesized DNA, resulting from firing at origins in HU-treated cells, would exhibit an obvious increase in DNA copy number. The SAM procedure was applied to identify loci with significant amounts of nascent DNA by comparing the ratios of copy number at individual loci in HU-treated experiments with the self-hybridization control experiments. As a result, ∼394 loci were found to have large amounts of newly synthesized DNA at the threshold of 10% FDR (Figure 5D). More than 80% of the loci containing nascent DNA after HU treatment overlapped with the predicted origins derived from the peaks of the T50 profiles with a window of 12-Kb distance, indicating that the majority of the stalled forks are rendered competent for DNA replication in checkpoint-proficient cells released from a HU block. The HU-induced nascent ssDNA stretches were found to be over-represented on chromosome III, consistent with the observation of early replication/firing in chromosome III (Figure 5E, see asterisk).
To investigate whether the peaks derived from the T50 (i.e., the average of T0 and T100) profiles were similar to the origins predicted by other studies, we compared the 516 peaks with the reported origin sets based on an in-silico study, a ChIP-chip analysis, an HU-induced ssDNA enrichment, and replication profiling in unperturbed S-phase (Table S2). The majority (∼80%) of the origins in the previous reports overlapped with the 516 peaks, indicating an adequate prediction of origins using the near-sigmoid approach (Table 2).
Late Efficient Replication Patterns Are Disrupted in HU-challenged cds1Δ Cells
To investigate the role of Cds1p-mediated checkpoint function in regulation of global DNA replication, we performed DNA replication profiling in cds1Δ cells released from a HU block. To this end, HU-challenged cds1Δ cells were sampled at 15-min intervals for a period of 4 h. Microarray analysis of DNA copy number-increases in cds1Δ cells was performed using an identical method for the wild type (see above). Genome-wide profiles of DNA replication timing and efficiency in cds1Δ cells are shown in Figure 6 (Table S3). It took ∼240 min to complete DNA replication in cds1Δ cells released from a HU block, 4 times longer than HU-challenged wild type cells. The profiles in cds1Δ cells were disrupted from those in wild type cells, indicating that Cds1p function is involved in maintaining global regulation of DNA replication.
The profiles are displayed as described in Figure 3.
The average replication timing T50 at various loci ranged from ∼70 min to ∼135 min with the median T50-median of 100 min according to the replication profiles in HU-challenged cds1Δ cells (Figure 7A). The global profiles were totally disrupted in cds1Δ cells. However, the early replication of chromosome III (i.e., T50-median = ∼97.5 min) compared to that of the other two chromosomes was not altered in cds1Δ cells (Figure 7A), suggesting that the early replication of chromosome III is independent of the Cds1p function. As DNA duplication in HU-challenged cds1Δ cells took ∼4 times longer than that of wild type cells, the unit of replication efficiency ηr was thus defined as 1/40 (or 0.025) min−1, four-fold smaller than the unit used for wild type cells. The relative replication efficiency at various loci in cds1Δ cells was categorized based on the median ηr-median ∼0.267 (0.025 min−1). The T50-median of efficient loci in cds1Δ cells was very similar to that of inefficient loci (Figure 7B–D), indicating that the late efficient-replication origins were most affected by the lack of Cds1p after release from a HU block (compare Figure 4B–D). This result suggests that the failure in resuming DNA replication at the forks that previously initiated DNA replication would extend the duration of S-phase. Alternatively, it may imply that the checkpoint function is not only required for restoring stalled forks, but also involved in ensuring efficient replication late in S-phase.
(A) The box-plots of T50 distribution of individual loci of the genome and individual chromosomes in cds1Δ cells as indicated. The T50-median of the loci of the genome or individual chromosomes is presented in parentheses. (B) The box-plots of T50 distribution of efficient and inefficient loci in cds1Δ cells. (C) Numbers of loci in the binned ΔT series or η series. (D) Plot of efficiency η versus average replication timing T50 of individual loci in cds1Δ cells. Less than ∼0.1% (3 out of 4787) loci of the genome exhibited their efficiency η of greater than 1 (or duplicating the DNA in less than 40 min). The arrow indicates the asymmetric distribution of the T50. (E) The plot of T50 versus η or ΔT of efficient and inefficient regions in cds1Δ cells. X- and Y-axis indicate T50 and η or ΔT, respectively. The efficient and inefficient regions are indicated by closed and open circles, respectively.
We subsequently searched for subchromosomal regions that were clustered for efficient and inefficient replication loci in cds1Δ cells using the same approach as in wild type. We identified 11 efficient and 10 inefficient regions. Inefficient replication regions were over-represented on chromosome III, similar to wild type cells. The average replication timing, T50, of efficient regions was comparable to that of inefficient regions in cds1Δ cells (Figure 7E). The late-efficient replication pattern in wild type was clearly disrupted in cds1Δ cells, indicating the involvement of Cds1p function in maintaining efficient replication late in S-phase.
Checkpoint Function Is More Apparent on Efficient Origins than Inefficient Ones in cds1Δ cells Released From a HU-block
Chromosomal locations of origins were approximated by the peaks identified using the PeakFinder software based on the profiles of the average replication timing T50 in cds1Δ cells. A total of 598 peaks were identified from the profiles in HU-challenged cds1Δ cells: 269, 226, and 103 were found in chromosome I, II, and III, respectively (Figure 8A–8C). The number of origins identified in cds1Δ profiles was ∼16% more than that found in wild type profiles (598 versus 516).
(A) The box-plots of T50 distribution of predicted origins of the genome and individual chromosomes in cds1Δ cells as indicated. The T50-median of the origins of the genome or individual chromosomes is presented in parentheses. (B) The box-plots of T50 distribution of efficient and inefficient origins in cds1Δ cells. The T50-median of efficient and inefficient origins is indicated in parentheses. (C) Plot of firing efficiency η versus average firing timing T50 of individual origins in cds1Δ cells. The arrow indicates the asymmetric distribution of the T50. (D) Venn diagram of peaks predicted in wild type and cds1Δ cells. Overlapping peaks are those whose distance is less than 12 Kb. (E) The T50 distribution of the wild type-unique and overlapping origins in wild type cells. The asterisk indicates the significant increase in firing efficiency of the wild type-unique origins. (F) The T50 distribution of the cds1-unique and overlapping origins in cds1Δ cells.
It has been shown that the Cds1p-mediated checkpoint is required for restoring stalled forks in cells treated with HU. Stalled forks would collapse in cds1Δ cells and fail to resume replication. Assuming that the same subset of origins has fired and given rise to nascent ssDNA stretches of ∼5 Kb in size in either wild type or cds1Δ cells during HU treatment, we would expect that the stalled forks close to the origins would be able to resume replication in wild type cells but not in cds1Δ cells, representing a subset of the wild type-unique origins.
To identify the wild type-unique origins, the chromosomal location of origins in wild type cells was compared to that in cds1Δ cells using the threshold of 12-Kb distance (see Materials and Methods). Surprisingly, fewer than 10% of origins (48 out of 516) were found to be wild type-unique (Figure 8D). The majority of the predicted origins were found to overlap between the wild type and cds1Δ cells with the window of 12-Kb distance (Figure 8D). It is very unlikely that the majority of origins would have been reserved (not fired) during HU treatment (for 3 h) and could fire in either strain after HU removal. Given that firing efficiency at origins in general is low in S. pombe, it is possible that those overlapping origins represent a subset of inefficient origins.
To test whether overlapping origins were inefficient, we performed comparison analysis of the firing efficiencies between the two origin-subsets: the wild type-unique and the overlapping origins. As expected, the firing efficiency of the wild type-unique origins appeared to be higher than that of overlapping ones (p-value ∼7.3×10−3), consistent with the notion that Cds1p checkpoint function is essential for restoring stalled forks induced by HU treatment (Figure 8E). The result indicates that checkpoint function has a greater effect on efficient origins compared to inefficient ones. A small subset of cds1-unique origins was found in cds1Δ cells (Figure 8F). These cds1-unique origins might represent the late inefficient origins which failed to be identified in wild type cells.
Genome-wide microarray analyses have been widely used to determine profiles of average DNA replication timing at the genomic scale. However, direct estimation of replication efficiency at various loci of the genome based on the genome-wide replication profiles has not been performed previously. In this study, we demonstrate that replication efficiency, together with average replication timing, can be estimated using a novel approach: near-sigmoid fitting of the increase in DNA copy number as a function of time at individual loci (see Materials and Methods). The near-sigmoid model approach permits estimation of the replication start timing T0 and the replication end timing T100 at various loci of the genome. Based on the T0 and T100 at each locus, we obtain the genome-wide profiles of average replication time T50 and replication efficiency η.
Genome-wide profiles of average replication timing T50 and replication efficiency η were determined in wild type and cds1Δ cells released from a HU block (Figure 3 and Figure 6). DNA duplication takes ∼60 min in wild type cells and ∼240 min in cds1Δ cells after HU removal (Figure 1). The time taken to complete duplication at the majority of loci ranges from ∼15.8 min to ∼53.8 min, with a median of 39.2 min, in wild type cells and from ∼37.5 min to ∼157.5 min, with a median of 150 min, in cds1Δ cells. The relative rate of DNA copy number increase is used to define replication efficiency, expressed per 10 min in wild type cells and per 40 min in cds1Δ cells. The median replication efficiencies ηr-median of 0.255/10-min and 0.267/40-min are used to categorize loci as efficient or inefficient in wild type and cds1Δ cells, respectively (Table 1). Significantly, efficient loci (η>0.255/10-min) in wild type cells tend to be replicated late, while inefficient ones (η<0.255/10-min) tend to be replicated early in S-phase (Figure 4). However, efficient loci (η>0.267/40-min) in cds1Δ cells do not show late replication when compared to inefficient loci (η<0.267/40-min) (Figure 7). It is not clear whether the disruption of the late-efficient replication in cds1Δ cells is solely due to the failure to resume replication from the collapsed forks. Alternatively, Cds1p checkpoint kinase may play a role in the regulation of late-efficient replication in cells after HU removal.
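The median values quoted above are numerically consistent with η being the reciprocal of the duplication time, scaled to the stated time unit. The short check below illustrates this under that assumption; the function name is hypothetical and the relation η = unit/ΔT is inferred from the quoted numbers rather than taken from a stated formula.

```python
def replication_efficiency(duplication_time_min, unit_min):
    """Relative rate of copy-number increase per `unit_min` minutes,
    assuming eta = unit_min / delta_T (an inferred relation)."""
    if duplication_time_min <= 0:
        raise ValueError("duplication time must be positive")
    return unit_min / duplication_time_min


# Median duplication times quoted in the text.
wt_median_eta = replication_efficiency(39.2, 10)     # per 10 min, wild type
cds1_median_eta = replication_efficiency(150.0, 40)  # per 40 min, cds1 deletion

print(round(wt_median_eta, 3), round(cds1_median_eta, 3))
```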
It is intriguing to find that efficient replication occurs late in S-phase. To ensure completion of DNA duplication by the end of S-phase in an inefficient replication system such as S. pombe, it is necessary to increase the efficiency of origins that fire late in S-phase. Disruption of the late efficient replication would lead to a delay in completion of duplication or S-phase. It is most likely that the extended S-phase in cds1Δ cells after HU removal is partly attributable to the disruption of the late efficient replication.
To assess whether the HU-induced newly-replicated DNA would distort the profiles of replication timing and efficiency in cells released from a HU block, we performed microarray analysis to compare DNA extracted from HU-treated cells and G1 or G2 cells. Assuming that ∼500 origins fire at an average efficiency of ∼30% to produce newly-replicated DNA stretches of ∼5 Kb in size, ∼5% of the genome would have been synthesized after HU treatment for 3 h. Approximately 394 loci are found to have newly-replicated DNA, of which 318 (80%) co-localize to the peaks of the T50 profiles within a window of 12-Kb distance. This indicates that the replication/resumption profiles generated based on the common reference of 0-min samples are a fair representation of the actual replication profiles.
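The ∼5% estimate can be reproduced with a short back-of-the-envelope calculation. The genome size of roughly 13.8 Mb is an assumption introduced here for illustration and is not stated in this section; the other inputs are the approximate figures quoted above.

```python
# Approximate inputs quoted in the text.
n_origins = 500          # number of origins
firing_efficiency = 0.30  # average firing efficiency
stretch_kb = 5.0          # newly replicated DNA per fired origin (Kb)

# Assumption for illustration only: S. pombe genome size of about 13.8 Mb.
genome_kb = 13_800.0

newly_replicated_kb = n_origins * firing_efficiency * stretch_kb
fraction_of_genome = newly_replicated_kb / genome_kb

print(f"{newly_replicated_kb:.0f} Kb, {fraction_of_genome:.1%} of the genome")
```

With these inputs about 750 Kb would be synthesized during the HU block, which is on the order of 5% of the genome.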
Peaks identified from the global profiles of DNA replication timing T50 have been commonly used to estimate origins of replication. We have adopted this approach to identify peaks from the T50 profiles in wild type and cds1Δ cells. A total of 516 and 598 peaks were identified in wild type and cds1Δ cells, respectively. Origins that initiated DNA replication prior to HU treatment could resume replication in wild type cells after HU removal. These origins would presumably be readily identifiable. However, forks resulting from origins that initiated DNA replication prior to HU treatment would fail to resume replication in cds1Δ cells due to collapse and are therefore unlikely to be identified. To investigate whether we identified different sets of origins in wild type and cds1Δ cells, we compared the actual locations of peaks in wild type and cds1Δ cells. Significantly, the majority of peaks overlapped within a window of 12-Kb distance between the two sets of origins (Figure 8). This may not be surprising in S. pombe, in which firing is inefficient and initiation of DNA replication from inefficient origins would not be effective during HU treatment. Therefore, after HU removal, the remaining unused origins could be competent to initiate DNA replication in both wild type and cds1Δ cells. This is further supported by the observation that origins found in wild type but not in cds1Δ cells are efficient origins (Figure 8E). These results are consistent with the notion that replication in S. pombe is generally inefficient.
Heichinger et al. recently reported the profiling of average DNA replication timing using cdc25-22 cells after temperature block-and-release. In that study, the firing efficiency at various origins was estimated based on the fold-enrichment of HU-induced ssDNA, and it was concluded that the early-firing origins are more efficient than the late-firing ones. The discrepancy with our results may be explained by the fact that the late-firing origins are inefficient in cells undergoing unperturbed S-phase. However, in checkpoint-activated cells, DNA replication from efficient origins is delayed.
We present a first-of-its-kind study that permits direct assessment of genome-wide replication/firing efficiency at individual loci/origins. In contrast to the well-defined, site-specific, and efficient (∼90%) origins in budding yeast, origins of replication in S. pombe appear to be located preferentially at A+T-rich regions in the genome, with an average firing efficiency of ∼30%. Approximately 50% of the loci have been shown to possess a low level (below the threshold) of HU-induced newly replicated DNA (Eshaghi and Liu, unpublished data), supporting the idea that about half of the genome (i.e., all intergenic sequences) has regions that could potentially act as sites for initiation of DNA replication. Our results support the stochastic model, in which random gaps are likely to be filled by the efficient late-firing origins (Figure 5).
Materials and Methods
Strains and Culture Manipulations
Strains YJL188 (leu1-32 ura4-D18 h−), YJL1687 (leu1-32 ura4-D18 cds1Δ::ura4+ h−), YJL1715 (leu1-32 ura4-D18 rad3Δ::ura4+ h−), and YES medium were used in this study.
To monitor DNA replication in hydroxyurea (HU) (Sigma-Aldrich Corp., St. Louis, MO)-challenged cells, ∼800 ml of S. pombe culture in YES supplemented with 8 mM HU was grown at 30°C in an orbital rotating shaker (New Brunswick Scientific Co., Inc., Edison, NJ) for 3 h. The HU-challenged cells were harvested by centrifugation (Thermo Scientific, Inc., Waltham, MA) at 5,000 rpm, 4°C for 1 min and subsequently washed twice with YES pre-chilled to 4°C. The cells were resuspended in the original volume of YES medium. Samples were taken at 5-min intervals for a period of 60 min. For DNA content analysis, cell samples (∼5 ml each) were spun down, resuspended in ice-cold 70% ethanol and stored at 4°C for subsequent analysis. For genomic DNA extraction and analysis of copy number, cell samples (∼40 ml each) were spun down, quickly chilled in liquid nitrogen and stored at −80°C for further analysis.
Fluorescence-activated Cell Sorting (FACS) Analysis
To analyze DNA content, cells were fixed in ice-cold 70% ethanol, spun down, and resuspended in 50 mM sodium citrate buffer pH 7.0. To remove RNA, the cell suspension was treated with RNase A (Sigma) at a final concentration of 100 µg/ml and left at room temperature overnight. Cells were subsequently washed with 50 mM sodium citrate buffer and stained with propidium iodide (Sigma) at a final concentration of 8 µg/ml. Fluorescence intensities of individual cells were measured by BD FACScan flow cytometry (BD Biosciences, Franklin Lakes, NJ).
Genomic DNA preparation
The frozen cell samples were thawed and washed with SE buffer (1.2 M sorbitol/0.1 M EDTA pH 8). The cells were subsequently resuspended in 1 ml SE buffer supplemented with 2 µg/ml Zymolyase 100T (ICN Biomedicals, Inc., Costa Mesa, CA) and incubated at 37°C for ∼15 min. The spheroplasts were centrifuged at 2,000 rpm for 5 min and the supernatant was decanted. The spheroplasts were resuspended in 1 ml TNE buffer (50 mM Tris-HCl/100 mM NaCl/50 mM EDTA pH 8) supplemented with 0.5% SDS and 1 µg/ml proteinase K (Roche, Basel, Switzerland) and incubated at 65°C for 30 min. To precipitate proteins, the cell lysates were first chilled on ice, mixed with 100 µl of pre-cooled 5 M potassium acetate, and incubated on ice for 30 min. Proteins were removed by centrifuging at 14,000 rpm for 15 min. The supernatant containing DNA was transferred into fresh tubes. The DNA was precipitated by the addition of an equal volume of isopropanol and incubated at room temperature for 30 min. DNA was centrifuged and washed with cold 70% ethanol. The DNA pellet was subsequently resuspended in TE buffer supplemented with 0.1 µg/ml RNase A (Roche) and incubated at room temperature for 1 h. DNA was extracted with phenol/chloroform/isoamyl alcohol (25:24:1) (Fluka Chemical Corp, Milwaukee, WI) and precipitated with ethanol or isopropanol. The precipitated DNA was resuspended in TE buffer and fragmented by sonication (Branson, Danbury, CT) at 20% strength using three 20-sec bursts.
Microarray Manufacture and Hybridization
To manufacture ORF-specific DNA microarrays, approximately 10,000 oligonucleotides (50-mer in length) that represented 4,929 ORFs (each ORF is represented by two oligomers) in the S. pombe genome were synthesized (Proligo, Hamburg, Germany). The oligonucleotides were resuspended in 0.3× SSC buffer and spotted onto poly-lysine-coated glass slides using an arrayer (GeneMachines, San Carlos, CA). Spotted glass slides were processed according to DeRisi's protocol (http://derisilab.ucsf.edu).
Sheared DNA was labeled using a random-priming protocol with amino-allyl-dUTP (aa-dUTP) using the BioPrime DNA Labeling kit (Invitrogen Corporation, Carlsbad, CA) according to the manufacturer's instructions. Labeled DNA was extracted with phenol/chloroform/isoamyl alcohol and precipitated with ethanol or isopropanol. The labeled DNA was then coupled with the cyanine dyes Cy5 (Amersham, Buckinghamshire, UK) (e.g. DNA samples at various time points after HU removal) or Cy3 (e.g. DNA samples at the 0-min time point as common reference) according to a standard protocol. Cy5- and Cy3-labeled DNA were co-hybridized to microarrays.
Microarray Data Acquisition and LOWESS Normalization
Microarray slides were scanned using a GenePix scanner (Axon Instruments, Union City, CA) controlled by the software GenePix Pro4 (Axon Instruments). GenePix Results files were generated and globally normalized based on a median of ratios. To ensure accuracy of the measured data, the ratio of a feature was collected only if its intensity in either channel was at least two-fold that of the background. To eliminate intensity-dependent dye bias, individual microarrays were further normalized using locally weighted scatterplot smoothing (LOWESS).
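The feature filter and global median normalization described above can be sketched as follows. This is an illustrative outline only: the tuple layout, the use of log2 ratios, and the absence of background subtraction are assumptions, and the intensity-dependent LOWESS step is not included.

```python
import math
from statistics import median


def filter_and_normalize(features, fold=2.0):
    """Keep features whose signal is at least `fold` times background in either
    channel, then centre the log2(Cy5/Cy3) ratios on a median of zero.

    `features` is a list of (cy5, cy3, cy5_background, cy3_background) tuples;
    this layout is hypothetical and chosen only for the example.
    """
    kept = [
        math.log2(cy5 / cy3)
        for cy5, cy3, bg5, bg3 in features
        if cy5 >= fold * bg5 or cy3 >= fold * bg3
    ]
    if not kept:
        return []
    centre = median(kept)
    return [ratio - centre for ratio in kept]


# Toy intensities (not data from this study); the last feature fails the filter.
toy = [(400, 200, 50, 50), (800, 200, 50, 50), (200, 200, 50, 50), (60, 70, 50, 50)]
normalized = filter_and_normalize(toy)
```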
Genome-wide Profiling of DNA Replication Timing and Efficiency
All steps in the process (including DNA-content-based microarray data normalization, regression of near-sigmoidal models to estimate replication initiation timing, completion timing and prediction of peaks using PeakFinder) were performed as follows.
DNA-content-based data normalization.
To determine DNA copy number increase at various loci, individual microarrays were normalized based on the DNA content at respective time points as described elsewhere. In brief, the DNA content profiles were obtained through FACS analysis in two independent time series experiments. Each series was first normalized to zero mean and unit variance to eliminate series-dependent variation of mean and variance. Subsequently, the logistic regression was carried out using SigmaPlot 8.0 software (Systat Software Inc., San Jose, CA), in which we fitted the 4-parameter near-sigmoid curve (Figure 1D and 1E) given by equation (1) to the two normalized DNA content series together, estimating the parameters by the least squares method:

$$f(t) = y_0 + \frac{a}{1 + e^{-(t - t_0)/b}} \qquad (1)$$

where f(t) is the estimated DNA content at time t; a is the scale parameter; b is the rate parameter of DNA content increase; t0 is the replication half-completion timing; and y0 is the initial DNA content. Considering that the DNA content increases from 1 copy to 2 copies of the genome upon completion of replication, we simplified equation (1) to

$$f(t) = 1 + \frac{1}{1 + e^{-(t - t_0)/b}} \qquad (2)$$

where y0 = 1 and a = 1 in (1). The median of ratios of each microarray was adjusted to be the logarithm of the estimated DNA content at the respective time point from equation (2). That is,

$$\log C_{it} = \log D_{it} - \operatorname{median}_{j}\left(\log D_{jt}\right) + \log f(t) \qquad (3)$$

where Dit and Cit are the ratios of locus i at time t before and after the normalization by equation (3), respectively. Normalized datasets were used for further analysis of replication timing and efficiency.
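The DNA-content-based normalization can be sketched as below. The sketch assumes the common four-parameter sigmoid form f(t) = y0 + a/(1 + exp(−(t − t0)/b)) with y0 = 1 and a = 1, base-2 logarithms, and hypothetical function names; the values of t0 and b would come from the fit to the FACS data and are placeholders here.

```python
import math
from statistics import median


def dna_content(t, t0, b):
    """Simplified sigmoid with y0 = 1 and a = 1: rises from 1 to 2 genome copies.
    The exact parameterization is an assumption."""
    return 1.0 + 1.0 / (1.0 + math.exp(-(t - t0) / b))


def normalize_array(log_ratios, t, t0, b):
    """Shift one array's log2 ratios so that their median equals
    log2 of the estimated DNA content at time t."""
    shift = math.log2(dna_content(t, t0, b)) - median(log_ratios)
    return [x + shift for x in log_ratios]


# Toy values (not data from this study): at t = t0 the content is 1.5 copies.
adjusted = normalize_array([-0.2, 0.0, 0.3], t=30.0, t0=30.0, b=5.0)
```

After this shift, a locus that has not yet replicated sits below the array median while a fully replicated locus sits above it, so copy number at each locus can be followed across time points.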
Regression analysis using the near-sigmoidal model.
To estimate replication initiation and completion timings, the increase of copy number at individual loci, as a function of time, was subjected to a regression fit to a 3-piece linear (near-sigmoidal) model as an approximation of the sigmoid. Before regression analysis, the datasets were first normalized to zero mean and unit variance. Subsequently, missing values were filled by averaging the log-ratios of neighboring loci. In brief, the log-ratio of each locus was fitted to the near-sigmoidal model in equation (4) using the least squares method. T0 and T100 indicate replication initiation timing and completion timing, respectively. Thus, T50 is the average of T0 and T100, i.e., (T0+T100)/2. The duplication time ΔT = T100−T0 is a measure of the efficiency of DNA replication. That is, the greater the duplication time ΔT, the less efficient the DNA replication process is at the locus.

$$M_{it} = \begin{cases} E_{iL}, & t \le T_{i0} \\ E_{iL} + \left(E_{iU} - E_{iL}\right)\dfrac{t - T_{i0}}{T_{i100} - T_{i0}}, & T_{i0} < t < T_{i100} \\ E_{iU}, & t \ge T_{i100} \end{cases} \qquad (4)$$

$$E_{iL} = \frac{1}{K_L}\sum_{t \le T_{i0}} \log C_{it}, \qquad E_{iU} = \frac{1}{K_U}\sum_{t \ge T_{i100}} \log C_{it}$$

where Mit is the model of the log-ratio of locus i at time t. All other estimated parameters refer to locus i. Ti0 and Ti100 are the replication initiation timing and completion timing, respectively. EiL and EiU are the lower and upper limits of replication, respectively. KL and KU are the numbers of observations for locus i up to T0 and after T100, respectively. The values are estimated by exhaustive search to minimize the MSE (mean-square error) between Mit and log(Cit), i.e.,

$$\mathrm{MSE}_i = \frac{1}{K}\sum_{t}\left(M_{it} - \log C_{it}\right)^2$$

where K is the total number of observations for locus i over all time points. The p-value of the fit was calculated using an ANOVA table. To reduce errors, the estimates of T50 and ΔT were smoothed using a moving average of 3 loci.
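The exhaustive search for T0 and T100 can be illustrated with the following sketch of a plateau, ramp, plateau fit for a single locus. It assumes that candidate T0 and T100 values are restricted to the sampled time points, that the plateaus are the means of the observations up to T0 and from T100 onward, and that the time points are sorted; the function name is hypothetical.

```python
def fit_three_piece(times, log_ratios):
    """Return (T0, T100, mse) minimizing the mean-square error of a
    three-piece linear model over all pairs of sampled time points."""
    n = len(times)
    best = None
    for i in range(n - 1):          # index of candidate T0
        for j in range(i + 1, n):   # index of candidate T100
            t_start, t_end = times[i], times[j]
            lower = sum(log_ratios[:i + 1]) / (i + 1)  # plateau before replication
            upper = sum(log_ratios[j:]) / (n - j)      # plateau after replication
            sse = 0.0
            for t, y in zip(times, log_ratios):
                if t <= t_start:
                    model = lower
                elif t >= t_end:
                    model = upper
                else:
                    model = lower + (upper - lower) * (t - t_start) / (t_end - t_start)
                sse += (model - y) ** 2
            mse = sse / n
            if best is None or mse < best[2]:
                best = (t_start, t_end, mse)
    return best


# Synthetic locus sampled every 5 min for 60 min (not data from this study).
times = list(range(0, 65, 5))
values = [0, 0, 0, 0, 0.2, 0.4, 0.6, 0.8, 1, 1, 1, 1, 1]
t0, t100, mse = fit_three_piece(times, values)
t50 = (t0 + t100) / 2   # average replication timing
delta_t = t100 - t0     # duplication time
```

For this synthetic locus the search recovers the ramp between 15 and 40 min, giving T50 = 27.5 min and ΔT = 25 min.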
Prediction of peaks.
To predict peaks or potential origins, we applied PeakFinder to identify peaks based on the profiles of replication half-completion timing T50. We set the parameters of the software as Gaussian Smoothing, Raw data, N = 1, round = 1, Left Delta = Right Delta = 0. The efficiency of origins was approximated using the efficiency of loci at the origins or closest to the origins.
Identification of efficient and inefficient replication regions.
To identify efficient and inefficient regions, we applied a sliding window to score all the loci by the following equation: (5) where m is the median of ΔT at all loci and w is the half window size. We defined an efficient region as one in which all loci had a score greater than a certain threshold α, and an inefficient region as one in which all loci had a score less than −α. In this study, we set w = 5 and α = 4, and the minimal number of loci in a region was 19. The median of ΔT at all loci is ∼31.6 min in wild type cells and ∼150 min in cds1Δ cells.
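Given per-locus scores, the region definition above amounts to finding runs of consecutive loci that all exceed α (or all fall below −α) and contain at least 19 loci. The sketch below illustrates only this thresholding step; it takes the scores as given and does not attempt to reproduce equation (5). Function names are hypothetical.

```python
def runs(flags, min_len):
    """Return (start, end) index pairs, end exclusive, for maximal runs of
    True values that contain at least `min_len` items."""
    found, start = [], None
    for i, flag in enumerate(list(flags) + [False]):  # sentinel closes a final run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                found.append((start, i))
            start = None
    return found


def classify_regions(scores, alpha=4, min_loci=19):
    """Efficient regions: all scores > alpha; inefficient: all scores < -alpha."""
    efficient = runs([s > alpha for s in scores], min_loci)
    inefficient = runs([s < -alpha for s in scores], min_loci)
    return efficient, inefficient


# Toy scores along one chromosome (not data from this study).
toy_scores = [5] * 20 + [0] * 3 + [-6] * 19 + [5] * 10
efficient, inefficient = classify_regions(toy_scores)
```

In this toy profile the final run of ten high-scoring loci is discarded because it is shorter than the 19-locus minimum.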
A standard binomial test was employed to examine the significant enrichment, and the Wilcoxon rank test was employed to examine the differences between two groups with unknown distribution at various significance levels.
The completed microarray datasets have been submitted to the GEO Databases (the accession number is GSE6977).
Profiles of replication timing and efficiency in wild type cells
(0.33 MB DOC)
Comparisons of peaks/ORI between this study and the studies previously published by others
(0.03 MB DOC)
We thank L. D. Miller and X. Peng for manufacturing S. pombe DNA microarrays. The authors appreciate C.-Y. Wong, P. Robson, V.M. D'Souza, and T. Lufkin for critical reading and discussion of the manuscript. The authors would also like to thank the anonymous reviewers for critical comments that greatly improved the manuscript.
Conceived and designed the experiments: J Liu. Performed the experiments: ME. Analyzed the data: RK J Li J Liu ME. Contributed reagents/materials/analysis tools: ZC. Wrote the paper: J Liu. Other: Involved in the design of the S. pombe DNA microarray: EL.
- 1. Gilbert DM (2001) Nuclear position leaves its mark on replication timing. J Cell Biol 152: F11–15.
- 2. MacAlpine DM, Bell SP (2005) A genomic view of eukaryotic DNA replication. Chromosome Res 13: 309–326.
- 3. Bell SP, Dutta A (2002) DNA replication in eukaryotic cells. Annu Rev Biochem 71: 333–374.
- 4. Raghuraman MK, Winzeler EA, Collingwood D, Hunt S, Wodicka L, et al. (2001) Replication dynamics of the yeast genome. Science 294: 115–121.
- 5. Schubeler D, Scalzo D, Kooperberg C, van Steensel B, Delrow J, et al. (2002) Genome-wide DNA replication profile for Drosophila melanogaster: a link between transcription and replication timing. Nat Genet 32: 438–442.
- 6. Yabuki N, Terashima H, Kitada K (2002) Mapping of early firing origins on a replication profile of budding yeast. Genes Cells 7: 781–789.
- 7. Jeon Y, Bekiranov S, Karnani N, Kapranov P, Ghosh S, et al. (2005) Temporal profile of replication of human chromosomes. Proc Natl Acad Sci U S A 102: 6419–6424.
- 8. Kim SM, Huberman JA (2001) Regulation of replication timing in fission yeast. EMBO J 20: 6115–6126.
- 9. Patel PK, Arcangioli B, Baker SP, Bensimon A, Rhind N (2006) DNA replication origins fire stochastically in fission yeast. Mol Biol Cell 17: 308–316.
- 10. Enoch T, Carr AM, Nurse P (1992) Fission yeast genes involved in coupling mitosis to completion of DNA replication. Genes Dev 6: 2035–2046.
- 11. Seaton BL, Yucel J, Sunnerhagen P, Subramani S (1992) Isolation and characterization of the Schizosaccharomyces pombe rad3 gene, involved in the DNA damage and DNA synthesis checkpoints. Gene 119: 83–89.
- 12. al-Khodairy F, Fotou E, Sheldrick KS, Griffiths DJ, Lehmann AR, et al. (1994) Identification and characterization of new elements involved in checkpoint and feedback controls in fission yeast. Mol Biol Cell 5: 147–160.
- 13. Murakami H, Okayama H (1995) A kinase from fission yeast responsible for blocking mitosis in S phase. Nature 374: 817–819.
- 14. Lindsay HD, Griffiths DJ, Edwards RJ, Christensen PU, Murray JM, et al. (1998) S-phase-specific activation of Cds1 kinase defines a subpathway of the checkpoint response in Schizosaccharomyces pombe. Genes Dev 12: 382–395.
- 15. Sogo JM, Lopes M, Foiani M (2002) Fork reversal and ssDNA accumulation at stalled replication forks owing to checkpoint defects. Science 297: 599–602.
- 16. Boddy MN, Furnari B, Mondesert O, Russell P (1998) Replication checkpoint enforced by kinases Cds1 and Chk1. Science 280: 909–912.
- 17. Eklund H, Uhlin U, Farnegardh M, Logan DT, Nordlund P (2001) Structure and function of the radical enzyme ribonucleotide reductase. Prog Biophys Mol Biol 77: 177–268.
- 18. Shirahige K, Hori Y, Shiraishi K, Yamashita M, Takahashi K, et al. (1998) Regulation of DNA-replication origins during cell-cycle progression. Nature 395: 618–621.
- 19. Feng W, Collingwood D, Boeck ME, Fox LA, Alvino GM, et al. (2006) Genomic mapping of single-stranded DNA in hydroxyurea-challenged yeasts identifies origins of replication. Nat Cell Biol 8: 148–155.
- 20. Santocanale C, Diffley JF (1998) A Mec1- and Rad53-dependent checkpoint controls late-firing origins of DNA replication. Nature 395: 615–618.
- 21. Heichinger C, Penkett CJ, Bahler J, Nurse P (2006) Genome-wide characterization of fission yeast DNA replication origins. EMBO J 25: 5171–5179.
- 22. MacNeill SA, Fantes PA (1997) Genetic and physiological analysis of DNA replication in fission yeast. Methods Enzymol 283: 440–459.
- 23. Peng X, Karuturi RK, Miller LD, Lin K, Jia Y, et al. (2005) Identification of cell cycle-regulated genes in fission yeast. Mol Biol Cell 16: 1026–1042.
- 24. Rhind N (2006) DNA replication timing: random thoughts about origin firing. Nat Cell Biol 8: 1313–1316.
- 25. Glynn EF, Megee PC, Yu HG, Mistrot C, Unal E, et al. (2004) Genome-wide mapping of the cohesin complex in the yeast Saccharomyces cerevisiae. PLoS Biol 2: E259.
- 26. Yang YH, Dudoit S, Luu P, Lin DM, Peng V, et al. (2002) Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 30: e15.
- 27. Segurado M, de Luis A, Antequera F (2003) Genome-wide distribution of DNA replication origins at A+T-rich islands in Schizosaccharomyces pombe. EMBO Rep 4: 1048–1053.
- 28. Hayashi M, Katou Y, Itoh T, Tazumi M, Yamada Y, et al. (2007) Genome-wide localization of pre-RC sites and identification of replication origins in fission yeast. EMBO J 26: 1327–1339.
- 29. Dai J, Chuang RY, Kelly TJ (2005) DNA replication origins in the Schizosaccharomyces pombe genome. Proc Natl Acad Sci U S A 102: 337–342.
- 30. Legouras I, Xouri G, Dimopoulos S, Lygeros J, Lygerou Z (2006) DNA replication in the fission yeast: robustness in the face of uncertainty. Yeast 23: 951–962.
- 31. Moreno S, Klar A, Nurse P (1991) Molecular genetic analysis of fission yeast Schizosaccharomyces pombe. Methods Enzymol 194: 795–823.
- 32. Wood V, Gwilliam R, Rajandream MA, Lyne M, Lyne R, et al. (2002) The genome sequence of Schizosaccharomyces pombe. Nature 415: 871–880.