Biochemical reconstitution of heat-induced mutational processes

  • Tomohiko Sugiyama

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    sugiyama@ohio.edu

    Affiliations Department of Biological Sciences, Ohio University, Athens, Ohio, United States of America, Molecular and Cellular Biology Graduate Program, Ohio University, Athens, Ohio, United States of America

Abstract

Non-enzymatic spontaneous deamination of 5-methylcytosine, producing thymine, is the proposed etiology of cancer mutational signature 1, which is the most predominant signature in all cancers. Here, the proposed mutational process was reconstituted using synthetic DNA and purified proteins. First, single-stranded DNA containing 5-methylcytosine at CpG context was incubated at an elevated temperature to accelerate spontaneous DNA damage. Then, the DNA was treated with uracil DNA glycosylase to remove uracil residues that were formed by deamination of cytosine. The resulting DNA was then used as a template for DNA synthesis by yeast DNA polymerase δ. The DNA products were analyzed by next-generation DNA sequencing, and mutation frequencies were quantified. The observed mutations after this process were exclusively C>T mutations at CpG context, which was very similar to signature 1. When 5-methylcytosine modification and uracil DNA glycosylase were both omitted, C>T mutations were produced on C residues in all sequence contexts, but these mutations were diminished by uracil DNA glycosylase-treatment. These results indicate that the CpG>TpG mutations were produced by the deamination of 5-methylcytosine. Additional mutations, mainly C>G, were introduced by yeast DNA polymerase ζ on the heat-damaged DNA, indicating that G residues of the templates were also damaged. However, the damage on G residues was not converted to mutations with DNA polymerase δ or ε.

Introduction

Hydrolytic deamination of cytosine (C), producing uracil (U), is one of the most common non-enzymatic decompositions of DNA bases [1]. Since U residues in cellular DNA can pair with incoming dAMP during DNA replication, unrepaired C deamination can readily cause C>T mutations. However, the vast majority of U residues in DNA are removed by a strong activity of cellular uracil DNA glycosylase (Udg) that is a part of the base-excision repair mechanism. In the human genome, about 80% of C residues in CpG context have 5-methylcytosine (5-meC) modification [2, 3]. While this modification is crucial for genomic imprinting and some specific gene regulations [4], deamination of 5-meC produces a T residue that potentially produces T-G mispairs in double stranded DNA. While T-G mispairs can be recognized by thymine DNA glycosylase (TDG) or methyl-CpG binding domain protein 4 (MBD4) [5, 6], these processes are considered less efficient than repairing U residues in DNA. Furthermore, deamination occurs more efficiently in vitro at 5-meC than at unmodified C [7]. Consistently, C>T at CpG sequence is the most observed spontaneous mutation in the HPRT gene of cultured human cells [8].

Massive analyses of human cancer genomes revealed multiple recurring patterns of mutations in cancers, termed cancer mutational signatures [9, 10]. Among about 60 distinct cancer signatures, the most prevalent one in all cancers (signature 1) has C>T mutation in CpG context (CpG > T). From the pattern of mutations, this signature was proposed to be caused by deamination of 5-meC [9]. Signature 1 is observed in any cancer type as well as in non-cancer tissues, and the mutation frequency increases with the age of the sample donor [11–13]. These characteristics are consistent with the nature of spontaneous deamination of 5-meC.

Direct analyses of DNA damages in cell-free systems revealed that the spontaneous deamination of C and 5-meC can be accelerated with temperature, following first-order kinetics [14–16]. The reaction occurs much faster in single-stranded DNA (ssDNA) than in double-stranded DNA (dsDNA) [17, 18]. The temperature and reaction rate have a linear relationship in the Arrhenius plot, and extrapolation of the plot indicated that the estimated half-life of a C residue is 20–100 years in ssDNA at 37°C at neutral pH [7, 19, 20]. Experiments at high temperature have also identified other DNA decompositions, including depurination/depyrimidination producing abasic sites, and deamination of G and A residues producing xanthine and hypoxanthine, respectively [19]. Rates of deamination of G and A residues are 50 to 100-fold lower than that of C deamination [19], and the contribution of these damages to spontaneous mutations has not been clear.

In this work, a process of heat-induced mutagenesis was reconstituted in a cell-free system. Synthetic ssDNA containing 5-meC was incubated at an elevated temperature, treated with Udg, and then used as a template for primer extension by yeast DNA polymerase δ (yPol δ). Direct sequencing of the DNA products identified mutations with a similar spectrum to cancer signature 1, confirming its proposed biochemical etiology. In addition, C>G mutations on damaged G residues, which did not exist in signature 1, were detected in the presence of DNA polymerase ζ. The nature of the G damage has also been investigated.

Materials and methods

DNA and proteins

DNA primers and templates used in this study were synthetic DNA purchased from Integrated DNA Technologies (templates) and Invitrogen (primers). Sequences of all synthetic DNA molecules have been previously published [21]. The structures of single-stranded DNA (ssDNA) templates (template A to G) and primers are shown in S1 Fig. Double-stranded DNA (dsDNA) templates were prepared by annealing ssDNA templates with the “top strands” [21] that were complementary to the “variable regions” (see Fig 1B and S1 Fig). To produce dsDNA with 5-meC modification at CpG sites, 2 μM of dsDNA was incubated with CpG methyltransferase (M. SssI, New England Biolabs) in the presence of 200 μM of S-adenosylmethionine, following the manufacturer’s protocol. To produce an ssDNA template containing 5-meC, 1 μM of methylated dsDNA and 10 μM of competitor DNA, which was complementary to the top strand, were incubated at 94°C for 10 sec and cooled to 37°C within 30 min. Then, the DNA was desalted by passing it through a G-25 spin column that was preequilibrated with H2O.

Fig 1. Heat-induced mutagenesis.

(A, B) Illustration of the mutation assay system. The NGS adaptors (red and green bars) are located separately on the primer and template, so that only a fully extended primer can be recognized by the NGS system. To avoid extension from the template, 3’-OH of the template was blocked by biotin. To further eliminate the template-extension product from the analysis, the primer/template hybridization region contains one mismatch (10G) to select primer-extension products during the data analysis. Unique barcodes (“BC”) on the template were used to distinguish the products of separate reactions. (C) Examples of the results. Damaged or undamaged single-stranded DNA (template A) was hybridized with the primer and extended by yPol δ, and the mutation frequencies were mapped on the template sequence. Top panel shows the background. Middle panel shows raw mutation data including background and heat-induced mutations. In the bottom panel, the damaged DNA was treated with EcUdg before the primer extension. (D-G) Single-base substitution frequencies in the total 350-nt regions were obtained under the conditions that are indicated above each graph. Damage-induced mutation frequencies were calculated by subtracting background frequencies on undamaged templates. Each panel represents the result of a single experiment that generates 350 data points, each of which corresponds to an individual base of the templates. Numbers of template bases are A = 88, C = 90, G = 88, and T = 84. Bars are means. (H-J) The ssDNA templates (template A-G) were treated by heat under the standard conditions except for varying pH (H), incubation period (I), or temperature (J), and then subjected to the primer extension by yPol δ. The frequencies of G>A mutations in the products (total 350-nt regions) are shown (I and J show mean ± SD, n = 90). In panels H-J, data points at each x-axis value are derived from a single experiment analyzing G>A frequency at 90 distinct C residues of the templates, which is taken as the ‘n’ value.

https://doi.org/10.1371/journal.pone.0310601.g001

Yeast Pol δ (yPol δ: complex of Pol3-Pol31-Pol32 subunits) in which Pol32 was tagged with C-terminal His6 [22], yeast Pol ε (yPol ε: catalytic subunit) with C-terminal His6-tag [23], yeast Pol ζ (yPol ζ: complex of Rev3-Rev7-Pol31-Pol32 subunits) in which Rev7 and Pol32 were tagged with FLAG and His6, respectively [23], human Pol η (hPol η) with C-terminal His6-tag [21], human Pol κ (hPol κ) with C-terminal His6-tag [21], and human Pol ι (hPol ι) with C-terminal His6-tag [21], were purified as described in our previous papers. Concentrations of the polymerases were adjusted to 200 nM with 30 mM Tris-HCl (pH 7.5), 50 mM NaCl, 1 mM EDTA, 1 mM DTT and 5% glycerol, and stored at -80°C. E. coli uracil DNA glycosylase (EcUdg), human AP endonuclease (hApe1), and human Smug1 (hSmug1) were purchased from New England Biolabs.

Heat-treatment of DNA

To treat the seven DNA templates (template A-G) under identical conditions, they were pooled in a single microcentrifuge tube and incubated at 70˚C for 5 days on a Perkin Elmer Thermal Cycler 460. The standard DNA damaging reaction (40–100 μl) contained 100 nM DNA (total concentration of seven templates), 100 mM KCl, 10 mM MgCl2, 1 mM EDTA, and 50 mM K-HEPES (pH 7.4) [18], and the solution was overlaid with 100 μl of mineral oil. After the heat-damaging reaction, mineral oil was removed, and the DNA was stored at -20°C.

Glycosylase treatment

Where indicated, DNA was treated with EcUdg (0.5 units/μl), hApe1 (1 unit/μl), or hSmug1 (0.5 or 1.0 units/μl) at 37°C for 30 min in the buffer that was supplied by the manufacturer. The reactions were stopped by phenol/chloroform/isoamyl alcohol extraction, and then the DNA was precipitated with ethanol and resuspended in the desired buffer for primer extension.

Primer extension for mutation assay

The primer-extension reaction for the mutation assay was carried out essentially as published [21]. In the standard reaction, an equimolar mixture of the seven heat-damaged DNA templates was subjected to primer extension from a uniquely bar-coded NGS primer. In the standard reaction (10 μl), the DNA template (0.10 pmol) and an NGS primer (0.1 pmol) were annealed by heating to 94°C for 4 sec and cooling to 37°C within 15 min, and yPol δ (0.2 pmol in 1 μl) was added to start primer extension. At this point, the reaction buffer contained 25 mM Tris-acetate (pH 7.5), 50 mM NaCl, 4 mM MgCl2, 100 μg/ml BSA, 5 mM DTT, and 100 μM of each of the four dNTPs. After incubating for 30 min at 37°C, the reaction was stopped by adding 1 μl of 0.5 M EDTA, diluted 2-fold with TE buffer, and deproteinized by phenol/chloroform/isoamyl alcohol extraction. The DNA was precipitated with ethanol and resuspended in 10 μl H2O. Samples were then pooled and analyzed by an Ion Torrent GeneStudio S5 (ThermoFisher).

When a dsDNA template was used, the top strand was sequestered by competitor DNA (1 pmol), which was added to the reaction before the annealing step. When two polymerases were used in a single reaction, the DNA was incubated first with yPol δ (0.2 pmol in 1 μl) for 10 min, and then the second polymerase (0.2 pmol in 1 μl) was added and incubated for an additional 20 min. When hPol ι was used as a second polymerase, 200 μM of MnCl2 was included in the reaction.

NGS data processing

A detailed procedure for the data processing was previously published [21, 24]. The reaction products were directly analyzed by an Ion Torrent sequencer without prior PCR amplification. As described above, each primer-extension reaction contained seven templates (template A-G) and a primer that had a unique barcode. Therefore, a single reaction produced a mixture of the products made on the seven templates, all of which had the same barcode. The Ion Torrent sequencer automatically saved sequences with the same barcode into a single fastq file, which was analyzed with the Galaxy (https://usegalaxy.org/) workflow as described previously [21]. During the workflow, reads with each barcode were screened by sizes (7-nt or shorter by 13-nt than the expected error-free product) and their base call quality scores (lower than p = 0.05 for all bases), and then sorted to template A-G based on sequence similarity (minimum 75% identity). Then, all single-nucleotide substitutions were mapped on individual bases of the templates, using the Lastz 1.3.3 sequence alignment tool [25]. Output of the Lastz analysis was processed by an in-house Microsoft Excel macro [21]. This process first selected the read sequences containing “10G” to eliminate the sequences that were created by extension of the template, not the primer. Read sequences after this point are considered “qualified reads”. Then mutation frequencies at individual bases of the templates were calculated as the % of base substitutions in qualified reads. The numbers of the qualified reads (n) are shown in S1 Table.

Because the templates used in this study were synthetic DNA, and because NGS analysis always has some errors, raw mutation data must have considerable background. In addition, the background level depends on the polymerase (S4 Fig). To obtain the background mutation frequencies, primer extensions were carried out on undamaged templates, and damage-induced mutation frequencies were obtained by subtracting the backgrounds made by the corresponding polymerases at individual bases of the templates. All data in this paper, except for data in Fig 1C and S2 Fig, were obtained after subtracting the background. Trinucleotide mutation spectra were calculated by Microsoft Excel [21]. For comparison, COSMIC signatures (ver 3.2) were downloaded from https://cancer.sanger.ac.uk/cosmic/signatures. The numbers of qualified reads (n) obtained by NGS analyses are shown in S1 Table. GraphPad Prism version 9 and Microsoft Excel were used to compute statistical values. Statistical analyses of individual experiments including “n” are described in the figures and figure legends.
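The per-base frequency calculation and background subtraction described above can be illustrated with a minimal Python sketch. It is not the Excel macro used in the study; the function names and all read counts below are hypothetical and chosen only for illustration.

```python
def mutation_frequency_percent(substitution_counts, qualified_reads):
    """Per-base substitution frequency, as % of qualified reads."""
    if qualified_reads <= 0:
        raise ValueError("qualified_reads must be positive")
    return [100.0 * count / qualified_reads for count in substitution_counts]


def damage_induced(damaged_pct, background_pct):
    """Subtract the background (undamaged template, same polymerase) base by base."""
    if len(damaged_pct) != len(background_pct):
        raise ValueError("both inputs must cover the same template positions")
    return [d - b for d, b in zip(damaged_pct, background_pct)]


# Hypothetical substitution counts at three template positions.
damaged = mutation_frequency_percent([120, 4, 60], qualified_reads=20000)
background = mutation_frequency_percent([10, 5, 10], qualified_reads=50000)
induced = damage_induced(damaged, background)
print([round(x, 2) for x in induced])
```

Because the subtraction is done position by position, a value can come out slightly negative at a base where the damaged and undamaged samples differ only by sampling noise.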

Results and discussion

Biochemical reconstitution of heat-induced mutational processes

To analyze the mutations caused by heat-induced DNA damages in vitro, synthetic ssDNA molecules (template A-G; S1 Fig) were incubated at 70°C for 5 days and used as templates for primer extension by the replicative DNA polymerase yPol δ (Fig 1A). Templates and primers were designed so that only fully extended primers were recognized by an NGS system [21]. Products of separate primer extension reactions were distinguished by unique barcodes on the primers (‘BC’ in Fig 1B). Raw sequence reads were processed so that the base substitutions were mapped on the template sequence. Typically, 10,000 to 200,000 qualified reads were obtained for each template (‘n’ in Fig 1C and S1 Table), and the mutation frequencies were calculated from the number of base substitutions and total qualified reads. Fig 1C shows some examples of the mutation frequencies on a template (template A), showing misincorporations of dAMP (red bars) at C residues of the template, causing G>A mutations. All seven templates showed similar types of mutations (S2 Fig).

The mutation frequency data of the total 350-nt regions (50 nt x 7 templates, shown in S1 Fig) are summarized in Fig 1D–1G. Only G>A mutations were observed clearly above background level when a replicative polymerase (yPol δ or yPol ε) was used in the reaction (Fig 1D and 1G). The G>A mutations were diminished when the damaged DNA was treated with E. coli uracil DNA glycosylase (EcUdg) before the primer extension (Fig 1E). EcUdg removes uracil from DNA, producing an abasic (AP) site. Since the replicative DNA polymerases cannot bypass AP sites [23], the reaction should not produce G>A mutations if they were caused by U residues on the templates. The G>A mutations were also diminished when dsDNA, instead of ssDNA, was incubated at 70°C (Fig 1F). All of these results support the conclusion that the G>A mutations were caused by U residues, which were produced by deamination of C residues. As predicted from previous analyses of C deamination [18, 19, 26], G>A mutations accumulated linearly with incubation time and were accelerated at elevated temperature and lower pH (Fig 1H–1J). The mutation frequencies observed in this study (0.5–1.0% G>A in 6 days at 70˚C at pH 7.4) are equivalent to a deamination half-life of 415–830 days. This rate is roughly consistent with the previously reported values that were obtained under similar, but not identical, conditions. For example, a chemical assay showed 0.1–0.3% per day [7] and a genetic assay showed 0.13% per day [18] of cytosine deamination at 70˚C.
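The conversion from an observed mutation frequency to a half-life can be checked with a short first-order kinetics calculation. The sketch below assumes that the G>A frequency equals the fraction of C residues deaminated (that is, every U on the template is read as a G>A mutation); the function name is hypothetical.

```python
import math


def half_life_days(fraction_converted, days):
    """Half-life for a first-order reaction: f = 1 - exp(-k*t), t_half = ln(2)/k."""
    if not 0.0 < fraction_converted < 1.0:
        raise ValueError("fraction_converted must be between 0 and 1")
    rate_constant = -math.log(1.0 - fraction_converted) / days  # k, per day
    return math.log(2.0) / rate_constant


# 0.5% and 1.0% G>A after 6 days, as quoted in the text.
slow = half_life_days(0.005, 6)
fast = half_life_days(0.010, 6)
print(round(fast), round(slow))
```

Under this assumption the two frequencies give roughly 414 and 830 days, which agrees with the 415–830 day range stated above.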

Next, the influence of 5-meC modification on the in vitro mutagenesis was analyzed in the presence and absence of EcUdg (Fig 2). To make 5-meC modifications, the template DNA was treated with CpG methyltransferase (M.SssI). Since the methyltransferase is a dsDNA-specific enzyme, the methylation reaction was carried out with the dsDNA template, and then the top strand was removed by reannealing with the competitor DNA (Fig 2A; see Materials and methods for more details). The resulting ssDNA containing 5-meC was incubated at 70°C for 5 days to facilitate the deamination. Then the heat-treated DNA was incubated with or without EcUdg, and primer extension with yPol δ was carried out. Finally, mutations on the fully extended products were quantified. To analyze the sequence contexts, frequencies of G>A mutations produced on C residues were organized by the trinucleotide context of the template (Fig 2B–2E). Here, and in all other mutation spectra in this paper, the y-axes of the graphs are mutation frequencies. Thus, each blue bar represents the average G>A mutation frequency at all template sites that share the same trinucleotide context. In the absence of both 5-meC and EcUdg, about 0.5% of G>A mutations were observed, and they were not considerably influenced by the sequence context (Fig 2B). EcUdg treatment diminished the G>A mutations on the 5-meC-free template (Fig 2C). When the 5-meC modification existed, the mutation frequencies at CpG sites were moderately increased without EcUdg treatment (Fig 2D). Side-by-side comparison showed that a 1.4 to 1.9-fold (p<0.0001) stimulation by 5-meC was observed at CpG sites, while no other site was influenced significantly (Fig 2F). This confirms previous reports showing that 5-meC is more efficiently deaminated than unmodified C, although the stimulation observed in this study is less drastic than in previous reports (2-5-fold; [7, 16]).
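The per-context averaging used for these spectra (one bar per trinucleotide, averaged over all template C residues sharing that context) can be sketched as follows. This is an illustrative reimplementation, not the Excel procedure from the study; the template sequence and frequencies are hypothetical, and the sequence is assumed to be written 5'→3' on the template strand.

```python
from collections import defaultdict


def mean_by_trinucleotide(template, freq_pct, center="C"):
    """Average per-base frequencies over sites sharing the same trinucleotide context."""
    if len(template) != len(freq_pct):
        raise ValueError("one frequency is required per template base")
    groups = defaultdict(list)
    # The first and last bases are skipped because they lack a flanking neighbour.
    for i in range(1, len(template) - 1):
        if template[i] == center:
            groups[template[i - 1:i + 2]].append(freq_pct[i])
    return {context: sum(v) / len(v) for context, v in groups.items()}


# Hypothetical 11-nt template with two ACG sites and one TCA site.
template = "ACGTACGATCA"
freq_pct = [0.0, 0.9, 0.0, 0.0, 0.0, 0.7, 0.0, 0.0, 0.0, 0.5, 0.0]
spectrum = mean_by_trinucleotide(template, freq_pct)
print(spectrum)
```

Contexts ending in G (NpCpG) are the ones expected to stand out when the template carries 5-meC, which is how the CpG-specific stimulation appears in such a spectrum.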
As expected, EcUdg treatment eliminated all G>A mutations except for the ones at the CpG sites if the templates had 5-meC (Fig 2E). The remaining mutations are only G>A at the CpG template context. Importantly, the entire 96-dimensional mutation spectrum that is constructed under this condition (Fig 2G, 2H) is very similar to cancer signature 1 (Fig 2I), with a cosine similarity of 0.95 (S3 Fig). This provides a biochemical confirmation of the idea that spontaneous deamination of 5-meC, followed by ordinary DNA replication, can cause mutations equivalent to signature 1.
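Cosine similarity, the measure used above to compare the in vitro spectrum with cancer signatures, treats each spectrum as a vector (96 entries for a full single-base substitution spectrum) and reports the cosine of the angle between them. A minimal sketch follows; the short vectors in the usage lines are hypothetical stand-ins, not data from this study.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 4-channel spectra: same shape at different scales, then a different shape.
same_shape = cosine_similarity([0.1, 0.0, 0.4, 0.0], [0.2, 0.0, 0.8, 0.0])
different = cosine_similarity([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0])
print(round(same_shape, 3), round(different, 3))
```

The measure depends only on the shape of the spectrum and not on its overall scale, so an in vitro spectrum expressed as % of reads can be compared directly with a normalized signature.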

Fig 2. Biochemical reconstitution of cancer signature 1.

(A) Illustration of heat-induced mutagenesis on 5-meC-containing DNA. (B-E) Spectra of G>A mutations on C residues of templates with different sequence contexts. The templates were modified with 5-meC and treated with EcUdg as indicated. The control DNA without 5-meC modification was prepared by exactly the same procedure, except that CpG methyltransferase was omitted. Mutations on NpCpG template contexts are indicated by triangles. Each panel represents the result of a single experiment that generated G>A mutation data at 90 C residues of the templates. Numbers of each triplet context (n) are shown below panel F. (F) The datasets in panels B and D are reorganized to compare the G>A mutation frequencies in the presence (blue) and the absence (red) of the 5-meC modification within each trinucleotide context. ***p<0.0001 in a 2-way ANOVA with Tukey’s multiple comparisons (n values shown under the graph are the number of each triplet in the templates). All other pairs are scored “ns”. (G) Illustration to explain how the in vitro mutation frequency data were organized in a format comparable to cancer mutational signatures. In this example, strand extension occurs from left to right on the damaged template. During the DNA synthesis, two misincorporations (G>A in blue and C>T in green) occur on the extending strand. These changes are complementary to each other including the sequence context, making them indistinguishable by genome sequencing. Since our in vitro approach separately quantifies such complementary mutations on the extending strand, we show them at the same position on the graph using a blue bar (purine substitution) and a green bar (pyrimidine substitution). (H) Spectrum of mutations that were produced by yPol δ on the ssDNA containing 5-meC, which was treated by heat and then by EcUdg. (I) Cancer mutational signature 1 (Alexandrov et al., 2020) after normalization of trinucleotide appearance in the human genome.

https://doi.org/10.1371/journal.pone.0310601.g002

Since EcUdg treatment eliminates uracil-containing templates from the analysis, it also makes mutations on 5-meC residues invisible if C and 5-meC residues are deaminated on the same DNA molecules. However, less than 1% of template DNA had two G>A mutations on the same molecule without EcUdg treatment, and the double-mutation events were roughly stochastic (S2 Table). Thus, EcUdg treatment should not have a significant impact on the mutation frequencies on 5-meC residues.

Heat-induced guanine damage induces C>G mutation in the presence of yPol ζ

It is notable that the replicative polymerase yPol δ alone can produce the signature 1-like mutation spectrum on heated ssDNA in vitro. It was rather unexpected to find that no other mutations were formed under these conditions, because an elevated temperature can induce multiple types of damage on DNA and some of them may be bypassed by yPol δ. It is important to note that damages not bypassed by the replicative DNA polymerases were not converted to mutations in our system. To identify the specialized polymerases that can mediate mutagenic TLS across such damages, several TLS polymerases were added to the reaction 10 min after yPol δ and incubated for an additional 20 min (Fig 3). During the first 10 min, the yPol δ-mediated primer extension was mostly completed [23]. After addition of a TLS polymerase, additional damages might be bypassed by the TLS polymerase, which may cause additional mutations (Fig 3A). Among the TLS polymerases tested (hPol η, yPol ζ, hPol ι, and hPol κ), yPol ζ produced distinct mutations (C>G and C>A) on the G residues of the templates (Fig 3B). In particular, C>G mutations occurred at remarkably high frequencies (average was 0.42%), suggesting that the damaged G residue (tentatively expressed as G#) that caused the C>G mutation was produced at a comparable frequency to the deamination of C residues. yPol ζ-dependent C>G mutation was not observed on the heat-treated dsDNA (Fig 3B, left-bottom), indicating that G# formation preferred ssDNA over dsDNA, like cytosine deamination. Other polymerases tested here did not make significant changes in the mutation spectrum, suggesting that they may not be able to bypass G# or that they may bypass it without making mutations. It has been reported that hPol η can insert dCMP at a deaminated G residue (xanthine) [27]. Such a nonmutagenic TLS might be involved in the mutation spectra produced by other polymerases.

Fig 3. TLS-polymerase-dependent mutagenesis.

(A) Illustration of the system. Heat-treated template contains unidentified damage “G#” that cannot be bypassed by yPol δ. TLS polymerase may be able to bypass G# by inserting a mutagenic nucleotide. (B) Spectra of heat-induced mutations that were produced on the ssDNA or dsDNA by yPol δ and indicated TLS polymerases (mean +SD). Each panel represents the result of a single experiment analyzing seven templates (template A-G).

https://doi.org/10.1371/journal.pone.0310601.g003

What is the G# modification? Two major G modifications are known to be induced by heat at a neutral pH: deamination and base removal, which produce xanthine and an AP site, respectively [19]. To explore the identity of G#, the damaged DNA was treated with various repair enzymes (Fig 4). An AP endonuclease, hApe1, which cuts an AP site into a strand break, did not change C>G mutation frequencies (Fig 4C and 4I), indicating that G# is unlikely to be an AP site. Activity of hApe1 can be confirmed by comparing Fig 4D and 4E. While yPol ζ could produce G>A mutations by bypassing AP sites that were produced by EcUdg (Fig 4D, blue bars), all the G>A mutations were eliminated by hApe1 treatment, except for the ones on CpG sites (Fig 4E and 4K). This indicates that hApe1 was active in the reaction. The remaining major candidate for G#, xanthine, is cut by human Smug1 glycosylase (hSmug1), which can cut U and xanthine residues into AP sites [28]. This glycosylase reduced C>G frequencies to approximately 50% (Fig 4F and 4I), and an increased amount of the enzyme did not further reduce the mutation frequencies (Fig 4G and 4I). This result suggests that G# is likely a mixture of hSmug1-sensitive and -resistant modifications, and that the hSmug1-sensitive part of G# might be xanthine. However, hSmug1, like many other glycosylases, can take a wide range of substrates, which may not have been completely elucidated. In addition, there might be a heat-induced G modification that has not yet been identified. This study does not positively identify the structure of the G# modification.

Fig 4. Mutation spectra in the presence of yPol ζ.

(A) Illustration of the system. Heat-damaged ssDNA templates containing 5-meC were incubated with various DNA repair enzymes and then used as templates of the primer extension by yPol δ and yPol ζ as in Fig 3. (B-G) Mutation spectra made by yPol δ and yPol ζ after treatment with indicated repair enzyme. Each panel represents the result of a single experiment analyzing seven templates (template A-G). Proposed mutational mechanisms are illustrated above each graph. (H-K) Effects of glycosylases on indicated mutations were obtained as the ratios of mutation frequencies in the presence/absence of the glycosylase at individual template G and C bases. Mutation data in the panels B-G were used to calculate the ratios. To avoid excessive data fluctuations, the ratios were calculated only at the bases that showed 0.1% or higher mutation frequencies in the absence of the glycosylase (bar = mean, n values shown in the Figure are numbers of bases at which mutation frequencies were calculated).

https://doi.org/10.1371/journal.pone.0310601.g004

Fig 4F and 4K show that hSmug1 also reduced G>A mutation at C residues that do not have CpG context. It is likely that these G>A mutations were produced by yPol ζ-mediated insertion of dAMP at AP sites, which were produced by hSmug1-mediated removal of U residues. Under normal conditions, a significant fraction of AP sites is spontaneously converted into strand breaks by β-elimination chemistry. That should be the reason for the 50% reduction of G>A frequencies. Consistently, it has been reported that yPol ζ can efficiently insert dAMP at AP sites [23].
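The glycosylase effects quoted above (for example, the reduction to approximately 50%) are per-base ratios of mutation frequency with and without the enzyme, restricted to bases with at least 0.1% mutation frequency in the absence of the enzyme, as stated in the Fig 4 legend. A minimal sketch of that filter follows; the function name and the frequencies are hypothetical.

```python
def glycosylase_ratios(without_pct, with_pct, min_pct=0.1):
    """Per-base ratio (with enzyme / without enzyme), skipping low-frequency bases."""
    if len(without_pct) != len(with_pct):
        raise ValueError("both inputs must cover the same template positions")
    return [
        treated / untreated
        for untreated, treated in zip(without_pct, with_pct)
        if untreated >= min_pct  # avoids unstable ratios from near-zero denominators
    ]


# Hypothetical frequencies (%) at three template bases; the second falls below 0.1%.
without_enzyme = [0.4, 0.05, 0.2]
with_enzyme = [0.2, 0.04, 0.1]
ratios = glycosylase_ratios(without_enzyme, with_enzyme)
mean_ratio = sum(ratios) / len(ratios)
print(ratios, mean_ratio)
```

The threshold matters because background-subtracted frequencies near zero would otherwise produce very large or negative ratios that dominate the mean.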

As described above, cancer signature 1 can be biochemically reproduced by a mechanism that involves spontaneous deamination of 5-meC and ordinary DNA replication. Spontaneous damages other than 5-meC deamination should occur on cellular DNA, but repairability and replicability would select the damages that are eventually converted to somatic mutations.

Supporting information

S1 Fig. Sequences of synthetic ssDNA templates (template A-G) and an NGS primer.

https://doi.org/10.1371/journal.pone.0310601.s001

(PDF)

S2 Fig. The same analyses as in Fig 1C were applied to the templates that were shown in S1 Fig.

https://doi.org/10.1371/journal.pone.0310601.s002

(PDF)

S3 Fig. Cosine similarity between the in vitro mutation spectrum (shown in Fig 2H) and normalized cancer mutational signatures.

https://doi.org/10.1371/journal.pone.0310601.s003

(PDF)

S4 Fig. Background spectra that were produced by indicated DNA polymerases.

https://doi.org/10.1371/journal.pone.0310601.s004

(PDF)

S2 Table. Numbers of templates that have G>A mutations.

https://doi.org/10.1371/journal.pone.0310601.s006

(XLSX)

Acknowledgments

I thank Mahima Sanyal, Mya Crestwel, Elaina Grube, Eden Kenner, Noriko Kantake (Ohio University) and Sakura Sugiyama (The Ohio State University) for comments on the manuscript.

References

  1. Lindahl T. Instability and decay of the primary structure of DNA. Nature. 1993;362(6422):709–15. pmid:8469282.
  2. Pfeifer GP. Mutagenesis at methylated CpG sequences. Curr Top Microbiol Immunol. 2006;301:259–81. Epub 2006/03/31. pmid:16570852.
  3. Riggs AD, Jones PA. 5-methylcytosine, gene regulation, and cancer. Adv Cancer Res. 1983;40:1–30. Epub 1983/01/01. pmid:6197868.
  4. Moen EL, Mariani CJ, Zullow H, Jeff-Eke M, Litwin E, Nikitas JN, et al. New themes in the biological functions of 5-methylcytosine and 5-hydroxymethylcytosine. Immunol Rev. 2015;263(1):36–49. Epub 2014/12/17. pmid:25510270; PubMed Central PMCID: PMC6691509.
  5. Neddermann P, Gallinari P, Lettieri T, Schmid D, Truong O, Hsuan JJ, et al. Cloning and expression of human G/T mismatch-specific thymine-DNA glycosylase. J Biol Chem. 1996;271(22):12767–74. pmid:8662714.
  6. Hendrich B, Hardeland U, Ng HH, Jiricny J, Bird A. The thymine glycosylase MBD4 can bind to the product of deamination at methylated CpG sites. Nature. 1999;401(6750):301–4. Epub 1999/09/28. pmid:10499592.
  7. Ehrlich M, Norris KF, Wang RY, Kuo KC, Gehrke CW. DNA cytosine methylation and heat-induced deamination. Bioscience reports. 1986;6(4):387–93. pmid:3527293.
  8. O’Neill JP, Finette BA. Transition mutations at CpG dinucleotides are the most frequent in vivo spontaneous single-based substitution mutation in the human HPRT gene. Environmental and molecular mutagenesis. 1998;32(2):188–91. pmid:9776183.
  9. Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SA, Behjati S, Biankin AV, et al. Signatures of mutational processes in human cancer. Nature. 2013;500(7463):415–21. pmid:23945592; PubMed Central PMCID: PMC3776390.
  10. Alexandrov LB, Kim J, Haradhvala NJ, Huang MN, Tian Ng AW, Wu Y, et al. The repertoire of mutational signatures in human cancer. Nature. 2020;578(7793):94–101. pmid:32025018; PubMed Central PMCID: PMC7054213.
  11. Alexandrov LB, Jones PH, Wedge DC, Sale JE, Campbell PJ, Nik-Zainal S, Stratton MR. Clock-like mutational processes in human somatic cells. Nat Genet. 2015;47(12):1402–7. pmid:26551669; PubMed Central PMCID: PMC4783858.
  12. Zhang L, Dong X, Lee M, Maslov AY, Wang T, Vijg J. Single-cell whole-genome sequencing reveals the functional landscape of somatic mutations in B lymphocytes across the human lifespan. Proc Natl Acad Sci U S A. 2019;116(18):9014–9. pmid:30992375; PubMed Central PMCID: PMC6500118.
  13. Moore L, Leongamornlert D, Coorens THH, Sanders MA, Ellis P, Dentro SC, et al. The mutational landscape of normal human endometrial epithelium. Nature. 2020;580(7805):640–6. pmid:32350471.
  14. Shapiro R, Klein RS. The deamination of cytidine and cytosine by acidic buffer solutions. Mutagenic implications. Biochemistry. 1966;5(7):2358–62. Epub 1966/07/01. pmid:5959459.
  15. Baltz RH, Bingham PM, Drake JW. Heat mutagenesis in bacteriophage T4: the transition pathway. Proc Natl Acad Sci U S A. 1976;73(4):1269–73. Epub 1976/04/01. PubMed Central PMCID: PMC430244. pmid:4797
  16. Lindahl T, Nyberg B. Heat-induced deamination of cytosine residues in deoxyribonucleic acid. Biochemistry. 1974;13(16):3405–10. Epub 1974/07/30. pmid:4601435.
  17. Shen JC, Rideout WM 3rd, Jones PA. The rate of hydrolytic deamination of 5-methylcytosine in double-stranded DNA. Nucleic Acids Res. 1994;22(6):972–6. pmid:8152929; PubMed Central PMCID: PMC307917.
  18. Frederico LA, Kunkel TA, Shaw BR. A sensitive genetic assay for the detection of cytosine deamination: determination of rate constants and the activation energy. Biochemistry. 1990;29(10):2532–7. pmid:2185829.
  19. Schroeder GK, Wolfenden R. Rates of spontaneous disintegration of DNA and the rate enhancements produced by DNA glycosylases and deaminases. Biochemistry. 2007;46(47):13638–47. Epub 2007/11/02. pmid:17973496.
  20. Lewis CA Jr., Crayle J, Zhou S, Swanstrom R, Wolfenden R. Cytosine deamination and the precipitous decline of spontaneous mutation during Earth’s history. Proc Natl Acad Sci U S A. 2016;113(29):8194–9. pmid:27382162; PubMed Central PMCID: PMC4961170.
  21. Sugiyama T, Chen Y. Biochemical reconstitution of UV-induced mutational processes. Nucleic Acids Res. 2019;47(13):6769–82. pmid:31053851; PubMed Central PMCID: PMC6648339.
  22. Li J, Holzschu DL, Sugiyama T. PCNA is efficiently loaded on the DNA recombination intermediate to modulate polymerase delta, eta, and zeta activities. Proc Natl Acad Sci U S A. 2013;110(19):7672–7. PubMed Central PMCID: PMC3651489; pmid:23610416.
  23. Chen Y, Sugiyama T. NGS-based analysis of base-substitution signatures created by yeast DNA polymerase eta and zeta on undamaged and abasic DNA templates in vitro. DNA Repair (Amst). 2017;59:34–43. pmid:28946034; PubMed Central PMCID: PMC5643249.
  24. Sugiyama T, Sanyal MR. Biochemical analysis of H(2)O(2)-induced mutation spectra revealed that multiple damages were involved in the mutational process. DNA Repair (Amst). 2023;134:103617. Epub 20231222. pmid:38154332; PubMed Central PMCID: PMC10842480.
  25. Harris RS. Improved pairwise alignment of genomic DNA. PhD Thesis, The Pennsylvania State University. 2007.
  26. Shapiro R, Danzig M. Acidic hydrolysis of deoxycytidine and deoxyuridine derivatives. The general mechanism of deoxyribonucleoside hydrolysis. Biochemistry. 1972;11(1):23–9. Epub 1972/01/04. pmid:5009434.
  27. Jung H, Hawkins M, Lee S. Structural insights into the bypass of the major deaminated purines by translesion synthesis DNA polymerase. Biochem J. 2020;477(24):4797–810. Epub 2020/12/02. pmid:33258913; PubMed Central PMCID: PMC8138886.
  28. Mi R, Dong L, Kaulgud T, Hackett KW, Dominy BN, Cao W. Insights from xanthine and uracil DNA glycosylase activities of bacterial and human SMUG1: switching SMUG1 to UDG. J Mol Biol. 2009;385(3):761–78. Epub 2008/10/07. pmid:18835277.