Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Semi-rational evolution of a recombinant DNA polymerase for modified nucleotide incorporation efficiency

Abstract

Engineering improved B-family DNA polymerases to catalyze 3′-O-modified nucleotide reversible terminators is limited by an insufficient understanding of the structural determinants that define polymerization efficiency. To explore the key mechanism for unnatural nucleotide incorporation, we engineered a B-family DNA polymerase from Thermococcus Kodakaraenis (KOD pol) by using semi-rational design strategies. We first scanned the active pocket of KOD pol through site-directed saturation mutagenesis and combinatorial mutations and identified a variant Mut_C2 containing five mutation sites (D141A, E143A, L408I, Y409A, A485E) using a high-throughput microwell-based screening method. Mut_C2 demonstrated high catalytic efficiency in incorporating 3’-O-azidomethyl-dATP labeled with a Cy3 dye, whereas the wild-type KOD pol failed to catalyze it. Computational simulations were then conducted of the DNA binding region of KOD pol to predict additional mutations with enhanced catalytic activity, which were subsequently experimentally verified. By a stepwise combinatorial mutagenesis approach, we obtained an eleven-mutation variant, named Mut_E10 by introducing additional mutations to the Mut_C2 variant. Mut_E10, which carried six specific mutations (S383T, Y384F, V389I, V589H, T676K, and V680M) within the DNA-binding region, demonstrated over 20-fold improvement in enzymatic activity as compared to Mut_C2. In addition, Mut_E10 demonstrated satisfactory performance in two different sequencing platforms (BGISEQ-500 and MGISEQ-2000), indicating its potential for commercialization. Our study demonstrates that a significant enhancement in its catalytic efficiency towards modified nucleotides can be achieved efficiently through combinatorial mutagenesis of residues in the active site and DNA binding region of DNA polymerases. These findings contribute to a comprehensive understanding of the mechanisms that underlie the incorporation of modified nucleotides by DNA polymerase. The sites of beneficial mutations, as well as the nucleotide incorporation mechanism identified in this study, can provide valuable guidance for the engineering of other B-family DNA polymerases.

Introduction

DNA polymerases discovered in nature play important roles in molecular biology, synthetic biology, and molecular diagnostics [1, 2]. Wild-type (WT) DNA polymerases mostly fail or perform poorly in practical applications [2]. However, these shortcomings can be overcome by enzyme engineering [3]. The application of DNA polymerase engineering to improve the performance of natural enzymes has been widely reported [1, 4]. Engineered DNA polymerases are the workhorses of numerous biotechnological applications such as in the fields of DNA synthesis [4], molecular diagnostics [5], and DNA sequencing [3, 6, 7].

Many DNA sequencing methods rely on engineered DNA polymerases that efficiently catalyze incorporation of chemically modified nucleotides [8, 9] such as the Sanger sequencing method, which relies on dideoxynucleoside triphosphates (ddNTPs) [7], and the sequencing-by-synthesis (SBS) method, which uses dye-labeled reversible terminators in an iterative loop [6]. SBS relies on the identification of each base as the DNA strand is extended by the cleavable fluorescent nucleotide reversible terminators that temporarily pause the DNA synthesis for sequence determination [10]. Challenges in using SBS with cleavable fluorescent nucleotide reversible terminators involve further improving the DNA polymerase such that it efficiently recognizes and accepts a broad range of substrates [9]. Archaeal B-family DNA polymerases are widely engineered to be applied in catalyzing the addition of chemically modified nucleotide substrates in SBS sequencing, because of their potential for tunability of the active site and acceptance of a broad substrate range [1113]. The engineering of B-family DNA polymerases relies on the development of various enzyme evolution techniques [14].

Enzyme engineering techniques usually include computational design [15, 16], semi-rational design [17], and large-scale screening by directed evolution [18]. Semi-rational design is one of the most widely applied approaches, which enables targeted exploration of functional amino acid sites by selecting and mutating specific residues, with reduced library size and screening costs compared to random mutagenesis [14]. Semi-rational design is typically complemented by various screening methods to evaluate and select protein variants with desired characteristics [17]. Enzyme screening strategies include DNA shuffling [18], CSR (Compartmentalized Self-Replication) [19], microfluidic screening technology [20], microplate reader screening technology [21], etc. Most screening techniques will utilize fluorescence as a readout because of its high sensitivity, time efficiency, the potential for high throughput, low cost, and other advantages [21]. Förster resonance energy transfer (FRET) is a mechanism describing energy transfer between two fluorescent molecules [22], which is widely exploited in protein mechanistic and engineering studies, such as studies of protein folding kinetics [23], exploring the spatial relationships between protein and substrates [24], and protein-protein interactions [24]. In particular, previous studies have reported the applicability of FRET technology for the screening of enzymatic activity [25, 26].

The KOD DNA polymerase (KOD pol) from Thermococcus kodakarensis is an important representative of B-family DNA polymerases [27, 28]. Engineered KOD pol is the key component of numerous biotechnological applications such as multiple types of PCR [29], TNA synthesis [30], and base or sugar-modified nucleoside triphosphates incorporation [27, 31, 32]. Some KOD variants have been reported to tolerate and polymerize certain unnatural nucleotides [4]. For example, the KOD variant (exo-) can tolerate 7-deaza-modified adenosine triphosphate [31] or microenvironment-sensitive fluorescent nucleotide probes [32]. KOD variant (exo-, A485L) can catalyze 5-substituted pyrimidine nucleoside triphosphates (dNamTPs) [33]. KOD variant (P179S, L650R) can tolerate LNA nucleotides additionally modified at the 3’ position of the sugar moiety [34]. However, significantly less work has been reported on the engineering and modification of B family DNA polymerases to enable the efficient incorporation of cleavable fluorescent nucleotide reversible terminators for Next-Generation Sequencing (NGS) applications. A previous study reported that the 9°N DNA polymerase variant (exo-, A485L, Y409V) can catalyze nucleotide reversible terminators, such as 3’-O-azidomethyl-dNTPs [9]. KOD DNA polymerase and 9°N DNA polymerase share high sequence similarity (91%), and both belong to the B-family polymerases [35]. Thus, KOD DNA polymerase has the potential to be engineered to efficiently catalyze reversible terminators, thereby expanding its applications in the field of DNA sequencing [35]. In engineering KOD DNA polymerases to use modified nucleotides as substrates, there are generally two main directions of sequence modification: mutations in the active site to enable discrimination and incorporation of specific nucleotides [36], and mutations in the DNA strand binding region to enhance catalytic efficiency by stabilizing the interactions between the polymerase, the DNA template, and the incoming nucleotide [37].

In this study, we established a FRET-based enzyme screening platform to screen enzyme variants for their improved abilities to use 3’-O-azidomethyl-dATP labeled with a Cy3 dye as substrate to extend a primer-templated and Cy5-labeled DNA using the FRET emission signal of Cy5 as an indicator. Initially, we scanned amino acids located in the active site pocket of KOD pol. Subsequently, we computationally predicted relevant residues to be mutated within the DNA binding region of KOD pol. By employing enzyme kinetic screening of multiple mutant libraries, we successfully obtained a KOD variant that exhibited satisfactory performance on both BGI and MGI sequencing platforms. Through dissecting the specific residues within the active site and DNA binding region, this study provides insights into the key mechanisms underlying the incorporation of modified nucleotides by KOD pol. The effectiveness of semi-rational design strategies and the identified beneficial mutation sites in this study serve as valuable guidance for the engineering of other B-family DNA polymerases. In addition, the significance of our research lies in its implications for the engineering of DNA polymerases and their applications in NGS.

Results

Mutant screening strategy establishment

The comprehensive screening strategy developed for KOD pol involved multiple intricate steps including structural analysis, MD simulations, protein expression, variant screening, protein variant purification, and evaluations in sequencing applications, as illustrated in Fig 1A. The process began with a structural analysis of KOD pol to provide valuable insights into potential mutagenesis sites. We performed saturation and combinatorial mutagenesis on the selected sites to construct multiple libraries. MD simulations were then employed to predict additional promising sites to do stepwise combinatorial mutagenesis. Afterwards, the high-throughput expression of KOD pol variants was carried out in a 96-deep-well plate format in order to obtain sufficient quantities of protein for screening. The expressed proteins were subjected to semi-purification followed by quantification to ensure the reliability and comparability of the subsequent mutant screening experiments (S1A and S1B Fig in S1 File). Then, we evaluated the enzyme activity and kinetic performance of the protein variants in a 384-well plate format using a microplate reader. After identifying variants with favorable kinetic characteristics, we performed rigorous purification using chromatography with three prepacked columns to eliminate potential interference from impurities on sequencing quality. Subsequently, the promising and highly purified protein variants were tested for their sequencing performance on the BGISEQ-500 platform for SE50 testing and the MGISEQ-2000 platform for PE100 testing.

thumbnail
Fig 1. Schematic of KOD pol engineering experiments.

A) Overview of the semi-rational evolution process involving structural analysis of KOD pol, MD simulation, library construction, protein expression, variant screening, variant purification, and sequencing application tests. Multiple libraries were constructed by site-directed mutagenesis, combinatorial mutagenesis, and stepwise combinatorial mutagenesis. The screening of the activity of the mutants was performed by monitoring FRET (Cy5 fluorescence emission) as an indicator. B) The molecular structures of 3’-O-azidomethyl-deoxyadenosine triphosphate with dye Cy3 labeled (modified dATP) as well as native dATP. C) Schematic illustration of how catalytic efficiency of the variants can be indicated by FRET Cy5 emission signal. The FRET Cy5 fluorescence emission signal can be detected at 676 nm upon excitation of the incorporated Cy3 dye-labeled modified dAMP at 530 nm. If the modified dAMP is not properly incorporated into the P/T-2Cy5, no FRET Cy5 fluorescence emission signal can be detected. DNA polymerase is shown in cartoon format and colored grey. Dye Cy5 and Cy3 are shown in cartoon format and colored blue and pink, respectively. The primer-template duplex is labeled with Cy5 dye at both 5’ ends, which enhances signal intensity.

https://doi.org/10.1371/journal.pone.0316531.g001

In the screening steps, the modified dATP, 3’-O-azidomethyl-dATP labeled with a Cy3 dye, shown in Fig 1B, was employed as substrate. Additionally, we employed a primer-template complex in which the single oligo strand was labeled with a fluorescent Cy5 dye at its 5’ end (P/T-2Cy5), as illustrated in Fig 1C. Upon successful incorporation of the substrate into P/T-2Cy5 by DNA polymerase, Cy3 excitation at 530 nm induces fluorescence resonance energy transfer to the nearby Cy5 molecule, resulting in the Cy5 emission at 676 nm, as illustrated in Fig 1C. The increase in the Cy5 FRET emission signal can be conveniently monitored with a microplate reader, allowing for the measurement of the incorporation rate of modified nucleotide into the DNA strand. By comparing the incorporation rates (RFUs/min, with RFUs representing the increased Cy5 FRET emission signal), we were able to determine the relative catalytic efficiencies of KOD variants.

Our screening method was validated through assessing the fluorescent signals of two KOD variants with distinct enzymatic activities and WT KOD pol, shown in S1C Fig in S1 File. The developed screening method was effective in distinguishing between the different levels of enzymatic activities of these variants. We note that the dye-labeled substrates employed were at risk of photobleaching during storage and usage, which raises concerns regarding the reliability and reproducibility of data, particularly when prolonged storage or multiple freeze-thaw cycles are necessary. To minimize the influence of photobleaching, we chose to use the direct readout of the fluorescence intensity values in each batch tested, and not to convert these into absolute converted substrate concentrations. A positive control (the parent mutant) was employed in each experiment as an internal reference for all the other tested mutants. By using this internal reference, the efficiencies of variants tested in different rounds of experiments can be compared.

Selection of first-round mutation sites and mutant library construction

First, we analyzed the amino acids located in the active pocket of KOD pol. The crystal structures of KOD pol, including the open (PDB ID: 4K8Z) [38] and the closed (PDB ID:5OMF) [35] states, are superimposed and shown in Fig 2A. A significant movement of the finger domain from the open state to the closed state can be observed, resulting in the formation of the active pocket by the palm and closed finger subdomains. Residues located in the active pocket of KOD pol directly impact nucleotide incorporation efficiency by interfacing with incoming nucleotides. In the active pocket, residues L408,Y409, P410, and A485 (Fig 2B) are associated with enhanced unnatural nucleotide incorporation efficiency [39]. In addition, in the engineering of B-family polymerases to use modified nucleotides as substrates, exonuclease activity is often inactivated to prevent the excision of the incorporated modified nucleotides [13]. A study reported that an exonuclease-deficient variant (D141A and E143A) with A485L mutation derived from Thermococcus kodakaraensis (KOD) DNA polymerase was able to use unnatural nucleotides to extend the primer/template DNA [33]. Based on these findings, we selected residues L408, Y409, P410, and A485 as key sites for the first-round mutagenesis, on the basis of the D141A and E143A mutations. Despite the fact that these sites have been previously reported to be important for the objectives of our study, there is still a lack of in-depth investigation into the effects of their combination and interactions with other related sites. Such an in-depth investigation will be valuable for the engineering of B-family DNA polymerases to improve the incorporation efficiency for unnatural nucleotides.

thumbnail
Fig 2. Illustration of structures of KOD pol and demonstration of library construction.

A) Depiction of the superimposed structures of a ternary KOD complex (closed state, PDB ID: 5MOF) and a binary KOD complex (open state, PDB ID: 4K8Z), which are shown in cartoon format. The finger domain of the ternary complex is colored yellow and closed by an inward movement of approximately 24°, in comparison to the finger domain of the binary complex colored green. The additional domains of the KOD ternary complex are colored in gray, the exonuclease of the binary complex in light pink, the N-terminal of the binary complex in pink, the thumb domain of the binary complex in light blue, the palm domain of the binary complex in cyan. The primer-template duplex of the KOD ternary and binary complex are colored in gray and orange, respectively. B) Illustration of the active pocket of KOD pol with position A485 colored in blue, L408 colored in cyan, Y409 colored in pink, and P410 colored in green. All four residues are shown as sticks. The dATP is shown as sticks and depicted as orange elements, and the rest of the protein is shown in cartoon format and colored in gray. C) Illustration of the first-round constructed library, which includes sites saturation mutations at positions 408, 409, 410, or 485. The plasmid is a derivative of pD441, a high copy number vector originally for the expression of E. coli flagellin from an IPTG-inducible T5 promoter.

https://doi.org/10.1371/journal.pone.0316531.g002

We performed alanine substitutions at residues L408, Y409, P410, and A485, which allows for the evaluation of the individual contributions of each amino acid sidechain to the overall functionality of the protein [40, 41]. In this study, alanine substitution in the active pocket of KOD pol could create favorable conditions for the incorporation of modified dNTP carrying sterically demanding groups, such as a flexible linker and fluorescence dye. Therefore, we designed a variant of KOD pol called Mut_1 (D141A, E143A, L408A, Y409A, P410A), as the parent variant in the first round of mutagenesis. Notably, position 485 in the wild-type KOD pol is already an A, so no further mutation was introduced at this site in Mut_1. Next, we constructed the first-round library based on Mut_1, which involved site saturation mutagenesis at each position of 408, 409, 410, or 485(Fig 2C). These libraries were subjected to enzyme activity screening to investigate the influence of different side chains of key sites on the catalytic efficiency, which laid the groundwork for subsequent combinatorial mutagenesis.

Mutant screening of the first-round of mutagenesis and the combinatorial (second-round) mutagenesis

The screening results of the first-round variant library are presented in Fig 3. The data are shown as relative reaction rate, Vrel as a measurement for the enzyme activity of KOD variants. Vrel of KOD variants was calculated with Eq 1: (1) V|Mut_1 represents the incorporation rate V (RFUs/min) of Mut_1, while V|Mut represents the incorporation rate V (RFUs/min) of the given variant. The parent variant Mut_1 had a Vrel value of 1, as shown in the top center in Fig 3A. A Vrel value of 0 meanss that the variant under study is unable to catalyze the modified dATP or exhibit any measurable enzymatic activity.

thumbnail
Fig 3. Summary of the screening results of site saturation mutagenesis at positions 408, 409, 410, and 485 of KOD pol.

A) The screening results of amino acid substitutions at positions 408, 409, and 410, are depicted as Vrel values. The bar charts show the screening results of saturation mutagenesis in positions 408 (gray), 409 (dark green), and 410 (red). The upper panels show locations of 408 (cyan), 409 (pink), and 410 (green) amino acids as well as 3’-O-azidomethyl-dATP (orange) in the crystal structure of KOD pol (PDB ID: 5MOF). The original dATP was replaced with 3’-O-azidomethyl-dATP, which was positioned using homology modeling. L408I/A, Y409A, P410A, and 3’-O-azidomethyl-dATP are shown as sticks. The rest of the protein structure is shown in cartoon format and colored grey, and the DNA strand is shown as cartoon and colored in wheat. B) The screening results of saturation mutagenesis at position 485, as depicted by Vrel values. The blue bar represents the A485L substitution, with the corresponding structure shown above. The purple bar represents the A485E substitution, with the corresponding structure shown above with a predicted interaction with Q332 (cyan). The green bar represents the original A485 residue and its corresponding structure. All panels of protein structure were prepared using the PyMOL software.

https://doi.org/10.1371/journal.pone.0316531.g003

At position 408, eight mutations (A/I/Y/M/P/V/S/C) were identified with observable incorporation activity of modified dATP, as shown in Fig 3A (left). The mutation with the highest Vrel value, L408I, is depicted in the top left of Fig 3A in the active site structure. Only two mutations at position 409, A and G, presented enzymatic activity, with the Vrel of 409A (Fig 3A, center) higher than that of 409G. As, smaller amino acids (alanine and glycine) displayed the strongest positive effects at site 409, this site may take a potential role as a steric gatekeeper in KOD pol. Residues that function as steric gatekeepers in the B-family DNA polymerase family are typically highly conserved, for example, the conserved steric gate residue in 9°N DNA polymerase is also Y409 [42, 43]. At position 410, nine mutations (I/C/S/V/P/L/M/G/A) exhibited an increased value of Vrel (Fig 3A right), including the original residue of the WT enzyme, P410 (Fig 3A above right), which exhibited the best performance in incorporating modified dATP. Most variants at position 485 exhibited an increased Vrel compared to Mut_1, as shown in Fig 3B. Among these, six mutants, including F/I/Y/L/K/E, displayed values of Vrel more than three times higher than that of Mut_1, with E exhibiting the highest enzymatic activity. Residues L/E/A at position 485 are shown in the active site structure in the top of Fig 3B. A previous study highlightedA485L as an essential mutation for nucleotide analog discrimination [44]. We found that A485E presented satisfactory performance in modified dATP incorporation, possibly due to the additional mutations included in motif A (L408A, Y409A, P410A) in our study compared to previously reported polymerase variants [33]. In addition, the predicted inter-subdomain contacts between A485E and Q332 (as illustrated in Fig 3B, top center) may provide additional support for conformational changes in the finger domain, allowing for a rapid transition from a closed state to an open state during the polymerization process.

By site saturation mutation screening of positions 408, 409, 410, and 485, several variants with observable catalytic activity were obtained. Further investigation by combinational mutagenesis is necessary to explore whether there exists any potential synergy for the incorporation efficiency of modified nucleotides between those variants.

Mutants with increased Vrel values at positions 408/409/410/485, were selected for the second round of combinatorial mutant library construction, as shown in Fig 4A. The screening results of thirty combined mutants are displayed in Fig 4B. Eight mutants showed Vrel increase of more than 8-fold compared to Mut_1 (Fig 4C), among which Mut_C2 (D141A, E143A, L408I, Y409A, A485E) exhibited the highest increase in Vrel, which was over 16-fold compared to Mut_1. In addition, most of those combinatorial mutants exhibited higher catalytic efficiency compared to single-point mutants, which indicates the synergetic effect between 408/409/410/485 positions for the incorporation efficiency of modified nucleotides.

thumbnail
Fig 4. Screening results of combined mutagenesis at positions 408, 409, 410, and 485 of KOD pol.

A) Schematic diagram of the combinatorial mutant library construction. The library was constructed by site-directed mutagenesis, combinatorial mutagenesis, and degenerate codon mutagenesis. Amino acid substitutions include Y/I/A/P/N at position 408, A/G at position 409, P/A/I/G/V/C at position 410, and E/K/L/F/Y at position 485. B) The results of screening thirty combined mutants from the library are shown in bar plot with their Vrel values. Mut_C2 (in dark green) showed the highest value in Vrel, an over 16-fold increase compared to Mut_1. The other mutants are colored gray. C) A table of variants with > 8-fold increase in Vrel compared to Mut_1. The Vrel values represent the means±S.D. of at least two independent measurements. D) SDS-PAGE analysis (12% gel) of WT KOD pol (S1) and KOD variant Mut_C2 (S2), respectively. The amount of protein loaded was around 1.0 μg, and the purification method involved lysing cells with lysozyme at 37°C for 10 min and lysate centrifugation after heating at 80°C for 30 min. The gel was run at 120V for 1–2 h at room temperature. E) and F) The kinetic test results of Mut_C2 and WT KOD pol towards variations of P/T-2Cy5 (E) and modified dATP (F) under the otherwise same conditions. The experimental data were non-linearly fitted with the Michaelis-Menten equation with GraphPad Prism 5, and kcat and Km values of Mut_C2 are presented in the table in a format of means ± S.D. of 3 replicates.

https://doi.org/10.1371/journal.pone.0316531.g004

In order to further assess the catalytic efficiency of Mut_C2, tailored kinetic assays were developed using P/T-2Cy5 and 3’-O-azidomethyl-dATP labeled with a Cy3 dye (modified dATP), separately. These kinetic assays were conducted with different concentrations of either P/T-2Cy5 or modified dATP ranging from 0 to 6 μM while maintaining the other reagent at a constant concentration of 2 μM. The enzyme kinetic parameters kcat and Km were determined through nonlinear regression analysis of the Michaelis-Menten equation. The kinetic results of Mut_C2 are shown in Fig 4E and 4F. The ratio of kcat/Km for the variation of P/T-2Cy5 was 1263.3 RFUs·min−1, and the ratio of kcat/Km for the variation of modified dATP was 887.2 RFUs·min−1. Notably, Mut_C2 exhibited similarly high kinetic parameters in the two types of substrate variations, indicating an improved efficiency in incorporating modified dATP into the DNA strand. The kinetic analysis of WT KOD pol yields only a flat line (Fig 4E and 4F), indicating its failure in catalyzing modified dATP, as expected. Mut_C2 was selected as the parent sequence for the next round of mutagenesis.

Following the evaluation of residues situated in the catalytic active center, our next step was to conduct a further assessment with a focus on the critical residues present in the DNA strand binding region of KOD pol.

Computational screening

The identification of key residues involved in DNA binding can facilitate subsequent enhancements in DNA polymerase catalytic activity [45]. To achieve the goal of identifying key residues involved in the DNA strand binding region of KOD pol, we selected 93 specific amino acid residues for further analysis, most of them located within a 4 Å distance from the DNA strand in the crystal structure of KOD pol (PDB ID: 5MOF) (Fig 5A). These residues were chosen based on their potential to interact with the DNA through hydrogen bonds, salt bridges, and other types of interactions essential for DNA replication. The selected residues are located across key domains of KOD pol, including thumb, finger, and palm regions, which play essential roles in DNA replication (Fig 5A). Previous studies reported that the binding process between DNA polymerase and the DNA strand with the correct dNTP pairing is slower than the chemical incorporation process [46]. Raper et al. proposed that the nucleotide binding and incorporation step should be much faster than the binding equilibration of a polymerase and DNA (E + DNA ⇌ E·DNA) [45], indicating the binding or dissociation of DNA polymerase and DNA strands may be a rate-limiting step in the whole process. Therefore, mutations at these selected positions will likely change the binding affinity between KOD pol and DNA strands, either increasing or decreasing the binding rate, which could in turn affect the overall efficiency of KOD pol for polymerizing unnatural nucleotides.

thumbnail
Fig 5. Computational screening results of residues involved in polymerase binding to dsDNA.

A) The location of the 93 specific amino acid residues, most of them located within a 4 Å distance from the DNA strand in the crystal structure of KOD pol (PDB ID: 5MOF). The corresponding residues are shown as sticks and colored accordingly on the structure: blue for the Exo domain, purple for N-term, green for finger, pink for palm, and orange for thumb subdomain. The rest of the protein structure is shown in cartoon format and colored grey. The DNA strand is shown in cartoon format and colored cyan. The right panel displays a table presenting all residues selected for MD simulation. B) The average binding free energy value of the 93 residues calculated by MM-GBSA, with each region color-coded accordingly. The residues within the black dashed box were predicted to exhibit stronger binding affinity. C) Heatmap of the binding free energy of 26 distinct mutation sites selected from panel B (dashed box).

https://doi.org/10.1371/journal.pone.0316531.g005

Virtual saturation mutagenesis was performed for the selected 93 residues using molecular dynamics simulation and the binding energy with dsDNA was predicted for each mutant. The workflow of simulation and calculation by MM-GBSA is presented in S2 and S3 Figs in S1 File. We developed an automatic processing pipeline in Python to perform this virtual screening, including mutant pretreatment, molecular dynamics simulation, and binding energy calculation (S2 Fig in S1 File). The average binding energy value of saturation mutations of the 93 residues, obtained from the MM-GBSA method, is plotted in Fig 5B and listed in S1 Table in S1 File. Residues with an average computed binding energy < -260 kcal/mol, which was lower than that of WT KOD pol, were selected for further investigation (Fig 5B, dashed box). Some interesting positions previously identified by Kropp et al [13], including residues 349, 383, 389, 541, 592, and 606, were also included. The obtained next-round library consisted of 26 distinct mutation sites, and their computed binding energy were visualized by a heat map in Fig 5C. Variants with more negative computational free energy values are expected to exhibit stronger binding affinities to dsDNA. Finally, we selected the top two variants at each site with the most negative binding free energies at each site for further experimental screening.

Variant screening of the third-round mutagenesis

The third-round library containing of 52 mutants from the 26 critical amino acids identified in virtual screening was constructed by site-direct mutagenesis. Mut_C2 (D141A, E143A, L408I, Y409A, A485E) which exhibited the best performance in the second-round screening, was employed as the parent for library construction in this round. The enzymatic activities of these variants were quantified using Eq 2, which is similar to Eq 1 except that Mut_C2 serves as the reference. The Vrel of Mut_C2 was therefore set to 1.

(2)

The kinetic performance of KOD variants is described by the Michaelis-Menten parameters kcat and Km. The relative kinetic performances of those variants are characterized by the ratio of kcat/Km|Mut relative to that of Mut_C2. kcat/Km|Mut and kcat/Km|Mut_C2 were measured and calculated in the same way, with varying P/T-2Cy5 ranging from 0 to 6 μM while maintaining the modified dATP at a constant concentration of 2 μM. We define the relative kinetic performance of the mutant variants as E(Mut), as shown in Eq 3. The subscript after the vertical bar "|" indicates the variant used for measurement. The Mut_C2 variant served as the reference, with a catalytic efficiency (E(Mut)) value of 1.

(3)

The enzymatic activities of the 52 variants were screened under the same experimental conditions as described above, and their Vrel values were calculated and compared (Fig 6A). 14 variants displayed a Vrel value greater than that of Mut_C2, with one variant, Mut_D34 (V589H), exhibiting an over 2-fold increase in Vrel compared to Mut_C2. We selected 39 mutant variants, with a Vrel value higher than 0.5, for further kinetic screening measurements. Although some mutant variants did not show significantly higher enzyme activity than Mut_C2, we still selected them because of their close location to the DNA strand in the structure of KOD pol.

thumbnail
Fig 6. Screening results of third-round variants constructed based on Mut_C2.

A) The enzyme activity screening results of these variants. The Vrel (black square) of these variants was calculated according to Eq 2. The variants with a Vrel > 0.5 are indicated inside the dashed box. B) The relative kinetic screening results of those variants with a Vrel >0.5. The kinetic parameters of the Michaelis-Menten equation, such as kcat and Km, were calculated by analyzing the data using GraphPad Prism 5 software. E(Mut) of these mutants was calculated according to Eq 3. Mut_D34 corresponds to the red column, while the other variants are shown in grey. The table embeded collects the site information, location, and kinetic results of variants with E(Mut) > 1. The E(Mut) values represent the means±S.D. of at least two independent measurements, whereas those E(Mut) without a S.D. represent an individual measurement. C) The location of residue 589 with V/H/Q placed in the crystal structure of KOD pol (PDB ID: 5MOF). The DNA strand is depicted in a cartoon format, with the backbone colored orange, while one of the deoxyribonucleotides is shown as sticks and colored by element. The residues at position 589 are displayed as sticks and colored green for V (in the left panel), magenta for H (in the middle panel), and cyan for Q (in the right panel). Atom contacts identified between residues (V, H, Q) at position 589 and the DNA strand are illustrated. The distances predicted by the PyMOL software are represented by dashed yellow lines.

https://doi.org/10.1371/journal.pone.0316531.g006

E(Mut) values of these variants were determined and presented in Fig 6B and S2 Table in S1 File. Among this set, 11 variants displayed more favorable E(Mut) values than Mut_C2, listed in the table embedded in Fig 6B. Mut_D34, which contains the V589H mutation, outperformed the other variants with the highest observed E(Mut) value of 5.2. Mut_D35, carrying the V589Q mutation with no charged side chain, also showed improved kinetic activity (E(Mut) = 3.0) compared to Mut_C2 (V589). The V589H and V589Q mutations with obvious increase in catalytic activity, are located in the palm domain of KOD pol. The distances of atom contacts between residues V, H, or Q at position 589 and the DNA strand in the crystal structure of KOD pol (PDB ID: 5MOF) are shown in Fig 6C, which are 3.9 Å, 1.7 Å, and 2.4 Å, individually. A closer distance implies a higher possibility of hydrogen bond formation, which can subsequently impact the binding affinity between DNA and the protein. In addition, we also speculated that the positively charged side chain of histidine (H) at position 589 can increase the binding affinity between KOD pol and the DNA strand, facilitating the incorporation process by virtue of strong and favorable electrostatic interaction with the double-stranded DNA. For example, Kropp et al. reported that the structure of KOD pol has a long crack with an electro-positive potential extending from the DNA binding region at the thumb domain upwards along the β-hairpin and the palm domain to the N-terminal domain [13]. The increased positive charge introduced in the DNA binding region of the polymerase results in enhanced affinity towards the template DNA, leading to a more stable binding [47]. Among the investigated positive mutation sites with E(Mut) >1, five were located in the palm domain (V589H/Q, T590K, V389I, S383T, Y384F), two in the thumb domain (T676K, V680M), one in the finger domain (Y496I), and one in the N-terminal region (T349I). In addition, residues 383/384/389 were located in the loop of the KOD pol structure, while residues 589/590/676/680 contributed to the binding between KOD pol and the DNA strand. The screening results showed that those mutated residues can either directly or indirectly influence the binding of KOD pol and the DNA strand, leading to an increase in the incorporation efficiency during polymerization. Therefore, we next performed further combinations with these well-performing mutants to explore possible interactions and synergies between the identified sites. To accomplish this, we generated stepwise combination variants based on Mut_D34 form the mutations that exhibited the best performance in this round of screening.

Stepwise combination of effective mutations

Based on the last-round screening results, the best-performing variant Mut_D34 was selected as the parent for a fourth round of stepwise combinational mutagenesis to further improve the catalytic activity. To comprehensively assess the catalytic performance of stepwise combined variants, we conducted kinetic assays towards both substrates of P/T-2Cy5 and modified dATP. These assays allowed us to evaluate the kinetic performance of each new variant individually with respect to the DNA strand and modified dATP. The relative kinetic performance was evaluated using Eq 4, which is similar to Eq 3. In this round of screening, the Mut_D34 variant served as a reference, with a catalytic efficiency (E(Mut)) value of 1.

(4)

To distinguish the relative kinetic performance towards two substrates, we used E(Mut)/P/T-2Cy5 to represent the kinetic measurement towards ranged P/T-2Cy5 concentration. Similarly, E(Mut)/modified dATP represents the kinetic measurement towards ranged modified dATP concentration.

The screening results for E(Mut)/P/T-2Cy5 of combined variants are shown in Fig 7A and 7B. We incorporated the V680M mutation (from Mut_D51) to the variant Mut_D34, generating the combination variant Mut_E1 which displayed a slight improvement in E(Mut), confirming that the addition of V680M mutation on Mut_D34 improved catalytic performance. We then added the T676K mutation (from Mut_D48) to Mut_E1, which introduced a positively charged side chain to the thumb domain. The resulting variant was designated as Mut_E2, which demonstrated that these three mutation sites displayed positive interactions for improving polymerase catalytic efficiency. Further single site mutations were individually added to Mut_E2. Four variants, Mut_E3 (Y349F), Mut_E4 (Y384F), Mut_E5 (Y496L), and Mut_E8(V389I), displayed increased E(Mut) compared to Mut_E2. We then superimposed these sites to obtain variants Mut_E9 and Mut_E10 with further improved kinetic performance. The E(Mut) value of Mut_E9, which carries the mutations S383T and Y384F, showed a 3-fold increase compared to Mut_E1 and approximately a 1.6-fold increase compared to Mut_E2. The E(Mut) value of Mut_E10 carrying S383T, Y384F, and V389I, showed a 4.5-fold increase compared to Mut_E1. However, attempts to further combine Mut_E10 with Y349F and Y496L did not result in any further improvement.

thumbnail
Fig 7. Screening results of stepwise combinational mutagenesis.

A) Relative kinetic screening results for the variants, which include their kinetic assays towards P/T-2CY5 (E(Mut)/P/T-2Cy5, displayed as green bars), as well as their assays towards modified dATP (E(Mut)/modified dATP, displayed as red stars). Mut_D34 was used as the parental variant in this round, and additional mutation sites were introduced using site-directed mutagenesis. The kinetic assays were conducted for 1–2 hours at 40°C with varying concentrations of P/T-2CY5 or modified dATP, using excitation at 530 nm and detection at 676 nm. B) List of the mutation sites and the corresponding kinetic results of the combined variants. C) Illustration of the evolution process of Mut_E10 (red spots), which comprises 11 mutation sites compared to the wild-type KOD pol. During its semi rational evolution process, key mutants (black spots) include Mut_C2, Mut_D34, Mut_E1, Mut_E2, and Mut_E9. D) The locations of eleven mutation sites of the variant Mut_E10 are shown in the crystal structure of KOD pol (PDB ID: 5MOF). These residues are shown as sticks, the rest of the protein structure is shown in cartoon format and colored grey. The sites located in the exonuclease region, including D141A and E143A, are colored light blue. The sites associated with the catalytic activity center, including L408I, Y409A, and A485E, are colored light magenta. The sites related to DNA binding, including S383T, Y384F, V389I, V589H, T676K, and V680M, are colored green. The DNA strand is shown in cartoon format and colored in light wheat. The 3’-O-azidomethyl-dATP is shown in sticks and colored cyan.

https://doi.org/10.1371/journal.pone.0316531.g007

E(Mut)/modified dATP values of these mutants were calculated and are also shown in Fig 7A and 7B. Interestingly, we found that some variants behaved inconsistently in kinetic experiments against P/T-2Cy5 and modified dATP. For example, Mut_E3 and Mut_E5 exhibited a two-fold increase compared to Mut_D34 in the P/T-2Cy5 kinetic assay but displayed poorer kinetic performance in the modified dATP assay. The likely explanation is that different structural and chemical properties of DNA strand and dATP as substrates may lead to different kinetic characteristics of their interactions with different variants. However, most variants like Mut_E8 and Mut_E10 exhibit consistent kinetic behaviors in both types of assays. Notably, Mut_E10 exhibited the best kinetic parameters for both substrates, with a relative increase of over 23-fold in E(Mut)/PT-2Cy5 compared to Mut_C2, presenting high efficiency in incorporating modified dATP into the DNA strand. The whole semi-rational evolution process leading to the identification of Mut_E10, is depicted in Fig 7C and 7D.

Overall, our data prove that combinatorial mutations introduced at the DNA-binding region of KOD pol can significantly alter its efficiency in incorporating modified nucleotides during the polymerization process.

Sequencing performance validation of KOD variants in NGS

In order to examine whether the observed improvement in the polymerization efficiency of Mut_E10 brought significant values in kinetic performance in sequencing applications, we tested the performance of Mut_E10 in different NGS platforms. Since the launch of the Human Genome Project, next-generation sequencing (NGS) platforms have had a very significant impact on biological research [48]. We first compared the sequencing performance of Mut_E10 to that of WT and Mut_C2 on the BGISEQ-500 platform with SE50 sequencing, which can evaluate the sequencing quality improvements achieved through the usage of the engineered variants. Furthermore, we conducted a more thorough sequencing evaluation of Mut_E10 on the MGISEQ-2000 platform, which involved PE100 sequencing. We employed rigorously purified proteins of WT, Mut_C2, and Mut_E10 in the above sequencing tests, to minimize the presence of impurities that could interfere with the sequencing assays. The protein purity of each sample was confirmed to exceed 95% through SDS-PAGE analysis, shown in S4 Fig in S1 File.

The sequencing results of WT KOD pol, Mut_C2, and Mut_E10 on the BGISEQ-500 platform are presented in Fig 8. The BGISEQ-500 platform featuring combinatorial Probe-Anchor Synthesis (cPAS) and DNA Nanoballs (DNB) technologies is a widely used NGS platform [49]. The library used for sequencing was E. coli E320 standard library, the number of sequencing cycles was SE50, and the reaction temperature was 55°C. We compared the performance of the three different enzymes, WT KOD pol, Mut_C2, and Mut_E10, by replacing the original enzyme in the supplied sequencing kit of BGISEQ-500 sequencing platform. Notably, the WT KOD pol group did not show any bright spots on the chip (Fig 8A), indicating that no sequencing reaction occurred. In contrast, the Mut_C2 group presented bright spots, indicating the successful extension of modified dNTPs, with the bright spots representing the incorporated fluorescently labeled dNTPs to the DNB, as shown in Fig 8A. Then, the sequencing results of Mut_E10 and Mut_C2 were analyzed and compared. The quality distribution of the sequencing reads, including Q30, ESR, and mapping rate, represents the overall sequencing quality of the enzyme-replaced platforms, as shown in Fig 8B and 8D. The results revealed that Mut_E10 exhibited superior performance compared to Mut_C2 which has an apparent decrease after 20 cycles. A higher Q value indicates a lower probability (P) of base misidentification [50]. Mut_E10 had a higher Q30 value (> 93%) as well as superior performance in terms of lag % and runon % compared to Mut_C2 throughout the entire cycle, making it highly suitable for NGS applications (Fig 8B and 8C). Based on these results, we conclude that Mut_E10 is a promising candidate for application in the BGISEQ-500 sequencing platform. Subsequently, we investigated the performance of Mut_E10 on another sequencing platform, CoolMPS sequencing.

thumbnail
Fig 8. Sequencing results of WT KOD pol, Mut_C2, and Mut_E10 on the BGISEQ-500 platform.

A) Representative chip images in the sequencing process. During the sequencing process, modified nucleotides labeled with dyes are incorporated to the pre-loaded DNB by DNA polymerases, and then washed out to remove excess nucleotides before chip imaging. The WT KOD pol group showed no bright spots on the chip, indicating no fluorescently labeled nucleotides were polymerized. In contrast, the Mut_C2 group exhibited bright spots, indicating the successful extension of modified nucleotides. The area of a small square in the chip shown in the diagram is 0.882*0.882 mm. B) Quality portion distribution on reads (Q30) for Mut_C2 (blue line) and Mut_E10 (green line). Mut_E10 had a stable curve and performed better in each sequencing cycle in comparison to Mut_C2. C) Lag and Runon comparison for Mut_C2 (blue) and Mut_E10 (green) in sequencing results. Mut_E10 had lower values, indicating better extension efficiency. D) Mapping rate and ESR comparison between Mut_C2 (black squares) and Mut_E10 (olive). Mut_E10 had a higher mapping rate and ESR, suggesting better sequencing quality.

https://doi.org/10.1371/journal.pone.0316531.g008

Furthermore, we applied Mut_E10 on the MGISEQ-2000 platform featuring CoolMPS technology for PE100 sequencing. CoolMPS is a novel massively parallel sequencing chemistry, which was developed by Drmanac’s group at MGI [51]. The sequencing experiments were performed in duplicates on a single chip in Lane1 and Lane2. The library used for sequencing was E. coli E320 standard library, the reaction temperature was 55°C, and the sequencing process was carried out following the published protocols [52]. Thesequencing results are listed in a concise table in Fig 9A. Both the first and second strands exhibited high mapping rates exceeding 99%. The ESR value was reported to be above 78%, and the average error rate was less than 0.16%, further indicating the accuracy and quality of the sequencing data. During the base-calling process of sequencing, the Q30 score remained consistently higher than 95% for the first-strand sequencing and a gradual decrease in Q30 was observed for the second-strand sequencing (Fig 9B), which is considered acceptable for PE100 testing. The average lag % and runon % values for both strands were comparatively low and are depicted in Fig 9C and 9D. In addition, Korostin reported that Illumina HiSeq 2500 and MGISEQ-2000 present similar performance characteristics after comparing both whole-genome sequencings (PE150) data in terms of sequencing quality, number of errors, and performance [52]. Our sequencing data also exhibited comparable performance to that of HiSeq 2500, including high mapping rates (>99%) and Q30 values (>97%), with Q30 being around 96%. The sequencing data obtained is satisfactory for the MGISEQ-2000 sequencing platform, according to the reported range of NGS analysis generated [50]. Taken together, these results demonstrate that Mut_E10 is suitable for practical applications in the MGISEQ-2000 platform.

thumbnail
Fig 9. Sequencing results of Mut_E10 on the MGISEQ-2000 platform.

A) Overview of the sequencing information catalog and major sequencing quality metrics or data, with PE100 sequencing. Note: Error Rate stands for base error rate; Split Rate: barcode split rate. Value_1 and Value_2 are duplicates of sequencing data obtained from Lane 1 and Lane 2 of the same chip, respectively, on the same chip during sequencing. B) The average Q30 score of Value_1 and Value_2 from the during sequencing data in panel A), which serves as an indicator of base calling accuracy. C) and D) show the average lag and runon percentages of Value_1 and Value_2 during sequencing in panel A), respectively. These values serve as indicators of the completeness of the polymerization reaction that occurred within the DNB.

https://doi.org/10.1371/journal.pone.0316531.g009

Finally, considering that the sequencing results were generated by directly replacing the enzyme without any optimization, we believe that the sequencing quality of Mut_E10 on both sequencing platforms can be further improved. One potential avenue for improvement lies in optimizing the sequencing buffer to further enhance the performance and accuracy of the sequencing process [53]. Consequently, the sequencing quality achieved with Mut_E10 on both the BGISEQ-500 and MGISEQ-2000 platforms fell within the acceptable range comparable to the "gold standard" set by NGS analysis generated using the Illumina platform [52].

Discussion

Mut_E10 displayed satisfactory performance on two types of sequencing platforms including BGISEQ-500 and MGISEQ-2000, as indicated by our sequencing data. These results suggest that Mut_E10 possesses the potential for commercialization and can be further adaptable to different sequencing platforms. To our knowledge, the cPAS technology of the BGISEQ-500 platform utilized 3’-O-blocked reversible terminators labeled with four different fluorescent dyes (four colors) [9, 54]. On the other hand, the CoolMPS technology of the MGISEQ-2000 platform achieved sequencing using monophosphate nucleotides with a 3’-O-azidomethyl blocking group and an NHS linker on phosphate [55, 56], which can be rapidly recognized (within 30 seconds) by nucleobase-specific fluorescently labeled antibodies [51]. Previous studies reported that DNA polymerases can incorporate dNMPs at apurinic/apyrimidinic (AP) sites or similar damaged sites but with generally low efficiency and a propensity for error insertion [57, 58]. Remarkably, Mut_E10 exhibits high efficiency in incorporating both chemically modified dNTPs and dNMPs. The incorporation efficiency of unnatural nucleotides depends on the linker used to anchor the modification group to the nucleobase, especially with sterically demanding groups [31]. Therefore, we speculate that the steric size requirements for the incorporation of chemically modified dNTPs and dNMPs by KOD DNA polymerases may be similar in this study. However, the detailed mechanisms responsible for the incorporation of both unnatural substrates in these two sequencing platforms, are yet to be fully understood. Therefore, we next focus on the potential causes or mechanisms by which these identified mutations improve the polymerization efficiency of Mut_E10. Therefore, we conducted further analysis on these identified mutations. Except for the two sites (D141A, E143A) in the exonuclease region, the remaining nine mutation sites can be categorized into three distinct categories.

The first category of mutational sites includes those that affect the binding of Mut_E10 polymerase to the DNA strand. These sites are S383T, Y384F, V389I, V589H, T676K, and V680M, as shown in Fig 7D. MM-GBSA calculations indicate that modifications at these sites enhanced the ability of Mut_E10 to bind the DNA strand, accelerating DNA strand capture. Notably, residues 589H and 676K possess positive charges and are expected to exhibit higher binding affinities for negatively charged DNA when the polymerase binds to the DNA strand. As illustrated in Fig 7D, these two residues are located on opposite sides of the DNA double strand within the coiled structure of the polymerase, approximately 20–30 angstroms from the catalytic site. Due to their remote location relative to the catalytic site, these positions have difficulties in effectively receiving energy derived from adenosine phosphate hydrolysis through the protein structure. The transmission of this energy might occur through the double-stranded DNA (dsDNA) instead. Therefore, enhancing the polymerase’s binding affinity to dsDNA at these key positions can more effectively transmit energy, thereby enhancing protein conformational changes. Residues 383T, 384F, and 389I are located near the DNA strand and nucleotides and form the back of the active pocket, which likely contribute positively to the formation of a stable active pocket. Furthermore, residues 383T, 384F, 389I, and 676K are situated on the coil structure. To avoid disrupting the original coil structure, we screened for mutants that were similar to the original residues. These findings indicate that a stable coil or loop structure of the DNA polymerase plays an important role in modified nucleotide polymerization activity. Excessive and disruptive changes to these residues on the coil structure are thus not recommended.

The second category of sites includes those where mutations altered the dNTP catalytic pockets, such as L408I and Y409A, as shown in Fig 7D. The modified dATP structure (with an azido methyl terminal group at the 3’ end, a long linker, and Cy3 fluorescent group) is depicted in Fig 1B. When the modified dATP enters the catalytic pocket, its blocking group would collide with the surrounding L408/Y409/P410 residue side chains, as shown in Fig 7D. The mutations in motif A (L408I, Y409A, P410) of Mut_E10 have smaller side chains, thus provide more space to accommodate the incoming modified dATP, making Mut_E10 more active for modified dATP incorporation. Interestingly, Mut_1 with motif A (L408A, Y409A, P410A) should have even more space in this area for modified dATP, however, the catalytic activity of this variant was not optimal. Therefore, there may be other critical factors that should be considered affecting the catalytic efficiency of DNA polymerase besides the space in motif A, such as metal ions binding. Metal ions can modify the incorporation efficiency of DNA polymerase for dNMPs or dNTPs [59]. Further study could be done to investigate the interactions between metal ions and crucial residues. Overall, our findings suggest that when engineering DNA polymerase to incorporate a modified substrate with a larger group, using small side chain residues in the catalytic pockets is recommended.

In the third category of mutation sites, the A485E mutation affects the conformational change of the finger domain of KOD pol, as shown in Fig 7D. The binary complex structure of KOD pol that bound only to the DNA strand was compared with the ternary complex structure of KOD pol that bound to DNA complex and dATP, as depicted in Fig 2A. Previous reports have suggested that the movement of the finger domain may be advantageous for the fidelity of KOD DNA polymerase [20]. In Mut_E10, the A485 residue on the rotating alpha helix of the finger domain was substituted with Glu, and this resulted in the formation of a salt bridge (as illustrated in Fig 3B, top center) with Q332 on the opposite helix of the exonuclease domain, thereby preventing the rotation of the finger domain. Evidently, this mutation had a detrimental effect on the catalytic activity of KOD pol when natural nucleotides were used. However, in the case of modified nucleotides employed in NGS, which possess long linkers and fluorescent groups, the locking of the finger domain has the potential to enhance the stability of the modified substrate access channel in the substrate channel. The experimental outcomes indicated that this mutation significantly improved the enzyme’s catalytic efficiency.

In addition, the semi-rational evolution strategy employed for enhancing the efficiency of modified nucleotide incorporation by KOD pol exhibited satisfactory efficiency compared to other reported methods [20, 60]. Kennedy et al. conducted a steady-state kinetic analysis to examine the DNA synthesis process by DNA polymerase using natural or modified nucleotides [60]. They employed denaturing polyacrylamide gel electrophoresis to distinguish and assess the extension of individually modified nucleotides in an exogenous primer. In contrast, our approach utilizes a fluorescence signal as an indicator for kinetic assays, offering advantages such as higher throughput, increased sensitivity, and a more streamlined analysis process. Nikoomanzar et al. developed a microfluidic-based deep mutational scanning method to evolve a replicative DNA polymerase (KOD pol) from Thermococcus kodakarensis for TNA synthesis [20]. The authors identified positive sites in the finger subdomain of KOD pol through a single round of sorting from 912 mutants and subsequently combined them step-by-step to obtain a double mutant variant. In contrast, our screening strategies achieved higher screening efficiency, resulting in Mut_E10 containing nine mutation sites (besides 141A, 143A), with fewer experimentally screened mutants (less than150). Nevertheless, our overall semi-rational screening method using a microplate reader does not possess an overall lower throughput than the microfluidic-based droplet screening strategy.

Therefore, for future studies, the optimizing screening method could be a combination of microfluidic-based droplet screening with FRET based assays, which can overcome the throughput limitations of the present study. As, we only tested a small fraction of the variants from the computational library and combinatory library, and future studies could employ higher-throughput screening methods to investigate more variants that may have been missed during the evolutionary process. Moreover, further exploration of residues for continually improving catalytic efficiency of KOD pol can be conducted with the amino acid residues neighboring these key positions, such as motif B including residues 484/485/486, and residues around 389/589/676/680.

Conclusion

To conclude, this study presents a comprehensive workflow for engineering WT KOD DNA polymerase to catalyze unnatural nucleotides for application in NGS. The workflow can be divided into five stages: construction of the screening strategy, identification of key residues in the active pocket for study, identification of residues located in the DNA strand binding region for improvement, step-wise combination, and characterization of variants in the specific NGS application. The variant Mut_E10, which possesses eleven mutation sites, was found to achieve successful compatibility on both the tested sequencing platforms, namely BGISEQ-500 and MGISEQ-2000. These beneficial mutation sites identified in this study could act as inspiration for the engineering and evolution of other archaeal B-family DNA polymerases. In addition, the successful engineering of KOD DNA polymerase with improved catalytic efficiency for unnatural nucleotides proves the great potential of the enzyme engineering strategy constructed, which offers a new solution of polymerase engineering to fulfill versatile applications.

Materials and methods

Reagents and instruments

The primer and template labeled at the 5’-terminus by Cy5 dye were ordered from Invitrogen, by HPLC purification. The primer-template (P/T-2Cy5) was prepared by mixing the primer and template at a 1:1 molar ratio and annealing at 80°C for 10 min and then cooling down to room temperature in a water bath. The sequence information of P/T—2Cy5 is shown in Fig 1C. The primers designed with mutations were ordered from Genscript Biotech. 3’-O-azidomethyl-dATP-Cy3 (modified dATP) was supplied by the BGI synthetic chemistry group, and the structure was shown in Fig 1B. The site-directed mutagenesis kits were purchased from Thermo Fisher. Potassium phosphate, MgSO4, KCl, (NH4)2SO4, Tris, NaCl, HCl, EDTA, NaOH, imidazole, glycerol, lysozyme, LB broth, kanamycin, PMSF, and IPTG were purchased from Thermo Fisher. 12% SDS-PAGE gel and gel running buffer were purchased from Bio-Rad. Affinity chromatography columns (HisTrap HP 5ml column) and ion exchange columns (HiTrap Q FF 5mL and HiTrap SP HP 5ml column) were purchased from GE Healthcare. The ÄKTA Pure Protein Purification system was purchased from GE Healthcare. The microplate reader was purchased from Bio Tek.

Mutation library construction

The codon-optimized gene encoding this exonuclease-deficient KOD pol (named Mut_1 (D141A, E143A, L408A, Y409A, P410A), Seq_1) was synthesized by Genscript Biotech and then cloned into vector pD441 with kanamycin resistance. The corresponding expressed protein features a 6xHis tag and TEV protease cleavage site at the N-terminus. The first mutant library consisted of single-site saturation mutagenesis at four positions: 408, 409, 410, and 485. The saturation mutagenesis at positions 408, 409, 410, and 485 was performed based on the background of triple alanine substitutions at the other three positions. The PCR reaction conditions were as follows: one cycle at 98°C for 10 s; 20 cycles at 98°C for 10 s, 72°C for 2.5 min, followed by elongation at 72°C for 5 min and hold at 4°C. The PCR products were visualized by 1.2% agarose gel electrophoresis and ultraviolet light. Then, 1 μL of DpnI (NEB) was added to the PCR mix and incubated for 2 h at 37°C. 5μL of the digested PCR product was transformed into the 50 μL E. coli DH5α competent cells (Tiangen). Recombinant plasmids were prepared by using the QIAprep Spin Miniprep Kit (QIAGEN) and confirmed by sequencing analysis. 1 μL of each correct plasmid was transformed into 50 μL E. coli BL21(DE3) competent cell (Tiangen) for protein expression.

Protein expression and purification

1 mL of LB medium in a 96-deep-well plate was used for small-scale expression, and 1 L of LB medium in a conical flask was used for large-scale expression. Cells grown in LB medium containing 50 μg/mL kanamycin were induced at OD600 nm = 0.6–0.8 by the addition of 0.5 mM IPTG. Protein production was carried out at 25°C, 220 rpm overnight.

Semi-purification was performed using cell pellets collected from 1 mL culture in a 96-deep-well plate. The cell pellets were resuspended in Buffer 1 (20 mM Tris-HCl, 10 mM KCl, 10 mM (NH4)2SO4, 0.1% Triton, 4 mM MgSO4, 1.25 mM PMSF, and 1 mg/mL lysozyme, pH 7.6) with a ratio of 0.04 g cell pellets per mL buffer. The resuspended cells were incubated at 37°C for 10 minutes, followed by thermal denaturation at 80°C for 30 minutes. The supernatant containing proteins was collected after centrifugation at 12,000 X g for 20 minutes. The protein concentration was measured and estimated by using the Bradford protein assay [61].

We rigorously purified variants protein for use on NGS platforms as follows. For protein purification, cell pellets collected from 1 L cultivation were resuspended in Buffer 2 (500 mM NaCl, 5% glycerol, 20 mM imidazole, 1.25 mM PMSF, 50 mM potassium phosphate, pH 7.4) and then disrupted with a high-pressure homogenizer AH-1500 purchased from ATS engineering company. After a 30-minute thermal denaturation at 80°C, the centrifugation at 12,000 rpm 4°C (Beckman Avanti J-26) for 30 mins was performed to remove cell debris. After centrifugation, the supernatant was filtered through a 0.22 μm membrane filter. The subsequent purification process was performed using ÄKTA pure 25 (GE Healthcare) with Nickel affinity chromatography (5 mL HisTrap HP column, GE Healthcare), anion ion exchange chromatography (5 mL HiTrap Q FF column, GE Healthcare), and cation ion exchange chromatography (5 mL HiTrap SP HP column, GE Healthcare), sequentially. For the affinity purification process, the supernatant was loaded onto the pre-equilibrated column with a flow rate of 2.5 mL/min so that the retention time will be 2 min. A wash step using 50 mL Buffer 2 with a flow rate of 5 mL/min was added before elution. The elution procedure was performed with a linear gradient of Buffer 3 (500 mM NaCl, 5% glycerol, 500 mM imidazole, 50 mM potassium phosphate, pH 7.4) from 0 to 100% in 10 CV with a flow rate of 5 mL/min. Fractions with 280 nm UV signal above 100 mAu were collected. The collected samples from Ni purification were diluted 6-fold with Buffer 4 (5% glycerol, 25 mM potassium phosphate, pH 6.6) to decrease the concentration of NaCl to 50 mM. The diluted sample was loaded onto the pre-equilibrated HiTrap Q FF 5 mL column (GE Healthcare) at a flow rate of 2.5 mL/min. The flowthrough sample was collected and loaded onto the pre-equilibrated HiTrap SP HP 5 mL column (GE Healthcare) with a flow rate of 2.5 mL/min. The HiTrap SP HP 5 mL column was pre-equilibrated with Buffer 5 (50 mM NaCl, 5% Glycerol, 50 mM Potassium Phosphate, pH 7.4). The elution procedure was performed with a linear gradient of Buffer 6 (1 M NaCl, 5% Glycerol, 50 mM Potassium Phosphate, pH 6.6) from 0 to 60% in 10 CV. The first peak eluted was collected. Finally, the collected sample was dialyzed with Buffer 7 (20 mM Tris, 200 mM KCl, 0.2 mM EDTA, 5% glycerol, pH 7.4) at 4°C overnight and then stored at -20°C with 50% glycerol. Protein concentration was determined by measuring the absorbance at 280 nm using a microplate reader (Bio Tek) and calculated using the extinction coefficient of 1.370 M-1 cm-1 predicted by the ExPASy server. The purity of the protein was analyzed using 12% SDS-PAGE.

Enzyme activity screening of KOD variants

The polymerization reaction mixture contained 0.1 μM P/T-2Cy5, 0.25 μM modified dATP, 2.5 μM native dC/G/TTP each, 2 μg BSA (10 mg/mL), 1X KOD Screening Buffer (20 mM Tris-HCl, 10 mM KCl, 10 mM (NH4)2SO4, 0.1% Triton, 4 mM MgSO4, pH 8.8) in a total volume of 30 μL. Then, 0.5 μg KOD protein (0.1 μM) in 20 μL was added. The reaction was initiated by adding each semi-purified KOD mutant protein. The enzyme activity assay was performed in 384-well (black clear-bottom) plates (Corning®) at 40°C for 1–2 hours. The FRET Cy5 fluorescence signal was measured as relative fluorescence units (RFUs) in appropriate time intervals by exciting at 530 nm and detecting the emission at 676 nm. Measurements were performed with a gain setting at 60 and shaking for 10 seconds before measurement in a microplate reader (Bio Tek). The enzyme activity (reaction rate V) was defined as the maximum slope of the FRET Cy5 signal, expressed as RFUs/min. The data were analyzed and plotted using Origin software.

Enzyme kinetic assays

The reaction mixture contained 10 μM native dC/G/TTP each, 2 μg BSA (10 mg/mL), 1X KOD Screening Buffer (20mM Tris-HCl, 10 mM KCl, 10 mM (NH4)2SO4, 0.1% Triton, 4 mM MgSO4, pH 8.8) in a total volume of 30 μL. Then, 0.5 μg KOD protein (0.1 μM) in 20 μL was added. The kinetic assays were carried out with varying concentrations of P/T-2Cy5 or modified dATP ranging from 0 to 6 μM, while the concentration of modified dATP or P/T-2Cy5 remained constant at 2 μM. The reaction was initiated by adding KOD mutant protein. The assay was also performed in 384-well (black clear-bottom) plates (Corning®) at 40°C for 1–2 hours. The FRET Cy5 fluorescence signal was measured by exciting at 530 nm and detecting the emission at 676 nm. Measurements were performed with a gain setting at 60 and shaking for 10 seconds before measurement in a microplate reader (Bio Tek). The maximum slope (reaction rate V) of the FRET Cy5 signal for each concentration of P/T-2Cy5 was calculated and used for determining the kinetic parameters (kcat and Km). The kinetic data were fitted non-linearly using the Michaelis-Menten equation with GraphPad Prism 5 to obtain kcat and Km.

MM-GBSA screening method

The complex structure of KOD pol and dsDNA adopts the 5OMF PDB structure. Additional small molecules in both structures were removed, and the systems were energy minimized in 80 mM sodium chloride aqueous solution, followed by NVT ensemble MD relaxation of 50 ns [62]. We used the Modeler 9.19 software to carry out saturation mutagenesis at 93 residues on the polymerase and then processed all 1860 mutants uniformly. Due to this large number of calculations, we designed an automatic processing workflow in Python. In an 80 mM sodium chloride aqueous solution, five rounds of energy minimization were first carried out for the structure generated by homologous modeling. In the five rounds of simulations, the binding coefficient to heavy atoms of protein and DNA was gradually weakened from 80 kcal / (mol · Å) to 0 kcal / (mol · Å). After that, NVT ensemble MD was carried out at 350 K, and finally, NPT ensemble MD of 10–25 ns was carried out at 310 K. We sampled the trajectory of the system that reached equilibrium in the last part and calculated the binding energy by the MM-GBSA method [63].

Sequencing in the NGS platform

BGISEQ-500 is a desktop platform developed for DNA or RNA sequencing by BGI Research. It utilizes DNA NanoBalls (DNBs) technology (DNBSEQ). The dNTPs used in DNBSEQ technology were modified with four different fluorescent dyes attached to the base and a reversible blocking group in its 3’-side. To evaluate the sequencing performance of Mut_E10 in the BGISEQ-500 platform, an E. coli library prepared by following the instructions described in the manufacturer’s use manual of MGIEasy PCR-Free DNA Library Prep Kit (PN: 1000013452, MGI-tech) was used to make DNB following BGISEQ-500 protocol [54]. The DNBs were then loaded into the patterned nanoarrays. The SE50 sequencing kit was supplied by the BGI sequencing group. WT KOD pol, Mut_C2, and Mut_E10 were used to replace the supplied enzyme in the sequencing kit and performed sequencing tests in the BGISEQ-500 sequence platform employing the SE50 mode. Base calling and mapping were performed as previously described [49].

MGISEQ-2000 is a high-throughput platform developed by MGI Tech. The sequencing performance of Mut_E10 was also tested in the DNBSEQ-2000 platform employing CoolMPS technology. The cold nucleotides used in CoolMPS technology are monophosphate nucleotides with a 3’-O-azidomethyl blocking group. CoolMPS PE100 High-throughput Sequencing kit (PN: 1000016935) was purchased from MGI-Tech. The same E. coli library used for BGISEQ-500 was the same for MGISEQ-2000. The sequencing process followed the instructions described in the MGISEQ-2000 High-throughput Sequencing Set User Manual [51]. Mut_E10 was used to replace the supplied enzyme in the CoolMPS PE100 sequencing kit and sequence the E. coli library in different runs. Base calling and mapping steps were performed as reported previously [52].

The sequencing data that support the conclusions of this study are accessible in the CNGB Sequence Archive(CNSA) [64] of the China National GeneBank DataBase (CNGBdb) [65], under accession number CNP0000668.

Sequence information

>Seq_1(Mut_1)\\

MHHHHHHENLYFQGILDTDYITEDGKPVIRIFKKENGEFKIEYDRTFEPYFYALLKDDSAIEEVKKITAERHGTVVTVKRVEKVQKKFLGRPVEVWKLYFTHPQDVPAIRDKIREHPAVIDIYEYDIPFAKRYLIDKGLVPMEGDEELKMLAFAIATLYHEGEEFAEGPILMISYADEEGARVITWKNVDLPYVDVVSTEREMIKRFLRVVKEKDPDVLITYNGDNFDFAYLKKRCEKLGINFALGRDGSEPKIQRMGDRFAVEVKGRIHFDLYPVIRRTINLPTYTLEAVYEAVFGQPKEKVYAEEITTAWETGENLERVARYSMEDAKVTYELGKEFLPMEAQLSRLIGQSLWDVSRSSTGNLVEWFLLRKAYERNELAPNKPDEKELARRRQSYEGGYVKEPERGLWENIVYLDFRSAAASIIITHNVSPDTLNREGCKEYDVAPQVGHRFCKDFPGFIPSLLGDLLEERQKIKKKMKATIDPIERKLLDYRQRAIKILANSYYGYYGYARARWYCKECAESVTAWGREYITMTIKEIEEKYGFKVIYSDTDGFFATIPGADAETVKKKAMEFLKYINAKLPGALELEYEGFYKRGFFVTKKKYAVIDEEGKITTRGLEIVRRDWSEIAKETQARVLEALLKDGDVEKAVRIVKEVTEKLSKYEVPPEKLVIHEQITRDLKDYKATGPHVAVAKRLAARGVKIRPGTVISYIVLKGSGRIGDRAIPFDEFDPTKHKYDAEYYIENQVLPAVERILRAFGYRKEDLRYQKTRQVGLSAWLKPKGT*

Supporting information

S1 File. Supplementary information and data.

https://doi.org/10.1371/journal.pone.0316531.s001

(DOCX)

Acknowledgments

This study was supported by the Shenzhen Engineering Laboratory Molecular Enzymology (Fa Gaishen [2018] No. 958). W.L. Thanks to China National GeneBank, BGI, Shenzhen 518120, China & Sequencing Platform from BGI Research, Shenzhen 518083, China & Sequencing Platform from MGI Tech, Shenzhen 518083, China.

References

  1. 1. Zhang L, Kang M, Xu J, Huang Y. Archaeal DNA polymerases in biotechnology. Appl Microbiol Biotechnol 2015;99:6585–97. pmid:26150245
  2. 2. Terpe K. Overview of thermostable DNA polymerases for classical PCR applications: from molecular and biochemical fundamentals to commercial systems. Appl Microbiol Biotechnol 2013;97:10243–54. pmid:24177730
  3. 3. Pinheiro VB. Engineering-driven biological insights into DNA polymerase mechanism. Curr Opin Biotechnol 2019;60:9–16. pmid:30502514
  4. 4. Aschenbrenner J, Marx A. DNA polymerases and biotechnological applications. Curr Opin Biotechnol 2017;48:187–95. pmid:28618333
  5. 5. Wang Y, Shi Y, Hellinga HW, Beese LS. Thermally controlled intein splicing of engineered DNA polymerases provides a robust and generalizable solution for accurate and sensitive molecular diagnostics. Nucleic Acids Res 2023:gkad368. pmid:37166959
  6. 6. Leconte AM, Patel MP, Sass LE, McInerney P, Jarosz M, Kung L, et al. Directed Evolution of DNA Polymerases for Next-Generation Sequencing. Angew Chem 2010;122:6057–60. pmid:20629059
  7. 7. Chen C-Y. DNA polymerases drive DNA sequencing-by-synthesis technologies: both past and present. Front Microbiol 2014;5. pmid:25009536
  8. 8. Huber C, von Watzdorf J, Marx A. 5-methylcytosine-sensitive variants of Thermococcus kodakaraensis DNA polymerase. Nucleic Acids Res 2016:gkw812. pmid:27651460
  9. 9. Guo J, Xu N, Li Z, Zhang S, Wu J, Kim DH, et al. Four-color DNA sequencing with 3′- O -modified nucleotide reversible terminators and chemically cleavable fluorescent dideoxynucleotides. Proc Natl Acad Sci 2008;105:9145–50. pmid:18591653
  10. 10. Gardner AF, Jackson KM, Boyle MM, Buss JA, Potapov V, Gehring AM, et al. Therminator DNA Polymerase: Modified Nucleotides and Unnatural Substrates. Front Mol Biosci 2019;6:28. pmid:31069234
  11. 11. Rashid N, Aslam M. An overview of 25 years of research on Thermococcus kodakarensis, a genetically versatile model organism for archaeal research. Folia Microbiol (Praha) 2020;65:67–78. pmid:31286382
  12. 12. Wynne SA, Pinheiro VB, Holliger P, Leslie AGW. Structures of an Apo and a Binary Complex of an Evolved Archeal B Family DNA Polymerase Capable of Synthesising Highly Cy-Dye Labelled DNA. PLoS ONE 2013;8:e70892. pmid:23940661
  13. 13. Kropp HM, Diederichs K, Marx A. The Structure of an Archaeal B-Family DNA Polymerase in Complex with a Chemically Modified Nucleotide. Angew Chem Int Ed 2019;58:5457–61. pmid:30761722
  14. 14. Eriksen DT, Lian J, Zhao H. Protein design for pathway engineering. J Struct Biol 2014;185:234–42. pmid:23558037
  15. 15. Tobin MB, Gustafsson C, Huisman GW. Directed evolution: the ‘rational’ basis for ‘irrational’ design. Curr Opin Struct Biol 2000;10:421–7. pmid:10981629
  16. 16. Siegel JB, Zanghellini A, Lovick HM, Kiss G, Lambert AR, St.Clair JL, et al. Computational Design of an Enzyme Catalyst for a Stereoselective Bimolecular Diels-Alder Reaction. Science 2010;329:309–13. pmid:20647463
  17. 17. Ge Q, Jing Z, Ping Z, Jibin S, Zhoutong S. Recent advances in directed evolution 2018:9.
  18. 18. Coco WM, Levinson WE, Crist MJ, Hektor HJ, Darzins A, Pienkos PT, et al. DNA shuffling method for generating highly recombined genes and evolved enzymes. Nat Biotechnol 2001;19:354–9. pmid:11283594
  19. 19. Tubeleviciute A, Skirgaila R. Compartmentalized self-replication (CSR) selection of Thermococcus litoralis Sh1B DNA polymerase for diminished uracil binding. Protein Eng Des Sel 2010;23:589–97. pmid:20513707
  20. 20. Nikoomanzar A, Vallejo D, Chaput JC. Elucidating the Determinants of Polymerase Specificity by Microfluidic-Based Deep Mutational Scanning. ACS Synth Biol 2019;8:1421–9. pmid:31081325
  21. 21. Zeng W, Guo L, Xu S, Chen J, Zhou J. High-Throughput Screening Technology in Industrial Biotechnology. Trends Biotechnol 2020;38:888–906. pmid:32005372
  22. 22. Zhang C, Johnson LW. Microfluidic Control of Fluorescence Resonance Energy Transfer: Breaking the FRET Limit*. Angew Chem 2007:4.
  23. 23. Wu PG, Brand L. Resonance Energy Transfer: Methods and Applications. Anal Biochem 1994;218:1–13. pmid:8053542
  24. 24. Chan FK-M, Siegel RM, Zacharias D, Swofford R, Holmes KL, Tsien RY, et al. Fluorescence resonance energy transfer analysis of cell surface receptor interactions and signaling using spectral variants of the green fluorescent protein. Cytometry 2001;44:361–8. pmid:11500853
  25. 25. Stryer L, Thomas DD, Meares CF. Diffusion-Enhanced Fluorescence Energy Transfer. Annu Rev Biophys Bioeng 1982;11:203–22. pmid:7049062
  26. 26. Chen Y, Huang W-C, Yang C-S, Cheng F-J, Chiu Y-F, Chen H-F, et al. Screening strategy of TMPRSS2 inhibitors by FRET-based enzymatic activity for TMPRSS2-based cancer and COVID-19 treatment. Am J Cancer Res 2021;11:827–36. pmid:33791156
  27. 27. Kuwahara M, Takano Y, Kasahara Y, Nara H, Ozaki H, Sawai H, et al. Study on Suitability of KOD DNA Polymerase for Enzymatic Production of Artificial Nucleic Acids Using Base/Sugar Modified Nucleoside Triphosphates. Molecules 2010;15:8229–40. pmid:21076389
  28. 28. Atomi H, Fukui T, Kanai T, Morikawa M, Imanaka T. Description of Thermococcus kodakaraensis sp. nov., a well studied hyperthermophilic archaeon previously reported as Pyrococcus sp. KOD1. Archaea 2004;1:263–7. pmid:15810436
  29. 29. Sawai H, Ozaki AN, Satoh F, Ohbayashi T, Masud MM, Ozaki H. Expansion of structural and functional diversities of DNA using new 5-substituted deoxyuridine derivatives by PCR with superthermophilic KOD Dash DNA polymerase Electronic supplementary information (ESI) available: sequencing of the PCR products (108mer DNA) from substrates 1 and 2. See http://www.rsc.org/suppdata/cc/b1/b107838k/. Chem Commun 2001:2604–5.
  30. 30. Horhota A, Zou K, Ichida JK, Yu B, McLaughlin LW, Szostak JW, et al. Kinetic Analysis of an Efficient DNA-Dependent TNA Polymerase. J Am Chem Soc 2005;127:7427–34. pmid:15898792
  31. 31. Hottin A, Marx A. Structural Insights into the Processing of Nucleobase-Modified Nucleotides by DNA Polymerases. Acc Chem Res 2016;49:418–27. pmid:26947566
  32. 32. Ghosh P, Kropp HM, Betz K, Ludmann S, Diederichs K, Marx A, et al. Microenvironment-Sensitive Fluorescent Nucleotide Probes from Benzofuran, Benzothiophene, and Selenophene as Substrates for DNA Polymerases. J Am Chem Soc 2022;144:10556–69. pmid:35666775
  33. 33. Hoshino H, Kasahara Y, Fujita H, Kuwahara M, Morihiro K, Tsunoda S, et al. Consecutive incorporation of functionalized nucleotides with amphiphilic side chains by novel KOD polymerase mutant. Bioorg Med Chem Lett 2016;26:530–3. pmid:26627581
  34. 34. Sabat N, Katkevica D, Pajuste K, Flamme M, Stämpfli A, Katkevics M, et al. Towards the controlled enzymatic synthesis of LNA containing oligonucleotides. Front Chem 2023;11:1161462. pmid:37179777
  35. 35. Bergen K, Betz K, Welte W, Diederichs K, Marx A. Structures of KOD and 9°N DNA Polymerases Complexed with Primer Template Duplex. ChemBioChem 2013;14:1058–62. pmid:23733496
  36. 36. Kennedy EM, Hergott C, Dewhurst S, Kim B. The Mechanistic Architecture of Thermostable Pyrococcus furiosus Family B DNA Polymerase Motif A and Its Interaction with the dNTP Substrate. Biochemistry 2009;48:11161–8. pmid:19817489
  37. 37. Pinheiro VB. Engineering-driven biological insights into DNA polymerase mechanism. Curr Opin Biotechnol 2019;60:9–16.
  38. 38. Hashimoto H, Nishioka M, Fujiwara S, Takagi M, Imanaka T, Inoue T, et al. Crystal structure of DNA polymerase from hyperthermophilic archaeon Pyrococcus kodakaraensis KOD1 11Edited by Huber R. J Mol Biol 2001;306:469–77. pmid:11178906
  39. 39. Fujiwara S, Takagi M, Imanaka T. Archaeon Pyrococcus kodakaraensis KOD1: application and evolution. Biotechnol. Annu. Rev., vol. 4, Elsevier; 1998, p. 259–84. pmid:9890143
  40. 40. Moreira IS, Fernandes PA, Ramos MJ. Computational alanine scanning mutagenesis—An improved methodological approach. J Comput Chem 2007;28:644–54. pmid:17195156
  41. 41. A site-directed mutagenesis method particularly useful for creating otherwise difficult-to-make mutants and alanine scanning | Elsevier Enhanced Reader n.d.
  42. 42. Brown JA, Suo Z. Unlocking the Sugar “Steric Gate” of DNA Polymerases. Biochemistry 2011;50:1135–42. pmid:21226515
  43. 43. Gardner A. Determinants of nucleotide sugar recognition in an archaeon DNA polymerase. Nucleic Acids Res 1999;27:2545–53. pmid:10352184
  44. 44. Gardner AF, Jackson KM, Boyle MM, Buss JA, Potapov V, Gehring AM, et al. Therminator DNA Polymerase: Modified Nucleotides and Unnatural Substrates. Front Mol Biosci 2019;6:28.
  45. 45. Raper AT, Reed AJ, Suo Z. Kinetic Mechanism of DNA Polymerases: Contributions of Conformational Dynamics and a Third Divalent Metal Ion. Chem Rev 2018;118:6000–25. pmid:29863852
  46. 46. O’Flaherty DK, Guengerich FP. Steady-State Kinetic Analysis of DNA Polymerase Single-Nucleotide Incorporation Products. Curr Protoc Nucleic Acid Chem 2014;59. pmid:25501593
  47. 47. Blasco MA, Méndez J, Lázaro JM, Blanco L, Salas M. Primer Terminus Stabilization at the φ29 DNA Polymerase Active Site. J Biol Chem 1995;270:2735–40. pmid:7852344
  48. 48. Leconte AM, Patel MP, Sass LE, McInerney P, Jarosz M, Kung L, et al. Directed Evolution of DNA Polymerases for Next-Generation Sequencing. Angew Chem 2010;122:6057–60.
  49. 49. Huang J, Liang X, Xuan Y, Geng C, Li Y, Lu H, et al. A reference human genome dataset of the BGISEQ-500 sequencer. GigaScience 2017;6. pmid:28379488
  50. 50. Lang J, Zhu R, Sun X, Zhu S, Li T, Shi X, et al. Evaluation of the MGISEQ-2000 Sequencing Platform for Illumina Target Capture Sequencing Libraries. Front Genet 2021;12:730519. pmid:34777467
  51. 51. Drmanac S, Callow M, Chen L, Zhou P, Eckhardt L, Xu C, et al. CoolMPS: Advanced massively parallel sequencing using antibodies specific to each natural nucleobase. Genomics; 2020.
  52. 52. Korostin D, Kulemin N, Naumov V, Belova V, Kwon D, Gorbachev A. Comparative analysis of novel MGISEQ-2000 sequencing platform vs Illumina HiSeq 2500 for whole-genome sequencing. PLOS ONE 2020;15:e0230301. pmid:32176719
  53. 53. Tiacci L. Buffer allocation vs. sequencing optimization: which of the two is most effective to improve the efficiency of assembly lines? IFAC-Pap 2022;55:452–7.
  54. 54. Huang J, Liang X, Xuan Y, Geng C, Li Y, Lu H, et al. BGISEQ-500 Sequencing v1. 2018. https://doi.org/10.17504/protocols.io.pq7dmzn.
  55. 55. Zavgorodny S, Polianski M, Besidsky E, Kriukov V, Sanin A, Pokrovskaya M, et al. 1-alkylthioalkylation of nucleoside hydroxyl functions and its synthetic applications: a new versatile method in nucleoside chemistry. Tetrahedron Lett 1991;32:7593–6.
  56. 56. Drmanac R, Drmanac S, Li H, Xu X, Callow MJ, ECKHARDT L, et al. Stepwise sequencing by non-labeled reversible terminators or natural nucleotides. Google Patents; 2018.
  57. 57. Yudkina AV, Zharkov DO. Miscoding and DNA Polymerase Stalling by Methoxyamine-Adducted Abasic Sites. Chem Res Toxicol 2022;35:303–14. pmid:35089032
  58. 58. Endutkin AV, Yudkina AV, Zharkov TD, Kim DV, Zharkov DO. Recognition of a Clickable Abasic Site Analog by DNA Polymerases and DNA Repair Enzymes. Int J Mol Sci 2022;23:13353. pmid:36362137
  59. 59. Tabor S, Richardson CC. Effect of manganese ions on the incorporation of dideoxynucleotides by bacteriophage T7 DNA polymerase and Escherichia coli DNA polymerase I. Proc Natl Acad Sci 1989;86:4076–80. pmid:2657738
  60. 60. Kennedy EM, Hergott C, Dewhurst S, Kim B. The Mechanistic Architecture of Thermostable Pyrococcus furiosus Family B DNA Polymerase Motif A and Its Interaction with the dNTP Substrate. Biochemistry 2009;48:11161–8.
  61. 61. Kielkopf CL, Bauer W, Urbatsch IL. Bradford Assay for Determining Protein Concentration. Cold Spring Harb Protoc 2020;2020:pdb.prot102269. pmid:32238597
  62. 62. Genheden S, Ryde U. The MM/PBSA and MM/GBSA methods to estimate ligand-binding affinities. Expert Opin Drug Discov 2015;10:449–61. pmid:25835573
  63. 63. Poli G, Granchi C, Rizzolio F, Tuccinardi T. Application of MM-PBSA Methods in Virtual Screening. Molecules 2020;25:1971. pmid:32340232
  64. 64. Guo X, Chen F, Gao F, Li L, Liu K, You L, et al. CNSA: a data repository for archiving omics data. Database 2020;2020:baaa055. pmid:32705130
  65. 65. Chen, F.Z., You, L.J., Yang, F., Wang, L.N., Guo, X.Q., Gao, F., Hua, C., Tan, C., Fang, L., Shan, R.Q., et al. (2020). CNGBdb: China National GeneBank DataBase. Yi Chuan 42, 799–809.