Covalent Bonding of Pyrrolobenzodiazepines (PBDs) to Terminal Guanine Residues within Duplex and Hairpin DNA Fragments

Pyrrolobenzodiazepines (PBDs) are covalent-binding DNA-interactive agents with growing importance as payloads in Antibody Drug Conjugates (ADCs). Until now, PBDs were thought to covalently bond to C2-NH2 groups of guanines in the DNA-minor groove across a three-base-pair recognition sequence. Using HPLC/MS methodology with designed hairpin and duplex oligonucleotides, we have now demonstrated that the PBD Dimer SJG-136 and the C8-conjugated PBD Monomer GWL-78 can covalently bond to a terminal guanine of DNA, with the PBD skeleton spanning only two base pairs. Control experiments with the non-C8-conjugated anthramycin along with molecular dynamics simulations suggest that the C8-substituent of a PBD Monomer, or one-half of a PBD Dimer, may provide stability for the adduct. This observation highlights the importance of PBD C8-substituents, and also suggests that PBDs may bind to terminal guanines within stretches of DNA in cells, thus representing a potentially novel mechanism of action at the end of DNA strand breaks.


Introduction
The pyrrolo[2,1-c][1,4]benzodiazepines (PBDs) are a family of sequence-selective DNA minor-groove binding agents [1][2][3][4][5] which are growing in importance due to their use as payloads in Antibody-Drug Conjugates (ADCs) [6][7][8]. The naturally occurring PBDs produced by Streptomyces and Micrococcus species are monomeric (e.g., anthramycin 1, Fig 1) and form singly-alkylated DNA-adducts, whereas the synthetic PBD Dimers consist of two PBD units joined through a C8/C8'-linker and can form interstrand or intrastrand DNA cross-links in addition to mono-adducts [9][10][11][12]. One PBD Dimer, SJG-136 (2, Fig 1), successfully completed Phase I clinical trials [13][14][15], and reached Phase II evaluation in ovarian and hematological cancers. The DNA-binding affinity and cytotoxicity of PBD Monomers have been enhanced by attaching heterocyclic units to the C8-position (e.g., GWL-78, 3, Fig 1), and molecules of this type have been shown to inhibit the binding of certain transcription factors to their consensus tumour cells, with the repair response dependent on cell type, and the extent and duration of exposure to the agent [37]. Tumour cells are often deficient in one or more relevant DNA repair pathways, thus leading to selective cytotoxicity and an in vivo antitumour effect [38,39].
The initial objective of this study was to investigate the interaction of the monomeric PBDs anthramycin (1) and GWL-78 (3), and the PBD Dimer SJG-136 (2), with transcription factor recognition sequences, and to identify the reactive guanine(s) within these sequences. For this purpose we obtained synthetic transcription factor recognition sequences with guanines located at different positions within DNA hairpin and duplex structures. These guanines were consecutively replaced with inosine bases to remove the nucleophilic C2-NH2 functionalities and so prevent covalent interaction. Through this process, we made the surprising observation that 2 and 3 can covalently bond to terminal guanine residues, a previously unobserved phenomenon. This has possible consequences for the biological mechanism of action of C8-substituted PBD Monomers and Dimers, as it means that they may be able to bind to the ends of strand breaks in the DNA of cells.

Single-Stranded and Hairpin Oligonucleotides
Single-stranded (SS) and hairpin oligonucleotides were purchased from Eurogentec (Southampton, UK) in lyophilized form. They were annealed to form double-stranded (DS) DNA according to the procedures described below.

Annealed Double-Stranded Oligonucleotides
Each single-stranded oligonucleotide was dissolved in a 1:1 mixture of annealing buffer (10 mM Tris-HCl pH 8.5/50 mM sodium chloride/1 mM EDTA) and 100 mM ammonium acetate to form stock solutions of 1 mM. Solutions of DS DNA were prepared by heating to 85°C for 10 min in a heating/cooling block (Grant Bio, UK), then allowing them to cool to room temperature, followed by storage at -20°C overnight to ensure completion of the annealing process. Working solutions of double-stranded oligonucleotides of 25 μM were prepared by diluting stored stock solutions with 20 mM ammonium acetate (Sigma-Aldrich, UK) followed by storage at -20°C.

Annealed Hairpin Oligonucleotides
Each hairpin-forming single-stranded oligonucleotide was dissolved in annealing buffer (10 mM Tris-HCl pH 8.5/50 mM sodium chloride/1 mM EDTA in a 1:1 ratio) to form stock solutions of 1 mM. For hairpin formation, solutions were heated to 85°C for 10 min in a heating/cooling block (Grant Bio, UK), and then allowed to cool slowly to room temperature followed by storage at -20°C overnight for completion of the annealing process. Working solutions of hairpin oligonucleotides of 25 μM were prepared by diluting stored stock solutions with 100 mM ammonium acetate (Sigma-Aldrich, UK) followed by storage at -20°C. Hairpin oligonucleotides were designed (i.e., 19 bases in length) to ensure the presence of a DNA minor-groove environment necessary for covalent binding of a PBD, rather than for any particular biological relevance.
Fisher Scientific, UK), and these were stored at -20°C for not more than two weeks, and thawed to room temperature when required.

Preparation of PBD/DNA Complexes
PBD/DNA complexes were prepared by adding one of the PBD working solutions (100 μM) to an annealed hairpin or double-stranded oligonucleotide working solution (25 μM) in a 4:1 ratio (PBD/DNA). The mixture was agitated for 5-10 seconds using a vortex mixer, and then incubated for 24 hours at 25°C before subjecting to ion-pair reversed-phase HPLC and mass spectrometry analysis.
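As a quick check on the mixing step, the PBD/DNA molar ratio follows from the two working concentrations and the volumes combined. The sketch below assumes that equal volumes of the two working solutions are mixed, which is not stated explicitly above. The function name and the 50 μL volumes are illustrative only.

```python
def molar_ratio(conc_a_uM, vol_a_uL, conc_b_uM, vol_b_uL):
    """Return moles(A) / moles(B) for two solutions that are mixed together."""
    # concentration (umol/L) x volume (uL) gives an amount in pmol
    amount_a_pmol = conc_a_uM * vol_a_uL
    amount_b_pmol = conc_b_uM * vol_b_uL
    return amount_a_pmol / amount_b_pmol


# 100 uM PBD and 25 uM DNA working solutions (from the text);
# the equal 50 uL volumes are an assumption made for illustration.
ratio = molar_ratio(100.0, 50.0, 25.0, 50.0)
print(f"PBD/DNA molar ratio = {ratio:.0f}:1")
```

Under that equal-volume assumption, the stated working concentrations reproduce the 4:1 PBD/DNA ratio used throughout.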

Ion-Pair Reversed-Phase HPLC Analysis
Liquid chromatography was performed on a Thermo Scientific UltiMate 3000 system equipped with a 2.1 x 50 mm XBridge™ OST C18 column packed with 2.5 μm particles (Waters Ltd., UK), using Chromeleon 7 software (Version 7.1.1.1127). The gradient system used for LC analysis consisted of 100 mM triethylammonium bicarbonate (TEAB) as Buffer A and 40% acetonitrile (ACN) in water (HPLC grade, Fisher Scientific, UK) as Buffer B. For Buffer A, a 1 M pre-formulated TEAB solution was purchased from Sigma-Aldrich (UK), and diluted to the required concentration with HPLC grade water. The gradient was ramped from 90% A at 0 min to 55% A at 18 min, then to 20% A at 22 min, and finally to 10% A at 23.5 min. UV absorbance was monitored at 254 nm.
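The gradient above can be expressed as a small lookup that returns the mobile-phase composition at any time point. This is a minimal sketch that assumes linear ramps between the stated breakpoints (the ramp shape is not specified in the text). The names `GRADIENT` and `percent_a` are hypothetical.

```python
# (time in min, % Buffer A) breakpoints taken from the gradient described above
GRADIENT = [(0.0, 90.0), (18.0, 55.0), (22.0, 20.0), (23.5, 10.0)]


def percent_a(t_min, table=GRADIENT):
    """Percent Buffer A at time t_min, assuming linear ramps between breakpoints."""
    if t_min <= table[0][0]:
        return table[0][1]
    for (t0, a0), (t1, a1) in zip(table, table[1:]):
        if t_min <= t1:
            # linear interpolation within the current segment
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)
    # after the last breakpoint, hold the final composition
    return table[-1][1]


for t in (0.0, 9.0, 18.0, 20.0, 23.5):
    a = percent_a(t)
    print(f"t = {t:4.1f} min: {a:5.1f}% A, {100.0 - a:5.1f}% B")
```

Buffer B is 40% acetonitrile, so the actual acetonitrile content at any time is 0.4 times the % B value.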

Mass Spectrometric Analysis (LC-ESI-MS/MS)
Chromatographic separation of DNA and SJG-136/DNA adduct complexes was performed on an Agilent 1200 HPLC system utilising an XBridge 50 x 2.1 mm C18 column packed with 2.5 μm particles (Waters, UK). Mobile phase A consisted of 50 mM TEAB in water pH 7.4, and mobile phase B consisted of 40% acetonitrile in water. After 1 min at 10% B, the proportion of B was raised to 80% over the next 18 min, followed by an increase to 100% B over the next 4 min, maintained at 100% B for a further 1.5 min, before returning to 10% B in 6 seconds and re-equilibration for a further 3 min 54 seconds, providing a total run time of 28 min. The flow rate was maintained at 300 μL/min and the eluent directed to the Dual-ESI source of an Agilent 6510 Q-TOF mass spectrometer. Ions from the Dual-ESI source, operated in the negative ESI mode, were transmitted to the pusher of the Q-TOF in either MS or MS/MS mode. The full mass range was set from m/z 300 to 5,000 for MS mode with an acquisition rate of 0.58 and time of 1716.8 corresponding to 9652 transients/spectrum.
For LC-MS/MS analysis of the Hairpin oligonucleotides, each sample (25 μM in 100 mM ammonium acetate) was injected (20 μL) onto the reversed-phase column and eluted into the Q-TOF at a flow-rate of 300 μL/min. Dual-ESI parameters were as follows: gas temperature, 350°C; drying gas, 10 L/min; nebuliser, 50 psig; VCap, 3500; fragmentor, 20 V; skimmer, 65 and OCT1 RFVpp, 750. The acquisition rate was 9652 transients/spectrum for MS and MS/MS mode. Mass spectra were acquired in profile and centroid mode. The Q-TOF was calibrated continuously using a lock-mass calibration mixture during analytical runs, and mass accuracy was in all cases better than 2 ppm. The Q-TOF was operated at a resolution of 30,000.
For the analysis of SJG-136/DNA adduct samples, 20 μL of the reaction mixture was injected onto the C18 column. MS/MS spectra were recorded in a targeted MS/MS acquisition mode with the mass range set from m/z 100 to 4000, with an acquisition rate of 0.58 and a time of 1716.8 corresponding to 9652 transients/spectrum. The targeted list was set for [M-3H]3- ions corresponding to DNA or SJG-136/DNA, as determined from the full MS mass spectra. For the acquisition of MS/MS spectra, the collision energy was set to 20, 30 and 40 to determine the maximum number of MS/MS fragments. The MS/MS isolation width was set to medium (4 m/z) to allow the selection of mono-isotopic precursor ions.
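For orientation, the m/z of a triply deprotonated [M-3H]3- precursor can be estimated from a neutral mass. The sketch below is illustrative only. It treats the theoretical value quoted in the Results for the SJG-136/Hairpin-1 adduct (6350.41) as a neutral mass, which is an assumption, and the function name is hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def mz_deprotonated(neutral_mass, charge):
    """m/z of an [M - nH]n- ion formed by loss of n protons (n = charge)."""
    return (neutral_mass - charge * PROTON_MASS) / charge


# 6350.41 is the theoretical adduct mass quoted in the Results; using it as
# the neutral mass M is an assumption made for this illustration.
target = mz_deprotonated(6350.41, 3)
print(f"[M-3H]3- target m/z = {target:.2f}")
```

A value computed this way falls within the m/z 300 to 5,000 full-scan window described above.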

Molecular Modelling and Molecular Dynamics Simulations
Hairpin and duplex DNA sequences used in this study (including the TTT-loop regions) were constructed using the AMBER module nab [40]. When present, the TTT-loop was covalently linked to the backbone of the DNA using xleap and parameters derived in-house. Inosine-containing sequences were created through the deletion of the exocyclic amine groups of guanine residues. Anthramycin (1), SJG-136 (2) and GWL-78 (3) were docked in the minor groove using the AMBER module xleap, in which parm99SB, modified parmbsc0 [41] and GAFF AMBER force field parameters were loaded. Antechamber was used to construct .mol2 files through the addition of Gasteiger charges, and missing parameters were generated using parmchk. A covalent bond was generated between the PBD and the exocyclic amine groups of selected guanines (guided by molecular mechanics calculations [42]), to form either mono-alkylated or inter/intra-strand cross-linked adducts. Energy minimisation was then undertaken in a gradient manner by initially placing the DNA under a high force constraint (i.e., 500 kcal mol-1 Å-2), which was then reduced in stages to zero to enable the PBD molecule to find its local energy minimum, followed by reduction in force in a periodic manner with a relaxation of restraints. Production simulations in implicit solvent (GBSA) were run for a period of 10 ns, and atomic coordinates were saved at 1 ps intervals. A ptraj script was used to ascertain the lowest potential energy derived during adduct simulations, and extended simulations (50 ns in duration) were also undertaken on selected ligand:DNA adducts to show that increasing simulation time did not have any effect on differences in potential energies (data presented in S27 Fig). Hydrogen-bonding analysis of molecular dynamics simulations was undertaken using VMD [43]. All models were created using Chimera [44].
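The bookkeeping behind the trajectory analysis can be sketched in a few lines: the number of saved frames follows from the simulation length and the save interval, and the lowest potential energy is a minimum over the per-frame energies. This is not the ptraj script used in the study. The function names are hypothetical and the three energy values are placeholders, not simulation data.

```python
def n_frames(sim_length_ns, save_interval_ps):
    """Number of coordinate frames saved over a production run."""
    return round(sim_length_ns * 1000.0 / save_interval_ps)


def lowest_energy_frame(energies):
    """Return (frame index, energy) for the lowest potential energy."""
    idx = min(range(len(energies)), key=energies.__getitem__)
    return idx, energies[idx]


print(n_frames(10, 1))   # frames for the 10 ns production runs
print(n_frames(50, 1))   # frames for the extended 50 ns runs

# Placeholder per-frame energies (kcal/mol), for illustration only.
toy_energies = [-10.0, -12.5, -11.0]
print(lowest_energy_frame(toy_energies))
```

With a 1 ps save interval, the 10 ns and 50 ns runs correspond to 10,000 and 50,000 frames, respectively.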

Results
During a study of the interaction of SJG-136 (2) with various DNA transcription factor recognition sequences [33], we observed that it rapidly formed an adduct with the AP-1 hairpin DNA sequence (Hairpin-1, Fig 2), with a major new peak appearing in the chromatogram at retention time (RT) 7.53 min after 24 hours of incubation, and with reaction complete by 24 hours (S1A and S1B Fig). The stoichiometry of the adduct was confirmed as 1:1 SJG-136/Hairpin-1 by MALDI-TOF-MS with an observed mass of 6351.2 m/z (theoretical mass: 6350.41 m/z) (S1C Fig). Given that we had previously shown [45] that 2 can form mono-adducts, and intra- and interstrand cross-links of different lengths according to the following rank order of preference: Pu-GAATG-Py > Pu-GATC-Py >> Pu-GATG-Py and Pu-GAATC-Py [9], we initially assumed that the adduct formed was most likely to be either the extended G7-G17 3'-GTAAC-5' interstrand cross-link, a G7 or G17 mono-alkylated adduct, or a combination of one or more of these (Fig 3). In order to investigate this further, we designed the DNA Hairpins 2-4 (Fig 2), which are based on the same AP-1 sequence but with each hairpin having two of its four guanines replaced with non-nucleophilic inosines (i.e., guanine residues without nucleophilic C2-NH2 groups), still allowing the possibility of interstrand cross-link formation. We also designed Hairpins 5-8, which have three of their four guanines replaced with inosines, thus leaving only one reactive guanine (i.e., G1, G7, G12 or G17) in each case, to study mono-alkylation.
Fig 2. Structures of Hairpins 1-9 used in this study. Hairpin-1, the AP-1 transcription factor recognition sequence; Hairpins 2-4, the same AP-1 sequence but with two of the four guanines replaced with inosines in each case to study cross-linking; Hairpins 5-8, the same AP-1 sequence but with three of the four guanines replaced with inosines in each case to study mono-alkylation; Hairpin-9, the same AP-1 sequence but with three of the four guanine bases (except the 5'-terminal guanine) mutated to an A.

Interaction of SJG-136 (2) with the Inosine-Modified AP-1 Hairpin Oligonucleotides (Hairpins 2-8)
The substitution patterns of Hairpins 2-4 (Fig 2) allowed the possibility of interstrand cross-linking at G7 and G17 (Hairpin-2), G1 and G12 (Hairpin-3) and G1 and G17 (Hairpin-4). HPLC analysis of annealed Hairpins 2-4 showed single peaks for the hairpin sequences with retention times ranging from 6.78 to 7.44 min (Table 1). The hairpin sequences provided correct m/z values within a narrow range of 5764.1 to 5764.5 by MALDI-TOF-MS (Table 1). Hairpins 2-4 were incubated with 2 at 25°C for 24 hours in a 4:1 ratio, and new adduct peaks were observed at retention times 8.40 min, 8.08 min and 8.04 min, respectively (Table 1). In the case of Hairpin-2, a complete disappearance of the hairpin DNA peak was observed (S2A and S2B Fig, Table 1). The formation of an adduct with Hairpin-3 was a surprising result, as a PBD had not previously been observed to bond to a terminal guanine (i.e., G1), and G12 was considered to be too close to the TTT-loop to covalently bond to a PBD. Furthermore, an interstrand cross-link between G1 and G12 would be too long to form. Therefore, our conclusion was that, for Hairpin-3, 2 was bonding to the terminal G1 via one PBD unit, with the second PBD unit pointing towards the TTT-loop (Fig 3).
In order to study mono-adduct distribution using the same HPLC/MS methodology, a further four AP-1 hairpins (Hairpins 5-8) were designed, based on the same AP-1 sequence but with a different three of the four guanines replaced with inosines in each case to leave just one reactive guanine (i.e., G1, G7, G12 or G17 in Hairpins 5, 6, 7 and 8, respectively) (Fig 2). HPLC analysis of annealed Hairpin-5, with only the terminal guanine (i.e., G1) available, gave a single peak at RT 7.64 min (Fig 4A) which provided the correct m/z (5748.30) for this oligonucleotide by MALDI-TOF-MS. Following a 24 hour incubation with 2 at 25°C in a 4:1 molar ratio (2/DNA), a new peak appeared in the HPLC chromatogram at RT 14.03 min (Fig 4B). The adduct stoichiometry was confirmed by MALDI-TOF-MS as 1:1 2/Hairpin-5 based on the observed mass (Fig 4C). Although the reaction had not gone to completion by 24 hours (i.e., approximately 50% reacted), this result confirmed that a PBD unit could bond to a terminal G1 base, with the body of the covalently-attached PBD spanning two rather than three base pairs. Similar HPLC analysis of Hairpins 6 to 8 (Fig 2) showed single peaks for the hairpin sequences with retention times ranging from 8.01 to 8.14 min (Table 1). Incubation of 2 at 25°C for 24 hours in a 4:1 ratio with these hairpin sequences gave new adduct peaks at retention times 9.24, 9.07 and 8.86 min, respectively (Table 1, S5B, S6B and S7B Figs). In all cases the 1:1 stoichiometry of the adduct (2/DNA) was confirmed by MALDI-TOF-MS (Table 1, S5C, S6C and S7C Figs). Interestingly, significant differences in the extent of adduct formation were observed with these inosine-modified sequences (Table 1). In the case of Hairpin-6, the adduct was thought to be a mono-alkylated adduct with 2 bonded to G7, and with the bulk of the PBD Dimer lying in the minor groove pointing away from the loop. The approximately 50% conversion within 24 hours was assumed to reflect the lower stability of a mono- rather than a cross-linked adduct. In the case of Hairpin-7, the reaction was approximately 80% complete, also consistent with a lower rate of reaction for mono-adduct formation. This result was also surprising due to the proximity of G12 to the loop structure. Molecular Dynamics Simulations (see later) suggested that, although located close to the TTT-loop, a 2:G12 mono-alkylated adduct is feasible, with the second PBD unit of the dimer orientated away from the loop and forming non-covalent interactions in the minor groove. Finally, reaction with Hairpin-8 was 70% complete after 24 hours with one of the PBD units presumably bonding to G17 with the bulk of the molecule lying in the minor groove pointing toward the loop.
The reactivity of G1 was particularly surprising, as it means that the covalently bonded PBD unit must span only two base pairs, a phenomenon not previously observed or thought to be possible. Previous studies on the DNA binding characteristics of PBDs have suggested that a minimum of three consecutive DNA bases with a central covalently-reacting guanine (i.e., Pu-G-Pu) are required for covalent attachment [20]. To investigate this further we carried out an LC-MS/MS analysis of the 2/Hairpin-5 adduct. A base peak ESI-MS chromatogram of Hairpin-5 is shown in S8 Fig.
In summary, the results from the experiments with Hairpins 5-8 demonstrated that all four guanines (i.e., G1, G7, G12 and G17) are reactive toward 2, and are capable of forming mono-alkylated adducts.

Interaction of Anthramycin (1) and GWL-78 (3) with the Inosine-Modified AP-1 Hairpin Sequences
To determine whether the size of the PBD molecule might affect the mode of binding to the AP-1 hairpins, similar experiments were carried out with anthramycin (1) and the PBD C8-conjugate GWL-78 (3). First, 1 was incubated with Hairpin-5 (with only the 5'-terminal guanine available for covalent binding) in a 4:1 ratio (1/DNA) for 24 hours at 25°C. However, no changes were observed in the HPLC chromatogram, with the unreacted Hairpin-5 peak at RT 7.90 min remaining the major peak (S17A and S17B Fig) as confirmed by MALDI-TOF-MS (observed mass: 5748.0 m/z, theoretical mass: 5748.8 m/z) (S17C Fig). This suggested that substitution at the C8-position, as in SJG-136 (2) may have a significant influence on the binding properties of a PBD by enhancing interaction of the molecule in the DNA minor groove through non-covalent interactions (e.g., van der Waals, hydrogen bonds and/or electrostatic interactions).
The next step was to study GWL-78 (3), a synthetic PBD Monomer substituted at the C8-position with a bis-pyrrole side-chain that allows the molecule to span a total of five DNA base pairs. It was anticipated that this molecule may behave more like SJG-136 (2) than anthramycin (1) due to the large non-covalent DNA-binding moiety at the C8-position. As predicted, 3 formed an adduct with Hairpin-5. From these results it was concluded that covalent bonding of a PBD to a terminal guanine may only occur when there is sufficient molecular bulk at the C8-position (e.g., another PBD unit in the case of SJG-136, or a bis-pyrrole unit in the case of GWL-78) to provide stabilization to the adduct by interacting non-covalently in the adjacent minor groove.

Interaction of Anthramycin (1), SJG-136 (2) and GWL-78 (3) with Hairpin-9
In order to rule out the possibility that these new observations of a PBD Monomer (3) and Dimer (2) bonding to a terminal guanine might be due to a change in DNA conformation caused by the insertion of inosine bases, the interaction of 1, 2, and 3 with Hairpin-9 (Fig 2) was studied. This hairpin contained a 5'-terminal guanine without any inosine modifications. Instead, G7, G12 and G17 were replaced with adenine bases, and their corresponding cytosines with thymines. Hairpin-9 was incubated with 1, 2, and 3 in a 4:1 ratio (PBD/DNA) for 24 hours at 25°C, and subjected to HPLC and MALDI-TOF-MS analysis. Hairpin-9 alone gave a single peak at RT 9.56 min (Fig 6A) identified by MALDI-TOF-MS. After reaction with 2, one new major peak at RT 14.19 min appeared along with a minor peak at RT 18.40 min (Fig 6B). The stoichiometry of the main new peak as 1:1 2/Hairpin-9 was confirmed by MALDI-TOF-MS analysis with an observed mass of 6347.9 m/z (theoretical mass: 6347.5 m/z) (Fig 6E). Incubation of 3 with Hairpin-9 gave one new major peak at RT 12.07 min and a minor peak at RT 18.81 min (Fig 6C). Reaction was not complete after 24 hours with some Hairpin-9 remaining (Fig 6C). The stoichiometry of the main adduct formed was confirmed by MALDI-TOF-MS analysis as 1:1 3/Hairpin-9 with a mass of 6381.2 m/z (theoretical mass: 6381.5 m/z) (Fig 6F). The identity of the minor peaks at RT 18.40 min (for 2) and RT 18.81 min (for 3) could not be confirmed. Lastly, HPLC analysis of the interaction of 1 with Hairpin-9 provided no changes in the chromatogram (Fig 6D), with MALDI-TOF-MS (Fig 6G) confirming that no adduct had formed.
In summary, these results supported the observations made with Hairpin-5, and confirmed that the inosine mutations had not altered the reactivity of the DNA toward the PBDs.
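The agreement between observed and theoretical MALDI-TOF-MS masses quoted above can be expressed as an absolute difference and a relative error in ppm. The sketch below uses only the two Hairpin-9 adduct values reported in this section. The function name is hypothetical and no acceptance threshold is implied.

```python
def mass_error(observed, theoretical):
    """Return (absolute difference, relative error in ppm)."""
    delta = observed - theoretical
    ppm = delta / theoretical * 1e6
    return delta, ppm


# Observed and theoretical masses as quoted in the text for the Hairpin-9 adducts
adducts = {
    "2/Hairpin-9": (6347.9, 6347.5),
    "3/Hairpin-9": (6381.2, 6381.5),
}

for name, (obs, theo) in adducts.items():
    delta, ppm = mass_error(obs, theo)
    print(f"{name}: delta = {delta:+.2f}, error = {ppm:+.0f} ppm")
```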

Interaction of SJG-136 (2) with Inosine-Modified Duplex AP-1 Oligonucleotides
In order to explore whether the observations made within the inosine-modified AP-1 hairpins (Fig 2) were a consequence of their hairpin structure, a similar series of experiments was carried out on the sequence-related DNA duplexes (S18 Fig). As with the AP-1 hairpin oligonucleotides 5-8 (Fig 2), three of the four guanines available for covalent attachment of a PBD were replaced with non-nucleophilic inosines resulting in only one guanine available for covalent bonding in each case. Each duplex oligonucleotide had a length of 8 base pairs, as it has been previously demonstrated by this laboratory that a minimum of 7 base pairs is required to ensure a DNA minor-groove environment suitable for PBD binding (unpublished data).
Annealed Duplex-1 (Seq-1/Seq-2) (S18 Fig) gave two peaks in the HPLC chromatogram at RT 7.16 min (Seq-2) and RT 7.46 min (Seq-1) (Fig 7A), identified as the single strands of the duplex (Fig 7C). This was in accord with previous reports that double-stranded oligonucleotides of this length denature under these HPLC conditions [46]. Duplex-1 was then incubated with 2 in a 4:1 ratio (2/DNA) for 24 hours at 25°C, and subjected to HPLC analysis. A new minor peak appeared at RT 19.30 min (Fig 7B) suggesting that an adduct had formed. Reaction was incomplete after 24 hours, with peaks corresponding to Seq-2 (RT 7.20 min; previously 7.16 min) and Seq-1 (RT 7.52 min; previously 7.46 min) remaining. Subsequent MALDI-TOF-MS analysis confirmed the presence of a 1:1 2/Seq-1 adduct with an observed mass of 2951.8 m/z (theoretical mass: 2951.21 m/z) (Fig 7C). It is noteworthy that in a similar manner to the denaturation that occurs during the HPLC process, under MALDI-TOF-MS conditions the duplex denatured into single strands with the PBD still attached to the guanine-containing strand. This observation was consistent with the data obtained for Hairpin-5, as Duplex-1 contains only one available guanine at its 5'-terminus. This further confirmed the ability of 2 to interact with a terminal guanine.
In summary, the results obtained from the interaction of 2 with Duplexes 1-4 (S18 Fig) were consistent with those obtained for Hairpins 5-8 (Fig 2). Furthermore, Duplex-1 and Duplex-3 both contain a guanine at the 5'-terminus, and 2 was able to react with both.

Interaction of Anthramycin (1) and GWL-78 (3) with the Double-Stranded Inosine Modified AP-1 Sequence
Based on the data obtained for Hairpin-5, anthramycin (1) was not expected to react with Duplex-1 as it lacks a substituent at the C8-position that would help stabilize its accommodation in the DNA minor groove. In contrast, 3 was expected to react due to its C8-bis-pyrrole

Molecular Dynamics Simulations
Molecular dynamics simulations were carried out to try to predict the most preferred reacting guanine(s) in the AP-1 hairpin sequence (Fig 8). In these simulations, 2 was covalently bound to every potentially reacting guanine base (i.e., G1, G7, G12 and G17) in an effort to rationalize the structures of the adducts formed in the HPLC studies. As 2 is known to form mono-adducts and inter- and intrastrand cross-links [9,33], all of these adduct types were investigated and, in the case of mono-adducts, both loop-facing (i.e., forward) and non-loop-facing (i.e., reverse) orientations were considered. The study design also considered potential reaction sites based on the span of the molecule. For example, in the case of G7 and G12, mono-adducts with the bulk of the molecule pointing toward the loop were not analyzed due to potential steric hindrance with the TTT-loop. However, adducts at G17 in both orientations were considered despite the fact that the significant bulk of a 3'-oriented adduct would be positioned outside of the minor groove environment.
For possible mono-adducts, the potential energy calculations suggested that G7 is the preferred reacting guanine (S23A Fig), with the overall rank order G7 > G17 (in the forward orientation) > G1 > G12 > G17 (in the reverse orientation). This ranking was broadly consistent with the observations from the inosine replacement experiments which suggested that G7 and G17 are favoured for covalent interaction based on the faster reaction rates of Hairpins 1, 2, 4, 6 and 8, and Duplexes 2 and 4. Conversely, potential energy calculations predicted that less favorable binding sites would include G12, where the TTT-loop is likely to inhibit interaction of 2, or G17 (in the reverse orientation), in which case the second PBD unit of the dimer would be positioned outside of the groove where it would not be able to form non-covalent interactions with DNA bases in the minor groove, and would not be well-solvated due to its lipophilicity.
It is noteworthy that potential energy calculations for the bonding of 2 to G1, both in mono-adduct and cross-linking modes, suggest that it is viable as a reacting guanine. G1 adducts were observed by HPLC for the reaction of both 2 and 3 with 5'-GACATTIC-TTT-ICAATITC-3' (Hairpin-5) where G1 is the only available guanine for reaction, and also in the sequence 5'-GATATTAT-TTT-ATAATATC-3' (Hairpin-9) which is devoid of other reactive guanines. However, when anthramycin (1) was reacted with the same sequences, no adducts were formed. Although molecular dynamics simulations of 1 covalently bound to G1 suggested that an adduct in the C11-S configuration is theoretically possible, it is likely that the lack of non-covalent interactions from a C8-substituent, as available in the case of 2 and 3, leads to low overall stabilization of the adduct. Further evidence for this was obtained through molecular dynamics simulations of 2 and 3 bound non-covalently to Hairpin-5, where non-covalent interactions from the C8-pyrrole chain (in the case of 3) and the non-alkylating PBD unit (in the case of 2) oriented the alkylating PBD component over G1 for covalent attack. However, in the case of 1, which does not have a C8-side chain, the molecule moved up and down the minor groove during the simulation and failed to align over G1 to allow for attack by the C2-amino group.
Simulations of 2, 3 and 1 covalently bound to G1 of 5'-GACATTIC-TTT-ICAATITC-3' (Hairpin-5) were also undertaken with the ligands pointing away from the loop (S24, S25 and S26 Figs), to investigate the orientation of the G1 adducts. Potential energy calculations of 2 and 3 in both orientations suggested potential energy differences of 40 kcal/mol and 43 kcal/mol, respectively, in favour of the loop-facing orientation. This most likely reflected the unfavorable energetics when the bulk of the molecules point out of the DNA helix into a water environment rather than lying in the minor groove where they can provide adduct stabilization through non-bonding interactions. As anticipated, the energy difference between the two possible orientations of 1 covalently bonded to G1 was significantly less (i.e., 2.5 kcal/mol; -2736.10 kcal/mol versus -2733.60 kcal/mol) presumably due to the lack of a C8-substituent that can provide adduct stabilization through non-covalent interactions.
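The orientation comparison above reduces to simple arithmetic on the reported potential energies. This short sketch recomputes the 2.5 kcal/mol gap for anthramycin (1) from the two quoted energies and sets it against the gaps quoted for 2 and 3. The function name is hypothetical, and the sketch makes no statement about which orientation of 1 corresponds to which energy.

```python
def orientation_gap(energy_a, energy_b):
    """Absolute potential energy difference (kcal/mol) between two orientations."""
    return abs(energy_a - energy_b)


# Potential energies quoted in the text for the two orientations of 1 at G1
gap_1 = orientation_gap(-2736.10, -2733.60)

# Orientation gaps quoted in the text for 2 and 3 (kcal/mol)
reported_gaps = {"2": 40.0, "3": 43.0}

print(f"1: {gap_1:.1f} kcal/mol")
for ligand, gap in reported_gaps.items():
    print(f"{ligand}: {gap:.1f} kcal/mol ({gap / gap_1:.0f}-fold larger than for 1)")
```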

Discussion
The results presented here initially demonstrated that PBDs such as 1, 2 and 3 can covalently bind to the consensus DNA sequence for the oncogenic transcription factor AP-1. After this, inosine-modified AP-1 duplexes and hairpins were used to study the potential of the PBDs to form mono-alkylated and inter-and intrastrand cross-links at the various guanines within the duplex and hairpin sequences.
On the basis of results from previous studies [45], it was initially assumed that the adduct formed between SJG-136 (2) and AP-1 (Hairpin-1) would be the G7/G17 interstrand cross-link (Fig 3), even though it is an extended interstrand cross-link (i.e., 3'-GTAAC-5') compared to the most frequently reported interstrand cross-link, 5'-GATC-3'. The HPLC/MS experiments demonstrated that this was the case, and the extent of reaction of 2 with Hairpin-2 (where only G7 and G17 are available for covalent attack) after 24 hours was similar to that for Hairpin-1 (i.e., 100% adduct formation in 24 hours). Molecular modelling studies and potential energy calculations confirmed the feasibility of this extended adduct, although they suggested that some distortion of the helix may result.
Other potential interstrand cross-links were then investigated using Hairpin-3 and Hairpin-4 (i.e., G1-G12 and G1-G17 cross-links, respectively). Although adducts formed with these hairpins, the extent of reaction was much lower than for the parent AP-1 hairpin, suggesting that the G1-G12 and G1-G17 cross-linked adducts are less favourable than the G7-G17 cross-link.
Mono-adduct formation was investigated using Hairpins 5-8 and Duplexes 1-4 which each had only one reactive guanine available for covalent bonding of a PBD. From these experiments it became clear that SJG-136 (2) can covalently bond to a 5'-terminal guanine with one of its PBD units spanning only two base pairs. The most likely explanation for this, supported by the results of molecular modeling, is that one PBD unit of 2 bonds to the terminal 5'-guanine with its C-ring protruding just outside of the minor groove, and with the second PBD unit positioned in the minor groove (pointing toward the loop) stabilized by non-covalent interactions but not forming a second cross-linking covalent bond (Fig 8C). This hypothesis was supported by a molecular dynamics simulation (Fig 8D) which suggested that a stable adduct should result with 2 arranged in this way. The fact that adduct formation did not go to completion after 24 hours is also consistent with a mono-alkylation event in which the PBD is interacting with two rather than the usual three base pairs at the alkylation site. Modeling studies also suggested that the alternative arrangement, with one PBD unit of 2 bound to the terminal guanine but facing out of the minor groove would be unlikely due to the high energy associated with one PBD unit being positioned outside of the minor groove in the solvent environment.
Analogous experiments with the C8-bis-pyrrole-substituted PBD monomer GWL-78 (3) and the non-C8-substituted PBD monomer anthramycin (1) further supported the requirement for a C8-side chain to facilitate terminal-G bonding. For example, an adduct of Hairpin-5 was observed after reaction with 3, but not with 1 (S17 Fig). In a similar manner to 2, it is likely that the C8-sidechain of 3 facilitates formation of the adduct through non-covalent interactions (i.e., hydrogen bond formation, electrostatic forces and van der Waals interactions) with bases in the DNA minor groove. On the other hand, anthramycin (1) fails to react due to the lack of a C8-sidechain. These observations were fully supported by molecular dynamics simulations, where a greater stabilization of DNA was observed for 2 and 3, compared to 1.
As Hairpins 2-8 have two or three of their guanines replaced with inosine bases, we carried out control experiments to exclude the possibility that inosines may modify the conformation of the hairpin or duplex sequences and thus potentiate covalent interaction of PBDs with their G1 bases. However, Hairpin-9, which contains a 5'-terminal guanine but with inosine-cytosine base-pairs replaced with adenine-thymine base pairs, still reacted with SJG-136 (2) and GWL-78 (3) at approximately the same rate whereas the C8-unsubstituted PBD monomer anthramycin (1) failed to react, demonstrating that inosine substitution has no effect on reactivity. These observations were supported by molecular dynamics simulations which showed negligible differences in the propensity of Hairpin-5 or Hairpin-9 to form an adduct with PBDs 2 and 3.
Taken together, these results suggest that C8-substitution of a PBD molecule can facilitate binding to a terminal guanine base. Conversely, anthramycin (1), which lacks a bulky minor-groove-interacting C8-substituent, is unable to interact with a guanine located at the 5'-terminus of a DNA hairpin. In particular, SJG-136 (2) can bond to the 5'-terminal guanine of Hairpin-5 and Duplex-1 because the second PBD unit joined through the C8-O-(CH2)3-O-C8' linker can provide stabilization by interacting in the minor groove through hydrogen bond formation, electrostatic forces and van der Waals interactions, even if there is no suitable guanine available for cross-link formation.

Conclusion
This is the first report of PBD molecules covalently bonding to a guanine base located at the end of a DNA hairpin or duplex sequence. To date, PBD units have been reported to span three DNA base pairs with a preference for a 5'-Pu-G-Pu-3' motif, and with the C11-position of the PBD bonding to the central guanine through an aminal bond. In the case of C8/C8'-linked dimers, the molecules have been reported to span 6 or 7 base pairs. Apart from adding to knowledge of the variety of PBD/DNA adduct types possible, the results reported here highlight the importance of the C8-sidechain of PBDs in determining whether a particular type of adduct (in this case a terminal guanine adduct) can form. Finally, the observation of G-terminal adducts may have implications for the mechanism of cellular cytotoxicity and in vivo antitumour activity of C8-substituted PBD Monomers and C8-linked PBD Dimers. Until now it was thought that PBDs exert their biological activity by forming mono-alkylated or intra- or interstrand cross-linked adducts within stretches of DNA, thus blocking processes such as transcription factor binding, transcription or enzyme processing, or leading to stalling at the replication fork. However, the observation reported here suggests that a mechanism involving the binding of C8-substituted PBDs to the ends of DNA double-strand breaks in cells may also be operative. This finding has particular significance for the mechanism of action of the PBD-based Antibody-Drug Conjugates currently in clinical evaluation and at the pre-clinical stage.