Genomic Inverse PCR for Exploration of Ligated Breakpoints (GIPFEL), a New Method to Detect Translocations in Leukemia

Here we present a novel method, “Genomic inverse PCR for exploration of ligated breakpoints” (GIPFEL), that allows the sensitive detection of recurrent chromosomal translocations. This technique utilizes limited amounts of DNA as starting material and relies on PCR-based quantification of unique DNA sequences that are created by circular ligation of restricted genomic DNA from translocation-bearing cells. Because the complete potential breakpoint region is interrogated, prior knowledge of the individual, specific interchromosomal fusion site is not required. We validated GIPFEL for the five most common gene fusions associated with childhood leukemia (MLL-AF4, MLL-AF9, MLL-ENL, ETV6-RUNX1, and TCF3-PBX1). A workflow of restriction digest, purification, ligation, removal of linear fragments and precipitation enriching for circular DNA was developed. GIPFEL allowed detection of translocation-specific signature sequences down to a 10⁻⁴ dilution, which is close to the theoretical limit. In a blinded proof-of-principle study utilizing DNA from cell lines and 144 children with B-precursor-ALL associated translocations, this method was 100% specific with no false positive results. Sensitivity was 83%, 65%, and 24% for t(4;11), t(9;11) and t(11;19), respectively. Translocation t(12;21) was correctly detected in 64% and t(1;19) in 39% of the cases. In contrast to other methods, the characteristics of GIPFEL make it particularly attractive for prospective studies.


Introduction
The realization that certain subtypes of leukemia are invariably associated with recurrent genomic abnormalities was a seminal discovery in leukemia research. This was first recognized in conjunction with chronic myeloid leukemia and the paradigmatic Philadelphia chromosome [1]. Nowadays we know that this is a widespread phenomenon. The determination of genotype has become essential for diagnosis, stratification, treatment planning and prognosis of hematological malignancies. Particularly in infant and childhood leukemia almost half of all diagnosed cases are characterized by the persistent appearance of distinctive chromosomal translocations [2].
Because of the importance of these genetic markers for clinical management, a series of methods has been devised that allows the detection of the underlying genetic lesion. Cytogenetics and fluorescent in situ hybridization (FISH) are generally applied to demonstrate the presence and overall structure of genomic alterations. However, both approaches require mitotic cells, cumbersome experimental procedures and experienced operators for success. Alternative methods using archived genetic material have also been developed. Since most translocations create in-frame fusion proteins, there are only a limited number of exons within both fusion partners that can be joined productively. This fact has been exploited by PCR-based methods that use RNA/cDNA as template [3,4]. In this way the number of primer pairs necessary to interrogate for the presence of a specific translocation is limited and the expected amplification products can be predicted. The drawback is the labile nature of RNA, which often precludes successful amplification from stored or aging samples. To avoid this problem DNA-based methods have been explored [5,6]. Yet the actual genomic breakpoints are usually unknown and they are distributed over a large stretch of intronic sequences. This mandates either the use of an unwieldy number of different primer pairs or long range PCR strategies with the disadvantage of non-quantifiable amplicons of unknown length that may well exceed the practicable limits of current PCR.
To avoid these pitfalls, we devised a novel method that can detect chromosomal translocations at the DNA level, creating constant, predictable, and quantifiable amplicons. This technique, which we call GIPFEL (genomic inverse PCR for exploration of ligated breakpoints), utilizes the fact that genomic breakpoints are usually confined to defined chromosomal regions. Restriction digest of genomic DNA followed by circularization of the resulting fragments will divide even large breakpoint regions into a manageable number of DNA circles. Only cells with translocations will create a "signature" circle that is uniquely characteristic for the nature of the underlying genomic aberration (figure 1). These circles can be quantified by real-time PCR because the sequence of the corresponding ligation joint can be derived from the known genomic sequence and the respective location of the restriction sites within the breakpoint region. Hence corresponding amplicons of suitable size for real-time PCR can be designed. Positive amplification results not only reveal the presence of a translocation but also give positional information on the approximate location of the genomic break. By selecting appropriate restriction enzymes even large breakpoint regions can be covered with relatively few primer/PCR reactions. Here we demonstrate proof-of-principle experiments testing GIPFEL on the five most frequent translocations in childhood leukemia: t(4;11), t(9;11), t(11;19), t(12;21), and t(1;19).

Circularization of genomic DNA
Genomic DNA from clinical repositories was provided prepurified. Samples were collected with written informed consent and all institutional and national guidelines for employing human material in research were observed. Patients were enrolled in multicenter trial AIEOP-BFM ALL 2000 on treatment of childhood ALL. Diagnosis, characterization and treatment of ALL were performed as previously described [7,8]. The trial was approved by the institutional review board of Hannover Medical School, Hannover, Germany. Written informed consent for the use of specimen for research was obtained from all study individuals, parents or legal guardians and approved by the institutional review board.
All enzymes used in the procedure were obtained from New England Biolabs (Frankfurt/Main, Germany) and used with the appropriate buffers recommended by the manufacturer. For cell lines and buffy coats DNA was prepared from 1 to 5 × 10⁶ cells with the QIAamp DNA Blood Mini Kit exactly according to the instructions of the manufacturer (Qiagen, Hilden, Germany).
If available, e.g. from cell lines, GIPFEL started with 2.5 µg of DNA corresponding to approximately 3.8 × 10⁵ genome equivalents (calculating with 6.6 pg DNA per cell). For detection of translocations in repository DNA, the nucleic acids were either pre-amplified with REPLI-g Ultra Fast Mini Kit according to the manufacturer's (Qiagen) instructions or, when probing for MLL translocations, only 1 µg stored DNA was used directly. The DNA was incubated either with 200 units BamHI-HF (for MLL translocations) or with 200 units of SacI-HF or MfeI-HF for detection of t(12;21) and t(1;19), respectively. Reactions were set up in 100 µl volume using the buffer recommended by the manufacturer and digests were performed for 2 h.
Restriction fragments were isolated by addition of 500 µl buffer PB (Qiagen) to the digestion reaction and a subsequent purification on QIAquick gel extraction columns (Qiagen) according to the instructions of the manufacturer. To improve recovery of longer fragments, elution was done with 50 µl of deionized water pre-warmed to 60 °C and columns were incubated for 5 minutes at 60 °C before final centrifugation.
Religation was performed for 2 h at 24 °C in a 100 µl reaction using the total column eluate and 2 µl (800 units) of T4-DNA ligase and the appropriate buffer. After ligation, linear DNA fragments were digested by addition of 1 µl (100 units) of exonuclease III and incubation for 30 min at 37 °C with a subsequent 5 min heat inactivation at 95 °C.
Enriched circular DNA was concentrated by standard alcohol precipitation.

Primer design and semi-nested real time PCR
In silico predictions were done deriving the sequences of all possible ligation junctions that would be created from religation of a genomic fragment carrying a chromosomal breakpoint. Primers spanning ligation sites were designed to generate amplicons suitable for real time PCR (see table 1 and table S1) (https://eu.idtdna.com/analyzer/Applications/OligoAnalyzer/) [9]. To restrict the number of PCRs necessary to include the complete breakpoint region, sometimes closely spaced (<1 kb) restriction sites were covered only by a single primer.
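The in silico prediction step can be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the function names, the `flank` parameter and the toy sequences are hypothetical. It assumes that the fragment carrying the fusion starts at a restriction site in the 5′ partner and ends at a site in the 3′ partner, so that circularization joins the 3′ partner sequence running into its site with the 5′ partner sequence running out of its site. The default cut position models BamHI (G^GATCC).

```python
from itertools import product


def find_sites(seq, site):
    """Return the 0-based start position of every occurrence of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def predicted_junctions(five_prime_region, three_prime_region,
                        site="GGATCC", cut_offset=1, flank=20):
    """Enumerate candidate ligation-joint sequences (top strand).

    five_prime_region  : breakpoint region of the 5' fusion partner
    three_prime_region : breakpoint region of the 3' fusion partner
    cut_offset         : top-strand cut position within `site`
                         (1 for BamHI, G^GATCC; an assumption for other enzymes)
    flank              : bases kept on each side of the joint (hypothetical default)

    The exact breakpoint is not needed: every pairing of a site in the
    5' region with a site in the 3' region is a candidate joint.
    """
    junctions = {}
    sites_5p = find_sites(five_prime_region, site)
    sites_3p = find_sites(three_prime_region, site)
    for a, b in product(sites_5p, sites_3p):
        # End of the linear fragment: 3' partner sequence up to its cut.
        left = three_prime_region[max(0, b - flank): b + cut_offset]
        # Start of the linear fragment: 5' partner sequence from its cut onward.
        right = five_prime_region[a + cut_offset: a + len(site) + flank]
        junctions[(a, b)] = left + right
    return junctions


if __name__ == "__main__":
    # Toy sequences, not real genomic data.
    toy_5p = "AAAAGGATCCTTTTTTTT"
    toy_3p = "CCCCCCCCGGATCCAAAA"
    for key, joint in predicted_junctions(toy_5p, toy_3p, flank=4).items():
        print(key, joint)
```

Because the restriction site is regenerated at the joint, each predicted sequence contains the full recognition site flanked by sequence from the two partner genes, which is where a junction-spanning amplicon would be placed.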
All PCR reactions were performed with Brilliant II SYBR green PCR Master Mix from Agilent Technologies (Santa Clara, CA, USA) in standard 25 µl reactions using a final primer concentration of 100 nM. For first round PCR 5 µl of circularized DNA corresponding to approximately 1.9 × 10⁵ genome equivalents served as template. Cycle conditions were 10 min initial denaturation, followed by 22 cycles of 15 s at 95 °C, 30 s at 64 °C, and 30 s at 72 °C for MLL translocations. Translocation t(12;21) and t(1;19) samples were pre-amplified with 25 cycles.
One µl of primary PCR product was used as input for each secondary PCR. Reactions were monitored on an optical cycler. To avoid contamination by airborne DNA, all PCR reactions were assembled under clean-room conditions in a UV-sterilized PCR cabinet with separate equipment and rooms for pre- and post-PCR procedures.

Evaluation of results
A sample was scored as PCR-positive if a primer pair specific for a translocation circle yielded a threshold cycle (C_T) that was clearly decreased compared to the cohort of all other primer pairs. Positive real time products were run on standard agarose gels for determination of size. In addition, DNA was isolated from the gel and sequenced from both sides using the PCR amplification primers.
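The scoring rule can be illustrated with a short sketch. The text gives no numeric criterion for "clearly decreased", so the `min_delta` cutoff of 5 cycles, the comparison against the median of the other primer pairs, and the primer names below are all assumptions made for illustration. A flagged primer pair would still need confirmation by gel sizing and sequencing as described.

```python
from statistics import median


def flag_candidates(ct_by_primer, min_delta=5.0):
    """Flag primer pairs whose threshold cycle lies well below the rest.

    ct_by_primer : mapping of primer-pair name -> threshold cycle
    min_delta    : hypothetical cutoff (in cycles) below the median of all
                   other primer pairs; not a value taken from the study

    Returns a mapping of flagged primer-pair name -> cycles below the median.
    """
    flagged = {}
    for primer, ct in ct_by_primer.items():
        others = [v for name, v in ct_by_primer.items() if name != primer]
        if not others:
            continue  # nothing to compare against
        delta = median(others) - ct
        if delta >= min_delta:
            flagged[primer] = delta
    return flagged


if __name__ == "__main__":
    # Invented example values for four primer pairs.
    example = {"P1": 30.0, "P2": 31.0, "P3": 18.0, "P4": 29.5}
    print(flag_candidates(example))
```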
The higher number of primers necessary to cover the t(12;21) and t(1;19) breakpoint region mandated multiplexing also during the second round of PCR. Therefore positively scoring products obtained with a primer pool were re-tested in a third round PCR using single forward primers.

Validation of the GIPFEL procedure
To generate a genomic DNA preparation enriched in circular ligated DNA a 4-step biochemical procedure was developed (figure 2A). After digestion of genomic DNA and purification of a genome wide population of restriction fragments the nucleic acid was converted to circular form by ligation in a large volume. Remaining linear fragments were removed by digesting with exonuclease III followed by alcohol precipitation to prepare a template for PCR analysis.
PCR was designed in a semi-nested setup (figure 2B) preamplifying with an outer anchor primer (three primers for ETV6) binding to sequences of the 5′ fusion portion. This primer was paired with pools of downstream primers corresponding to the predicted 3′ fusion sequence. The reaction products of this primary PCR served as input for the next round of PCR. Secondary PCRs were monitored with SYBR green on a real time machine using a 5′ inner primer (three primers for ETV6) and either each downstream primer in individual combination (for MLL fusion proteins) or again pools of downstream primers (see Table 1 for primer sequences and Table 2 for multiplexing strategies). Primers amplifying a nearby genomic region unaffected by the translocation were employed alongside as controls. For further evaluation amplified PCR products were sized on agarose gels, isolated and sequenced (figure 2C). A sample was scored positive if the size and the predicted sequence of a PCR product could be unequivocally confirmed (see Table S1 for a list of predicted ligation joint sequences).
To evaluate the efficiency of the overall process we validated the procedure with DNA from three cell lines: MV4;11 carries a t(4;11), REH contains t(12;21) and 697 was used to detect t(1;19). For all lines the exact location of the breakpoint is known, obviating the need for multiplexing in the set-up experiments. DNA from cell lines negative for the translocations to be tested (HL60, 697, REH) served as background control. Translocation bearing cells were mixed in various ratios with control cells and the GIPFEL procedure was performed (figure 3). Under these optimal conditions detection of signature circles was possible for all translocations down to a dilution of 1 in 10⁴ (10⁻⁴). This dilution is equivalent to a calculated presence of 19 target molecules per PCR reaction (2.5 µg DNA = 3.8 × 10⁵ cells; 3.8 × 10⁵ × 10⁻⁴ = 38, but because only 50% of the circularization reaction was used as template for PCR, effectively a calculated maximum of 19 template molecules was present).
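The template arithmetic above can be reproduced with a few lines of code. This is a minimal sketch using only the figures quoted in the text (2.5 µg input DNA, 6.6 pg DNA per cell, a 10⁻⁴ dilution, and 50% of the circularization reaction used as PCR template); the function names are hypothetical.

```python
def genome_equivalents(dna_ug, pg_per_cell=6.6):
    """Convert a DNA mass in micrograms to genome equivalents.

    Uses the 6.6 pg DNA per cell quoted in the text (1 µg = 1e6 pg).
    """
    return dna_ug * 1e6 / pg_per_cell


def expected_templates(dna_ug, dilution, fraction_used, pg_per_cell=6.6):
    """Calculated maximum number of signature circles entering one PCR.

    dilution      : fraction of translocation-bearing cells (e.g. 1e-4)
    fraction_used : share of the circularization reaction used as template
    Assumes one signature circle per translocation-bearing genome and
    loss-free preparation, i.e. an upper bound.
    """
    return genome_equivalents(dna_ug, pg_per_cell) * dilution * fraction_used


if __name__ == "__main__":
    print(round(genome_equivalents(2.5)))             # genome equivalents in 2.5 µg
    print(round(expected_templates(2.5, 1e-4, 0.5)))  # templates per PCR reaction
```

Since the result is an upper bound, a positive readout at this dilution implies that few signature circles are lost during digestion, ligation and purification.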
To further validate the method on actual patient samples, DNA was obtained from clinical repositories. A collection was assembled encompassing 21 MLL-AF4, 16 MLL-AF9, 18 MLL-ENL, 60 ETV6-RUNX1, and 30 TCF3-PBX1 cases. Five negative control samples were added to each translocation group and the samples were blinded for processing. Because of the limited amount of the [...]
[Table 2 fragment (multiplexing strategies): + S12f + S15f + S17f; ETV6-S1r-n + S2r-n + S3r-n + S12f + S15f + S17f (no anchor). *For MLL fusion proteins multiplexing was done only in the first round of semi-nested PCR.]
Because GIPFEL also gives positional information on the breakpoint location depending on the primer pair yielding a positive readout, a breakpoint distribution chart could be assembled (figure 4). As observed previously, chromosomal junction sites were not randomly distributed but clustered in certain areas corresponding to known hotspots of instability, giving additional support to the validity of our GIPFEL results [5,10-20].

Discussion
Here we present a proof-of-principle study demonstrating that it is possible to detect the most commonly occurring translocations in childhood leukemia using small amounts of DNA without having to resort to long range PCR or unstable RNA. The GIPFEL method relies on prior knowledge of the genomic region where breaks occur. As long as this information is available it can be adapted to any recurrent translocation. At the same time this is also a drawback of the technique: breaks outside of the pre-defined genomic region will not be detected. Likewise, more complicated genomic rearrangements might elude discovery because they alter the predicted ligation joints. Translocations resulting from more complicated reshuffling of the genome have been described [21]. During our study we serendipitously detected a t(11;19) breakpoint where material of chromosome 5 had been interspersed at the junction site of chromosomes 11 and 19 (not shown). Events of this type are the most likely explanation for the false negative rate in the present study. In addition, the fact that occasionally only one of two closely spaced restriction sites was covered by primer pairs also causes small "blind spots". However, compared to the size of most breakpoint regions it is highly improbable that these tiny regions of <1 kb should have a major impact on the sensitivity of the assay.
The biochemical preparation of circular ligated DNA seems to be close to the optimum. Reactions that contained fewer than 20 calculated template molecules still yielded a positive readout, indicating that all previous preparatory steps worked with near perfect efficiency. Therefore the sensitivity of GIPFEL seems to be mainly limited by the amount of total template DNA that can be fed per PCR reaction. This restricts the practical threshold of GIPFEL to about 1 in 10⁴ cells, which falls in the range of most DNA-based methods. We estimate this sensitivity should suffice to discover most clinically meaningful cases.
Another current constraint is the number of PCR reactions that need to be manually assembled to cover a translocation region. However, for this aspect improvements are in sight, as new developments like digital droplet PCR should be easily adaptable to GIPFEL, allowing the simultaneous screening for multiple translocations in a high-throughput fashion. Despite the fact that t(11;19) and t(1;19) do not read out optimally in our assay, most cases of the much more frequently occurring t(4;11), t(9;11) and particularly t(12;21) will be recorded. In addition, actual population based frequencies of the less easily detectable translocations may be extrapolated from the incidence as detected by GIPFEL corrected by the respective accuracy rate. Furthermore, it is to be expected that NGS data from actual breakpoint regions will become increasingly available. This information will aid in developing better primers for GIPFEL, thus increasing the precision of this method.
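The proposed correction of detected incidence by the accuracy rate can be written out as a small sketch. It assumes that the sensitivities reported in this study carry over to the screened population and that specificity is 100%, as observed here; the function name and the example cohort numbers are hypothetical.

```python
# Sensitivities reported for the blinded patient cohort in this study.
SENSITIVITY = {
    "t(4;11)": 0.83,
    "t(9;11)": 0.65,
    "t(11;19)": 0.24,
    "t(12;21)": 0.64,
    "t(1;19)": 0.39,
}


def corrected_frequency(detected, screened, translocation):
    """Estimate the true frequency from GIPFEL-positive counts.

    detected      : number of GIPFEL-positive samples
    screened      : number of samples screened
    translocation : key into SENSITIVITY

    Divides the observed frequency by the assay sensitivity; with no
    false positives, no further correction term is needed.
    """
    if screened <= 0:
        raise ValueError("screened must be positive")
    if not 0 <= detected <= screened:
        raise ValueError("detected must lie between 0 and screened")
    return detected / screened / SENSITIVITY[translocation]


if __name__ == "__main__":
    # Invented cohort: 39 positives among 10,000 screened samples.
    print(corrected_frequency(39, 10_000, "t(1;19)"))
```

The lower the sensitivity, the larger the correction factor and the wider the uncertainty of the extrapolated frequency, so estimates for t(11;19) and t(1;19) would need correspondingly larger cohorts.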
In summary, GIPFEL could become a valuable tool, particularly in prospective settings. Patients who have been exposed to topoisomerase inhibitors during the treatment of non-blood related neoplastic diseases are at a higher risk of developing 11q23 translocation-positive secondary malignancies. Similarly, persons exposed to ionizing radiation might be screened for the appearance of translocation positive clones. Finally, GIPFEL may be used to resolve the ongoing scientific discussion about the actual frequency of pre-leukemic events in healthy newborns who never develop leukemia in later life. For this purpose birth cohorts might be screened for the presence of interchromosomal fusion sequences in apparently healthy newborns. Previous studies gave highly divergent results ranging from 1:100 ETV6-RUNX1 positive cases [22] to less than 1 in 1417 cord blood samples [23,24]. In all these cases GIPFEL may detect the appearance of translocation positive clones, allowing for follow-up and possibly early treatment.