ARID1A regulates R-loop associated DNA replication stress

ARID1A is a core DNA-binding subunit of the BAF chromatin remodeling complex, and is lost in up to 7% of all cancers. The frequency of ARID1A loss increases in certain cancer types, such as clear cell ovarian carcinoma where ARID1A protein is lost in about 50% of cases. While the impact of ARID1A loss on the function of the BAF chromatin remodeling complexes is likely to drive oncogenic gene expression programs in specific contexts, ARID1A also binds genome stability regulators such as ATR and TOP2. Here we show that ARID1A loss leads to DNA replication stress associated with R-loops and transcription-replication conflicts in human cells. These effects correlate with altered transcription and replication dynamics in ARID1A knockout cells and with reduced TOP2A binding at R-loop sites. Together, this work extends the known mechanisms of replication stress in ARID1A-deficient cells, with implications for targeting ARID1A-deficient cancers.


Author summary
DNA is an incredibly busy molecule. It is bound by an ever-changing array of proteins, which control how our cells read the instructions encoded within DNA, through a process called transcription. DNA also must be replicated, condensed, and segregated every time a cell divides. These processes of DNA replicating and transcribing must not interfere with one another or the cell risks damage to DNA and potentially changes to the DNA code called mutations. In cancer many DNA transactions are perturbed, and this has been associated with damaging collisions between replication and transcription. Here we find that a gene called ARID1A, which is frequently lost in cancer cells, prevents such collisions between replication and transcription machinery. Loss of ARID1A has many effects on the cell, but in this context it seems to change the location and activity of an important regulator of DNA twisting and untangling called Topoisomerase 2A. Understanding how loss of ARID1A creates stresses on dividing cancer cells provides new opportunities to develop or apply therapies that could exploit this stress.

Introduction
RMG1 cell line elicits replication stress. We used CRISPR to remove ARID1A (S1A-S1C Fig), and then examined the phosphorylation of RPA2 on serine 33, a known mark of replication stress, and found higher numbers of RPA2-S33P foci by immunofluorescence in ARID1A-deficient cells (Fig 1A). To further support the presence of ongoing replication stress, we used SIRF (in situ analysis of protein interactions at DNA replication forks) to detect recruitment of MRE11 to nascent EdU (5-ethynyl-2'-deoxyuridine) labeled DNA [20]. SIRF uses proximity ligation between a protein of interest and an incorporated EdU pulse label to infer recruitment to nascent DNA. This analysis showed a small but significant increase in MRE11-SIRF signals in ARID1A-deficient compared to proficient cells (Fig 1B), which we and others have previously shown is a symptom of replication stress [21,22]. Together these data suggest that ARID1A knockout cells experience increased replication stress, which recruits MRE11 to nascent DNA.
Given the role of the BAF complex in regulating transcriptional programs, we hypothesized that transcription-replication conflicts (TRCs) might be responsible for some of the observed replication stress. We analyzed the proximity between PCNA and RNA polymerase II as a reporter of TRCs [22-24] using a proximity ligation assay and found that ARID1A loss significantly increased such conflicts (Fig 1C). These signals were transcription dependent since pretreatment with flavopiridol abolished differences between the WT and ARID1A-knockout cells (Fig 1C). TRCs are also often associated with increased levels of R-loop structures. To test this, we used immunofluorescence with the S9.6 antibody, which recognizes DNA:RNA hybrids and other RNA structures, and found that ARID1A knockout cells had significantly higher S9.6 staining, comparable to previous studies (Fig 1D) [3,6,22]. While the signal could be reduced in both cell lines with RNaseH or RNaseIII treatment, targeting hybrids and dsRNA respectively, increased S9.6 staining persisted in ARID1A-KO cells treated with RNaseIII, supporting a true increase in R-loops (S1D Fig). To ensure this was not a cell-line specific effect, we stained a p53-/- derivative of HCT116 and RPE1-hTERT cells treated with siRNA targeting ARID1A and found that ARID1A depletion increased S9.6 staining (S2A and S2B Fig). We also performed western analysis of RPA2-S33P and γH2AX in HCT116 cells depleted for ARID1A by siRNA and observed reproducible increases in these markers of DNA damage and replication stress (S2C Fig). Given concerns about the specificity of imaging with S9.6 [25], we also performed DNA:RNA immunoprecipitation (DRIP) experiments and quantitative PCR at a set of S9.6 positive loci. These included RNaseH-sensitive loci to confirm they represented DNA:RNA hybrids, and we found that the signal was increased at two of the four RNaseH-sensitive sites in ARID1A-KO cells (Fig 1E).
Importantly, when we assessed ARID1A binding at these sites in published datasets from RMG1 cells, we found that only the ARID1A+ sites BTBD19 and MYADM showed an increase in S9.6 signal when ARID1A was deleted (Fig 1E) [3,22,26]. Together these data show that in an RMG1, HCT116 or RPE1-hTERT cell line model, the deletion or depletion of ARID1A elicits replication stress, TRCs, and R-loop accumulation at a subset of genomic loci.

R-loop and TRCs drive replication stress and DNA damage in ARID1A knockouts
Unscheduled or dysregulated formation of R-loops has been linked to DNA damage, transcription-replication conflicts, replication stress and the activation of the Fanconi Anemia pathway [6,27]. To assess R-loop stabilization as a potential mechanism driving replication stress we undertook a set of suppression experiments. First, we confirmed that ectopic expression of RNaseH1-GFP strongly reduced the S9.6 staining in the ARID1A knockout cell line (Fig 2A). Consistent with a mechanism of R-loop induced breaks arising from replication stress, the increase in phosphorylation of RPA2-ser33 in ARID1A-/- cells was also suppressed by overexpression of RNaseH1, confirming R-loops as a driver of replication stress in these cells (Fig 2B). Repair of R-loop induced replisome collisions has been reported to require members of the Fanconi Anemia pathway, such as FANCD2 [27,28]. Consistent with this model we saw significantly more FANCD2 foci by immunofluorescence in ARID1A knockout cells, and these differences were abolished when cells ectopically expressed RNaseH1 to remove R-loops (Fig 2C). To confirm that these R-loops ultimately led to DNA damage we performed a neutral comet assay for DNA breaks and observed that excess damage in ARID1A-deficient cells was also suppressed by ectopic expression of RNaseH1 (Fig 2D). Taken together these results indicate that ectopic formation or stabilization of R-loops in cells lacking ARID1A is a driver of replication stress, activation of the Fanconi Anemia pathway and DNA damage.

ARID1A deficiency alters replication and transcription dynamics
Both decreased and increased replication fork speed have been associated with replication stress and the DNA damage response [29]. To monitor potential replication defects in our system we used DNA combing of CldU and IdU labelled fibers in ARID1A-WT or ARID1A-KO derivatives. First, we measured the ratio of CldU to IdU labelled fiber tracks and found that ARID1A-KO cells have a greater degree of asymmetry between the two labeling periods, consistent with stalled or collapsed DNA replication forks (Fig 3A). Surprisingly, and in spite of the presumptive increase in fork stalling, when we measured total fiber length we found that on average replication forks travel faster in ARID1A-knockouts compared to WT controls (Fig 3B). To confirm this finding we also monitored fork speed in p53-/- HCT116 cells with ARID1A depleted by siRNA and observed a significant increase in replication speed (S2D Fig). This observed increase in fork stalling and speed is similar to what is seen when replication regulators like FEN1 or Treslin are perturbed, and reminiscent of the identified PARP-p53 axis for fork speed control [29].
Since the BAF complex opens chromatin to promote transcription, one possible mechanism by which replication forks may speed up is reduced transcription. Indeed, bulk quantification of global nascent transcription by pulse 5-ethynyl uridine (EU) incorporation showed lower overall transcription activity in ARID1A knockouts (Fig 3C). Consistent with this, ChIP-qPCR of RNA polymerase II also showed lower occupancy at several transcribed genes in both ARID1A-KO and siARID1A-depleted cell lines (S3 Fig). In addition, we found that inhibiting transcription elongation with flavopiridol, or transcription initiation with the TFIIH inhibitor triptolide, also increased replication fork speed significantly in ARID1A-WT cells, but not in ARID1A-KO cells (Fig 3D). Despite the different modes of action of flavopiridol and triptolide, these data suggest that global effects on transcription may explain the observed replication fork speed increases (see Perspective section). However, we also tested the direct association of the BAF complex with nascent DNA using SIRF for the catalytic subunit of BAF, Brg1. In this experiment, we saw a robust association in ARID1A-WT cells that was significantly reduced in ARID1A knockouts (Fig 3E). Therefore the BAF complex may have a previously observed but under-appreciated role in DNA replication [30]. Indeed, Brg1 has been implicated in regulating replication fork progression [30]. At this point it is not clear how broad transcriptional and chromatin state changes integrate with the DNA repair, and possibly replication, functions of the BAF complex to suppress replication stress. We do not think that reduced transcription leading to high fork speeds explains the entirety of the observed R-loop associated replication stress. Flavopiridol treatment, which also dramatically inhibits transcription and increases fork speed (Fig 3), reduces RNAPII-PCNA PLA signals (Fig 1).
In addition, R-loops seem to become a barrier to genome stability in ARID1A deficient cells (Fig 2) suggesting that altered R-loop resolution at some loci may be a driver of DNA damage. Therefore we sought additional explanations.

Loss of ARID1A mislocalizes Topoisomerase IIα from R-loop sites
In concert with mechanisms of TRC induction involving altered transcription or replication dynamics, we also wondered if changes in the function of BAF physical interaction partners could be responsible for some of the observed phenotypes. BAF interacts with several genome stability maintenance proteins including Topoisomerase IIα (TOP2A), which has previously been implicated in regulating topological stress at TRCs [2,11]. First, we confirmed that global TOP2A inhibition with etoposide dramatically increased DNA:RNA hybrid staining with the S9.6 antibody and RPA2-ser33 phosphorylation in both ARID1A WT and knockout cell lines (Fig 4A and 4B). Interestingly, TOP2A depletion by siRNA increased S9.6 and RPA2-ser33P staining in RMG1 cells as expected, but did not increase staining in the ARID1A-KO clone (S4A and S4B Fig). This highlights the difference between etoposide, which creates inhibited TOP2A protein lesions on DNA, and siRNA which removes the protein. It also supports the idea that the reported abnormal TOP2A activity or localization in ARID1A-KO cells may be a partial driver of the observed replication stress [11]. To test this idea, we wanted to assess whether the association of TOP2A with R-loop prone loci was affected. First using proximity ligation assay with antibodies against TOP2A and S9.6 we saw a relative reduction in PLA foci in the ARID1A knockout cells (Fig 4C). Thus, despite having more R-loops overall, these data indicate that the recruitment of TOP2A to R-loop sites may be impaired in ARID1A knockouts. We considered that ARID1A loss might prevent TOP2A association with chromatin in general. However, fractionation analysis showed that TOP2A remained strongly associated with insoluble nuclear fractions which contain chromatin in ARID1A-KO cells (S4C Fig). We also wondered if loss of interaction with the BAF complex was driving the change in TOP2A-S9.6 PLA signal. 
Surprisingly, we found through TOP2A immunoprecipitation and western blot that TOP2A was strongly associated with Brg1 and BAF155 subunits of the BAF complex regardless of ARID1A status (S4D Fig). While previous studies have suggested that ARID1A is a key subunit for TOP2A-BAF interaction [11], in this cell line other protein contacts, presumably with ARID1B, are sufficient to maintain a TOP2A-BAF interaction. To look at whether TOP2A recruitment to specific loci was altered we performed TOP2A ChIP analysis and found reduced TOP2A binding at R-loop prone loci (Fig 4D). These loci had increased DRIP signal (Fig 1E) showing an inverse relationship where, upon ARID1A deletion, TOP2A recruitment decreases and S9.6 binding increases. Therefore, for some loci, our data support a model in which deletion of ARID1A reduces BAF binding, fails to recruit TOP2A, and leads to
Fig 2 legend (partial): B, N = 3; * p<0.05, *** p<0.0005 by Fisher's exact test; mean ± SD. For C, N = 3; * p<0.05, *** p<0.0005, **** p<0.0001 by Fisher's exact test; mean ± SD. (D) Neutral comet assay in WT and ARID1A-KO RMG1 cells with a control GFP vector or expressing GFP-RNH1. N = 3; **** p<0.0001 by ANOVA; mean ± SEM.

Perspective
The BAF complex has pleiotropic effects on cells due both to its transcriptional regulatory network and to an array of functions in chromatin topology, DNA repair, and replication. In cancers with disrupted BAF complexes, transcriptional reprogramming likely coincides with chromatin states in the cell of origin to drive oncogenesis in specific tissues [18,31]. In addition, the BAF complex directly interacts with several candidate genome stability regulators such as ATR, BRIT1, p53, and TOP2A [11,12,32,33]. Thus, alterations in BAF can change the targeting or regulation of these proteins and potentially influence genome stability [10]. Shen et al. suggest that ARID1A recruits BAF to double strand breaks through interactions with ATR and promotes end resection [34]. The consequence of ARID1A loss in this setting is a homologous recombination defect and sensitivity to PARP inhibitors [34]. It is worth noting that other studies have suggested that ARID1A loss is associated with PARP inhibitor resistance [35], and therefore there remain important context-specific details to be elucidated for this approach to be successful clinically. Subsequent screening efforts identified ATR inhibition as a potentially effective synthetic lethal strategy to target ARID1A-deficient tumours [19]. Our study frames these observations in a new light by directly demonstrating that ARID1A deletion increases transcription-replication conflicts and R-loop associated genome instability. We suggest that one mechanism by which this R-loop associated instability occurs is through dysregulation of TOP2A recruitment to specific loci in the genome. Previous studies have established that TOP2A binding to the BAF complex affects its topoisomerase activity and genomic localization [11,36]. Our data show loss of proximity ligation between TOP2A and S9.6, along with reduced TOP2A binding at some R-loop prone sites.
This is in the context of retained TOP2A chromatin association and interaction with residual BAF complexes lacking ARID1A. In light of this, we propose that specific regions of the genome normally bound by TOP2A through interactions with ARID1A are now TOP2A deficient and accumulate R-loop associated replication stress. In this model TOP2A is still distributed and active elsewhere in the genome based on other recruitment mechanisms.
Given the many roles of BAF complexes it is likely that some of the observed R-loop associated stress arises due to other mechanisms. First, BAF complexes have been implicated in DSB repair by regulating DNA end resection and Rad51 loading [13] and DNA:RNA hybrid intermediates are now known to be part of some repair reactions [37,38]. In addition, we now know that resection of stalled or reversed replication forks leads to engagement of DSB repair proteins to facilitate fork protection and restart [39,40]. Any roles for the BAF complex in this process are currently unknown. R-loops and TRCs are also associated with specific chromatin states that might influence, or be influenced by, BAF complexes [41,42]. For example, recent work in yeast models has found that H3K4 methylation marks deposited by transcription are associated with reduced TRCs and genome instability by slowing incoming replication forks like 'speed bumps' [43]. In mammalian systems BAF is known to physically associate with H3K4me1 through its BAF45c subunit [44]. If a replication slowing mechanism proposed for this chromatin mark in yeast is conserved to human cells then our fork speed and TRC data are consistent with a chromatin-based TRC avoidance mechanism supported by ARID1A. Importantly, we did observe decreased transcription and faster replication forks in ARID1A-KO cells. Increases in fork speed, and concomitant increases in replication stress, have been reported in other settings. Defects in origin firing that lead to large inter-origin distances are associated with faster replication fork progression [29,45]. Maya-Mendoza et al. show this effect of increased fork speed upon depletion of origin regulators like Treslin and lagging strand processing enzymes like Fen1 and, importantly, describe a new role for PARylation and p53 in suppressing fork speed and preventing replication stress [29]. Given the complexity of the BAF complex's interactions with DNA repair proteins and chromatin state, we currently do not know how the observed fork speed increases are manifested. At present our data describing TOP2A localization and R-loop mediated replication stress in cells lacking ARID1A do not incorporate the increased replication speed into the model. It is possible that these phenotypes of ARID1A-deficient cells are unlinked, or that there are unappreciated connections between replication speed and R-loops in some contexts. Additional work dissecting direct and indirect roles for BAF complexes in replication dynamics will elucidate the potential coordination of replication, chromatin states, binding partners like TOP2A, and genome stability by the BAF complex.
Fig 3A legend (partial): Replication forks were labelled with CldU (green) for 15 minutes, followed by a 15-minute pulse of IdU (red). Replication fork symmetry was measured by calculating the length ratio of CldU/IdU tracks. This is illustrated as C and I, and ratio C/I, on cartoon schematics below fiber images. N = 2; **** p<0.001 by unpaired t-test; mean ± SD.
Fig 4 legend (partial): N = 3; **** p<0.0001 by ANOVA; mean ± SD. (B) Quantification of the relative RPA2-S33P nuclear intensity in WT and ARID1A-KO RMG1 cells treated with or without etoposide (10 μM, 2 hours). N = 3; **** p<0.0001, ** p<0.01 by ANOVA; mean ± SD. (C) Quantification (left) and representative images (right) of foci counts of TOP2A and S9.6 co-localization measured by proximity ligation assay. Foci represent instances when TOP2A is located in close proximity to R-loops. N = 3; **** p<0.0001 by unpaired t-test; mean ± SD. (D) ChIP-qPCR probing for TOP2A in WT and ARID1A-KO RMG1 cells showing loss of TOP2A binding at R-loop prone loci BTBD19 and MYADM, but not at RNaseH1-insensitive sites 5'-TRIM and CALM3 from Fig 1. N = 3; * p<0.05 by t-test; mean ± SEM. https://doi.org/10.1371/journal.pgen.1009238.g004

Materials and methods
Cell culture, antibodies, and transfection
RMG1 cells were cultivated in RPMI-1640 medium (Stemcell Technologies) supplemented with 10% fetal bovine serum (Life Technologies) in 5% CO2 at 37˚C. ARID1A was knocked out in RMG1 cells using CRISPR/Cas9 technology with a gRNA targeting exon 2 of the ARID1A gene (5'-CTTGCTGCGGTCCTGACGGAGG-3'). Targeted sequencing with Illumina MiSeq confirmed a homozygous deletion (c.1615 del C) in the RMG1_AC14 ARID1A knockout single clone. For RNA interference, cells were transfected with siGENOME-SMARTpool siRNAs from Dharmacon (Non-targeting siRNA Pool #1 as si-Cont and si-ARID1A, catalog #D-001206-13 and #M-017263-01). Transfections were done with Dharmafect1 transfection reagent (Dharmacon) according to the manufacturer's protocol and cells were harvested 48 hours after the siRNA administration. For experiments with overexpression of GFP or nuclear-targeting GFP-RNaseH1 (gift from R. Crouch), transfections were performed with Lipofectamine 3000 (Invitrogen) according to the manufacturer's instructions 24 hours after the siRNA transfections. Detailed catalog number and dilution information for all antibodies used in the manuscript can be found in S2 Table.

Immunofluorescence
For all immunofluorescence experiments, cells were grown on coverslips overnight before fixing. For experiments with GFP or GFP-RNH1 overexpression, plasmids were transfected 24 hours post-seeding and cells were fixed 24-48 hours post-transfection. For S9.6 staining, cells were fixed with ice-cold methanol for 10 minutes and permeabilized with ice-cold acetone for 1 minute. For all other stainings (i.e. RPA2-ser33P and FANCD2), cells were fixed with 4% paraformaldehyde for 10 minutes and permeabilized with 0.2% Triton X-100 for 10 minutes on ice. After permeabilization, cells were washed with PBS and blocked in 3% BSA, 0.1% Tween 20 in 4X saline sodium citrate buffer (SSC) for 1 hour at room temperature. Cells were then incubated with primary antibody [S2 Table] for 1 hour at room temperature or overnight at 4˚C. Following a PBS wash, cells were incubated with Alexa-Fluor-488 or 568-conjugated secondary antibodies for 1 hour at room temperature, washed three times with PBS, and stained with DAPI before mounting and imaging on a Leica DMI8 microscope at 100X. ImageJ was used for image processing and quantification [46]. For the quantification of S9.6 intensity in RNaseIII and etoposide treated cells, cells were co-stained with anti-nucleolin antibody so that nucleolin-stained regions could be masked out and only the nuclear S9.6 signal outside of nucleoli was quantified, since nucleoli are hotspots of R-loop staining. For in vitro RNaseH or RNaseIII treatment, cells were treated with RNaseH (New England Biolabs, M0297S) for 2 hours or ShortCut RNaseIII (New England Biolabs, M0245S) for 20 minutes at 37˚C after permeabilization and before blocking.
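The nucleolin masking step described above can be illustrated with a minimal sketch. This is not the ImageJ workflow used in the study; it only shows the underlying arithmetic under stated assumptions. The function name `masked_mean_intensity` and the representation of images and masks as nested lists of equal shape are hypothetical choices made for illustration.

```python
def masked_mean_intensity(s96, nucleus_mask, nucleolin_mask):
    """Mean S9.6 intensity over pixels inside the nucleus but outside
    nucleolin-positive regions.

    All three arguments are 2D nested lists of the same shape
    (an assumption of this sketch): s96 holds pixel intensities,
    the masks hold truthy values where the pixel belongs to the region.
    """
    total = 0.0
    count = 0
    for s_row, n_row, c_row in zip(s96, nucleus_mask, nucleolin_mask):
        for value, in_nucleus, in_nucleolin in zip(s_row, n_row, c_row):
            # Keep nucleoplasmic pixels only: in the DAPI mask, not in nucleolin.
            if in_nucleus and not in_nucleolin:
                total += value
                count += 1
    if count == 0:
        raise ValueError("no nucleoplasmic pixels in this cell")
    return total / count
```

Applied per nucleus, a function of this kind yields one intensity value per cell, which could then be compared between genotypes or treatments.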

Proximity ligation assay and SIRF
PLA experiments were performed using the Duolink PLA kit (Millipore Sigma, DUO92101-1KT), which has been used previously to identify TRCs [22-24]. For all PLA experiments, cells were grown on coverslips overnight before fixing. For S9.6 staining, cells were fixed with ice-cold methanol for 10 minutes and permeabilized with ice-cold acetone for 1 minute. For all other staining (i.e. PCNA and RNAPII), cells were fixed with 4% paraformaldehyde for 10 minutes and permeabilized with 0.2% Triton X-100 for 10 minutes on ice. To inhibit transcription in control experiments, cells were pre-treated with DMSO or 0.8 μM flavopiridol (Sigma, F3055) for 2 hours before fixation. After permeabilization, cells were washed with PBS and blocked for 1 hour at room temperature in Duolink Blocking Solution. Cells were then incubated with primary antibodies [S2 Table] diluted in Duolink Antibody Diluent for 1 hour at room temperature or overnight at 4˚C. Cells were washed twice (5 min each) in PLA Wash Buffer A, after which they were incubated for 1 hour at 37˚C with PLA probe mix (15 μL/coverslip, 1:4 PLA probe in PLA Antibody Diluent). Cells were washed (2x 5 min) with PLA Wash Buffer A, after which they were incubated for 30 minutes at 37˚C with PLA ligation mix (15 μL/coverslip, 1:40 PLA ligase 40X in 1:5 ligation buffer 5X in ultrapure H2O). Cells were washed (2x 2 min) with PLA Wash Buffer A and incubated for 100 minutes at 37˚C with PLA amplification mix (15 μL/slide, 1:80 polymerase solution in 1:5 amplification stock in ultrapure H2O). The cells were washed (2x 10 min) in PLA Wash Buffer B. After an additional wash (1 minute) in 0.01X PLA Wash Buffer B, the slides were mounted in Duolink Mounting Media with DAPI. Imaging was performed on a Leica DMI8 microscope at 100X. ImageJ was used for image processing and quantification. SIRF (in situ protein interactions at nascent and stalled replication forks) was carried out exactly as described previously [20].
Briefly, cells were incubated with 125 μM EdU (Sigma, 900584) for 8 minutes, washed twice with PBS and growth media was replaced. After 2 hours cells were washed with PBS two times before fixation with 3% PFA and permeabilized with 0.25% TritonX-100 in PBS. After blocking, cells were incubated with the following primary antibodies overnight: anti-Mre11 or anti-BRG1 and anti-biotin [S2 Table]. The rest of the protocol follows the PLA protocol above. Control images for SIRF experiments lacking an EdU pulse, or for S9.6-TOP2A PLA assays lacking primary antibodies or with etoposide treatment are shown in S5 Fig.
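As a worked example of the dilution arithmetic in the PLA protocol above, the sketch below computes component volumes for a batch of coverslips. It assumes that a dilution written as 1:N means one part stock in N parts final volume (as for a 40X stock used at 1X); the function name `mix_volumes` and the optional `overage` parameter for pipetting excess are hypothetical and not part of the published protocol.

```python
def mix_volumes(n_coverslips, per_coverslip_ul, dilutions, overage=0.0):
    """Volumes (in microlitres) for a master mix.

    dilutions maps a component name to its dilution denominator,
    e.g. {"ligase": 40, "buffer": 5} for 1:40 and 1:5.
    The remainder of the volume is made up with water.
    overage is a fractional excess (0.1 = 10%), an assumption of this sketch.
    """
    if n_coverslips <= 0 or per_coverslip_ul <= 0:
        raise ValueError("counts and volumes must be positive")
    total = n_coverslips * per_coverslip_ul * (1 + overage)
    volumes = {name: total / denom for name, denom in dilutions.items()}
    water = total - sum(volumes.values())
    if water < 0:
        raise ValueError("dilutions exceed the total volume")
    volumes["water"] = water
    return volumes


# Ligation mix for 4 coverslips at 15 uL each, with no excess:
ligation = mix_volumes(4, 15.0, {"ligase_40x": 40, "ligation_buffer_5x": 5})
```

For four coverslips this gives 1.5 μL ligase, 12 μL ligation buffer and 46.5 μL water in a 60 μL mix; the same function applies to the amplification mix with denominators 80 and 5.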

Comet assay
Neutral comet assay was performed as previously described [22] using the CometAssay Reagent Kit for Single Cell Gel Electrophoresis Assay (Trevigen, 4250-050-K) in accordance with the manufacturer's instructions.

DNA fiber assay
DNA fiber assay was performed as previously described [22] with some modification. Cells were untreated, treated with DMSO, or treated with 1 mM triptolide (Sigma, T3652) for 2 hours and then labeled with CldU only for 15 min, or were pulse labeled for 15 minutes with 30 μM CldU (Sigma, C6891), washed twice with PBS, and then pulse labeled with 250 μM IdU (Sigma, I7125) for 15 min. Cells were then collected and scraped into ice-cold PBS and genomic DNA was extracted with the CombHeliX DNA Extraction kit (Genomic Vision, EXT-001) in accordance with the manufacturer's instructions. DNA fibers were stretched on vinyl silane-treated glass coverslips (Genomic Vision, COV-002_IVD) with an automated Molecular Combing System (Genomic Vision). After combing, the stretched DNA fibers were dehydrated at 37˚C for 2 hours, fixed with MeOH:acetic acid (3:1) for 10 minutes, denatured with 2.5 M HCl for 1 hour, and blocked with 5% BSA in PBST for 30 minutes. IdU and CldU were then detected with the following primary antibodies in blocking solution for 1 hour at room temperature: mouse anti-BrdU (B44) for IdU and rat anti-BrdU [Bu1/75 (ICR1)] for CldU [S2 Table]. After a PBS wash, fibers were incubated with secondary antibodies anti-rat-Alexa488 and anti-mouse-Alexa568 for 1 hour at room temperature. DNA fibers were analyzed on a Leica DMI8 microscope at 100X and ImageJ was used to measure fiber length.
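The two fiber readouts used in this study, the CldU/IdU length ratio and fork speed from total track length, can be sketched as follows. This is an illustration rather than the analysis pipeline used here: the conversion factor `KB_PER_UM` (kilobases per micrometre of combed DNA) is an assumed value that should be replaced with the calibration of the combing system in use, and the function names are hypothetical. The 15-minute pulse length is taken from the protocol above.

```python
from statistics import mean

# Assumed stretching factor for combed DNA (kb per micrometre).
# Illustrative only; not stated in the methods above.
KB_PER_UM = 2.0
PULSE_MIN = 15.0  # duration of each label pulse, from the protocol


def symmetry_ratio(cldu_um, idu_um):
    """Length ratio of the CldU track to the IdU track for one fiber."""
    if idu_um <= 0:
        raise ValueError("IdU track length must be positive")
    return cldu_um / idu_um


def fork_speed_kb_per_min(cldu_um, idu_um):
    """Average fork speed across both pulses, in kb per minute."""
    total_kb = (cldu_um + idu_um) * KB_PER_UM
    return total_kb / (2 * PULSE_MIN)


def summarize(fibers):
    """Mean ratio and mean speed for a list of (CldU, IdU) lengths in um."""
    ratios = [symmetry_ratio(c, i) for c, i in fibers]
    speeds = [fork_speed_kb_per_min(c, i) for c, i in fibers]
    return mean(ratios), mean(speeds)
```

A ratio far from 1 for an individual fiber indicates that the fork progressed unequally during the two pulses, which is how asymmetry is read as stalling, while the speed estimate depends only on the summed track length.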

DRIP and ChIP-qPCR
Cells were crosslinked in 1% formaldehyde for 10 minutes before quenching with glycine for 5 minutes at room temperature, and then lysed in ChIP lysis buffer (50 mM HEPES-KOH at pH 7.5, 140 mM NaCl, 1 mM EDTA at pH 8, 1% Triton X-100, 0.1% Na-Deoxycholate, 1% SDS) and rotated for 1 hour at 4˚C. DNA was sonicated on a Q Sonica Sonicator Q700 for 8 minutes (30 sec ON, 30 sec OFF) to generate fragments of 200-500 bp. For DRIP, the chromatin preps were treated with 20 mg/mL Proteinase K (NEB, 10012S) at 65˚C overnight and total DNA was purified by phenol/chloroform extraction. Protein A magnetic beads (Bio-Rad, 1614013) were first pre-blocked with PBS/EDTA containing 0.5% BSA and then incubated with S9.6 antibody (1:200, clone S9.6, MABE1095, Millipore) in IP buffer (50 mM Hepes/KOH at pH 7.5; 0.14 M NaCl; 5 mM EDTA; 1% Triton X-100; 0.1% Na-Deoxycholate, ddH2O) at 4˚C for 4 hours with rotation. DNA was then added to the mixture and gently rotated at 4˚C overnight. Beads were recovered and washed successively with low salt buffer (50 mM Hepes/KOH pH 7.5, 0.14 M NaCl, 5 mM EDTA pH 8, 1% Triton X-100, 0.1% Na-Deoxycholate), high salt buffer (50 mM Hepes/KOH pH 7.5, 0.5 M NaCl, 5 mM EDTA pH 8, 1% Triton X-100, 0.1% Na-Deoxycholate), wash buffer (10 mM Tris-HCl pH 8, 0.25 M LiCl, 0.5% NP-40, 0.5% Na-Deoxycholate, 1 mM EDTA pH 8), and TE buffer (100 mM Tris-HCl pH 8, 10 mM EDTA pH 8) at 4˚C, two times. Elution was performed with elution buffer (50 mM Tris-HCl pH 8, 10 mM EDTA, 1% SDS) for 15 minutes at 65˚C. After purification with a PCR Cleanup kit (Sigma-Aldrich, 1002801873), nucleic acids were eluted in 100 μL of elution buffer (5 mM Tris-HCl pH 8.5) and analyzed by quantitative real-time PCR (qPCR). For ChIP, DNA and antibody [IgG control, RNA polymerase II, TOP2A; S2 Table] were incubated in IP buffer overnight with rolling at 4˚C.
Antibody-bound DNA was recovered using Protein A or Protein G magnetic beads (Bio-Rad, 1614013 and 1614023), washed in the same way as DRIP samples and treated with Proteinase K and RNase after elution. Then, antibody-bound DNA was purified with a PCR Cleanup kit and analyzed by qPCR. qPCR was performed with Fast SYBR Green Master (ABI, LS4344463) on an AB Step One Plus real-time PCR machine (Applied Biosystems). Primer sequences are listed in S1 Table. qPCR results were analyzed using the comparative CT method. The RNA-DNA hybrid and ChIP DNA enrichments were calculated based on the IP/Input ratio. DRIP positive sites for qPCR in BTBD19 and MYADM were taken from published studies [3] and their ARID1A occupancy was determined by analyzing GEO dataset GSM3392689 (S6 Fig) [47].
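One common way to express an IP/Input ratio from CT values is the percent-of-input calculation, sketched below. The exact formula used in this study is not given in the text, so this is an illustration under stated assumptions: amplification efficiency is taken as 100% (a doubling per cycle), and `input_fraction`, the fraction of the chromatin set aside as input, is a hypothetical parameter with an arbitrary default.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Percent-of-input enrichment from qPCR CT values.

    input_fraction is the fraction of starting material saved as input
    (assumed 1% by default in this sketch). Assumes 100% PCR efficiency.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input CT to what it would be for 100% of the material.
    adjusted_input_ct = ct_input - math.log2(1 / input_fraction)
    return 100.0 * 2 ** (adjusted_input_ct - ct_ip)


def fold_over_control(pct_sample, pct_control):
    """Enrichment of a sample relative to a control IP (e.g. IgG)."""
    if pct_control <= 0:
        raise ValueError("control signal must be positive")
    return pct_sample / pct_control
```

Because the input CT is adjusted by the dilution before subtraction, values from different loci and cell lines are expressed on the same scale and can be compared directly, for example WT versus ARID1A-KO at the same primer pair.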

Nascent RNA labeling
Cells were seeded on coverslips and grown overnight. The next day cells were incubated with 0.5 mM EU for 1 hour. After EU labeling, cells were washed twice with PBS, fixed with 4% PFA and permeabilized with 0.25% Triton X-100 in PBS. EU incorporation was measured with Click-iT RNA Imaging Kits (Invitrogen, C10329) using Alexa Fluor 488 dye according to the manufacturer's instructions. Slides were stained with DAPI before mounting and imaging on a Leica DMI8 microscope at 100X. ImageJ was used for image processing and quantification. As a negative control, cells were first treated with 0.8 μM flavopiridol for 2 hours before EU labeling.

Immunoprecipitation (IP) and western blotting
Whole cell lysates were prepared with RIPA buffer containing protease inhibitor (ThermoFisher Scientific, A32955) and phosphatase inhibitor (Sigma, 4906845001) cocktail tablets. For extraction of nucleoplasmic and chromatin fraction proteins, cells were lysed in 10 mM HEPES pH 7.4, 10 mM KCl, 0.05% NP-40, incubated for 20 min on ice, and centrifuged; the supernatant was removed and the nuclear pellet was kept. The nuclear pellet was lysed with 10 mM Tris-HCl pH 7.4, 0.2 mM MgCl2, 1% Triton X-100, incubated for 15 min on ice, and centrifuged at 14,000 rpm at 4˚C. The supernatant contained the nucleoplasmic proteins. The pellet was lysed with 0.2 N HCl, incubated for 20 min on ice, and centrifuged at 14,000 rpm at 4˚C for 10 min. The resulting supernatant, containing the chromatin proteins, was neutralized with an equal volume of 1 M Tris-HCl pH 8. For IP, cells were lysed in 10 mM HEPES pH 7.4, 10 mM KCl, 0.05% NP-40, incubated for 20 min on ice, and centrifuged; the supernatant was removed and the nuclear pellet was kept. Total nuclear protein was extracted with CHAPS buffer containing protease inhibitor (ThermoFisher Scientific, A32955) and phosphatase inhibitor (Sigma, 4906845001) cocktail tablets. The protein concentration was determined by Bio-Rad Protein assay (Bio-Rad). Protein lysates were pre-cleared with Protein G magnetic beads (Bio-Rad, 1614023) at 4˚C for 2 hours with rotation. Protein G magnetic beads were incubated with TOP2A antibody [3 μg; S2 Table] in CHAPS buffer at 4˚C for 2 hours with rotation. Protein extract was then added to the mixture and gently rotated at 4˚C overnight, after which beads were washed 4 times with CHAPS buffer and boiled in Laemmli Sample Buffer (Bio-Rad, 1610747) for 10 min. Equivalent amounts of protein or IP pulldown samples were resolved by SDS-PAGE and transferred to polyvinylidene fluoride (PVDF) microporous membrane (Millipore, IPVH00010), blocked with 5% skim milk in TBS containing 0.1% Tween 20 (TBST), and membranes were probed with the following antibodies [S2 Table]: ARID1A, β-actin, BAF155, BRG1, H3, Lamin B1, RPA2-ser33P, RanBP3, TOP2A and γH2AX. Secondary antibodies were conjugated to horseradish peroxidase and peroxidase activity was visualized using chemiluminescent HRP substrate (Thermo Scientific).
S4 Fig legend (partial): ARID1A-KO RMG1 cells shows that TOP2A protein levels are not affected by ARID1A loss, and that TOP2A is still strongly associated with chromatin. LaminB1 is included as a control associated with insoluble nuclear material, including chromatin. RanBP3 is included as a control for soluble nuclear material. (D) Immunoprecipitation of TOP2A in ARID1A-WT and KO RMG1 cell lines. Actin is included as an input loading control, TOP2A is shown to confirm robust pulldown. Brg1 and BAF155 co-precipitated with TOP2A relative to an IgG control regardless of ARID1A-WT or KO status. For C and D, molecular weight marker positions are shown (right). (TIF)