TDP-43 mutations link Amyotrophic Lateral Sclerosis with R-loop homeostasis and R loop-mediated DNA damage

TDP-43 is a DNA and RNA binding protein involved in RNA processing, with structural resemblance to heterogeneous ribonucleoproteins (hnRNPs), whose depletion sensitizes neurons to double-strand DNA breaks (DSBs). Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disorder in which 97% of patients, including familial and sporadic cases, are associated with TDP-43 proteinopathies and conditions clearing TDP-43 from the nucleus, but we know little about the molecular basis of the disease. After showing with the non-neuronal model of HeLa cells that TDP-43 depletion increases R loops and associated genome instability, we prove that mislocalization of mutated TDP-43 (A382T) in transfected neuronal SH-SY5Y cells and in lymphoblastoid cell lines (LCLs) from an ALS patient causes R-loop accumulation and an R loop-dependent increase in DSBs and Fanconi Anemia repair centers. These results uncover a new role of TDP-43 in the control of co-transcriptional R loops and the maintenance of genome integrity by preventing harmful R-loop accumulation. Our findings thus link TDP-43 pathology to increased R loops and R loop-mediated DNA damage, opening the possibility that R-loop modulation in TDP-43-defective cells might help develop ALS therapies.

Introduction
TDP-43 is a nuclear RNA binding protein (RBP) that acts as a repressor of HIV-1 transcription by binding to the trans-active response element DNA sequence of the viral genome [1,2]. Like other hnRNP proteins, TDP-43 binds to nascent pre-mRNA molecules when they are released from RNA Polymerase II (RNAPII) and regulates RNA maturation either through sequential interactions with specific RNA binding factors or in collaboration/antagonism with them [3]. TDP-43 is also involved in the regulation of non-coding RNAs such as miRNAs and lncRNAs [4,5]. Thanks to its ability to recognize single-stranded DNA (ssDNA) or single-stranded RNA (ssRNA), with preferential binding to (UG)n-enriched sequences [6], TDP-43 is involved in different steps of mRNA metabolism and in several mechanisms of genome integrity [7], consistent with the idea that RNA metabolism and the DNA damage response (DDR) may be functionally interconnected [8].
Mutations in TDP-43 are associated with sporadic and familial cases of Amyotrophic Lateral Sclerosis (ALS), an adult-onset, progressive neurodegenerative disease caused by the selective loss of upper and lower motor neurons in the cerebral cortex, brainstem and spinal cord [9,10]. TARDBP is a major ALS susceptibility gene, and its mutations are found in 3% of familial and 2% of sporadic ALS cases [11,12]. In particular, the homozygous p.A382T TARDBP variation (A382T TDP-43) is one of the most common missense mutations in familial patients. A382T TDP-43 accumulation in the cytoplasm can reduce its physiological nuclear functions, such as transcription regulation, mRNA splicing and transport [13-15] and miRNA biogenesis [5,9]. Subsequently, the formation of oligomers and aggregates of TDP-43 in the cytoplasm may recruit native TDP-43 or other interactor proteins [16], constituting a gain of toxic function associated with neurodegeneration [17]. TDP-43 aggregates have been identified as a major component of the ubiquitinated neuronal cytoplasmic inclusions deposited in spinal motor neurons in both familial and sporadic ALS patients [18].
In addition to transcriptional autoregulation, TDP-43 can be cleaved into smaller C-terminal fragments (CTFs) by a range of cysteine proteases, including caspases and calpains, before being enzymatically degraded to maintain its physiological levels [9,19]. Moreover, lines of evidence suggest that these CTFs can be produced via translation of an alternative transcript that is upregulated in ALS [20]. Recent studies showed that increased cytosolic sequestration of the poly-ubiquitinated and aggregated forms of mutant TDP-43 correlates with higher levels of DNA strand breaks and activation of DDR factors such as phospho-ataxia-telangiectasia mutated (ATM), phospho-53BP1 and γH2AX in SH-SY5Y lines expressing wild-type (WT) or Q331K mutant TDP-43 [21]. TDP-43 depletion leads to increased sensitivity to various forms of DNA damage, and mutation in the C-terminal glycine-rich low-complexity region (LC domain) is associated with the loss of its nuclear function [22]. In addition, TDP-43 colocalizes with active RNA polymerase II at sites of DNA damage along with the DDR protein BRCA1, participating in the prevention and/or repair of R loop-associated DNA damage [23].
Evidence indicates that a major source of spontaneous DNA damage comes from the accumulation of R-loops, consisting of a DNA-RNA hybrid and a displaced single-stranded DNA (ssDNA) [8]. Non-physiological R loops occur as unscheduled events formed co-transcriptionally that can compromise genome integrity. Increasing evidence [16,24] has highlighted a common association of increased R-loops with a variety of genetic diseases, including neurodegenerative disorders [25]. R-loop formation is enhanced in genomic regions containing highly repetitive DNA, which could facilitate the thermodynamic stabilization of RNA-DNA hybrids [26,27], and in cells mutated in genes encoding factors controlling R-loop homeostasis. Such factors are generally related to RNA processing and export or have DNA-RNA unwinding (helicase) or hybrid-specific ribonuclease (RNase H) activities [28,29]. However, a crucial role in the prevention of R-loop formation is also played by the DDR. Particularly notable is the role of the BRCA2 and BRCA1 DSB repair factors and of the Fanconi Anemia (FA) pathway, especially FANCD2, involved in the repair of inter-strand crosslinks (ICLs) and replication fork blockages [30,31]. Deficiency in any of these factors leads to harmful R-loop accumulation in human cells [8].
All this, together with the fact that a number of neurodegenerative diseases, which highlight a particular sensitivity of the nervous system and motor neurons, are associated with deficiencies in RNA metabolism and DDR, prompted us to investigate whether TDP-43 deficiency, as found in ALS cells, has a role in R-loop homeostasis that could explain the previously described DDR defects of ALS cells. We show that TDP-43 plays a role in preventing R-loop accumulation and R loop-mediated DNA breaks in neuronal and non-neuronal cells and in patient cell lines, thus opening the possibility that R-loop modulation in TDP-43-defective cells might help develop ALS therapies.

TDP-43 depletion causes R loops, DNA damage and FANCD2 repair centers in HeLa cells
A key regulatory role of TDP-43 in essential metabolic processes was previously suggested, since silencing of TDP-43 in HeLa cells leads to dysmorphic nuclear shape, misregulation of the cell cycle, apoptosis and an increase in cyclin-dependent kinase 6 (Cdk6) transcript and protein levels [32]. As a major readout associated with defects in RNA transcription metabolism, we analyzed the accumulation of nuclear DNA-RNA hybrids in TDP-43-depleted HeLa cells (siTDP-43 HeLa cells), a reference cell line commonly used in R loop and genome integrity studies.
Genomic DNA-RNA hybrids in siTDP-43 HeLa cells were first assessed by immunofluorescence microscopy (IF) using the anti-DNA-RNA hybrid S9.6 antibody, determining the levels of the S9.6 signal in the nucleoplasm after subtracting the nucleolar contribution [33,34]. As controls we used HeLa cells transiently transfected with a mock control vector expressing GFP (siC) or overexpressing the RNaseH1 enzyme, which specifically degrades the RNA moiety of hybrids. A slight but significant increase of R loops was observed in siTDP-43 HeLa cells, in which TDP-43 protein levels were reduced by 75% (S1A Fig), in comparison to the siC control (Fig 1A). Efficient RNaseH1 overexpression from the pEGFP-M27 plasmid, as confirmed by IF (S1B Fig), significantly reduced the S9.6 signal, confirming that the signal detected corresponded to R-loops (Fig 1A). A comparative analysis of the S9.6 signal intensity obtained upon depletion of other cellular factors that protect HeLa cells from R loops shows that the signal increase after TDP-43 depletion was similar to that obtained by depletion of mRNP processing factors such as THOC1, UAP56, SETX, AQR and DDX23 (S1C Fig). Next, we determined R-loop accumulation by the more accurate method of DNA-RNA immunoprecipitation (DRIP)-qPCR, based specifically on the purification of genomic DNA-RNA hybrids of different sizes. In this case the S9.6 signal was determined for genes expressed at different levels such as APOE, RPL13A, WDR90, EGR1 and MIB2, which have been previously validated for the detection of R loops [31,33,35], and for the poorly expressed SNRPN gene, used as negative control [35,36]. We detected accumulation of DNA-RNA hybrids in the analysed genes in siTDP-43 HeLa cells compared to the siC HeLa cells, obtaining a significant increase in all genes tested (Fig 1B), whereas the SNRPN negative control did not show R loop accumulation (S1D Fig), further supporting the validity of our DRIP-qPCR methodology for R loop detection. Importantly, RNaseH treatment induced a dramatic signal decrease, confirming that signals were R-loop specific (Fig 1B).

PLOS GENETICS
TDP-43 ALS protein control R loop homeostasis and associated DNA damage

Fig 1. A) HeLa immunostaining with anti-S9.6 antibody and anti-nucleolin antibody. The graph shows the median of the S9.6 intensity per nucleus after nucleolar signal removal. Around 300 cells from three independent experiments were considered. Scale bar: 25 μm. ***, P < 0.0002; **, P < 0.001 (Mann-Whitney U test, two-tailed). B) DRIP-qPCR using the anti-S9.6 antibody at RPL13A, APOE, WDR90, EGR1 and MIB2 in siC HeLa and siTDP-43 HeLa cells. Pre-immunoprecipitated samples were untreated (-) or treated (+) with RNaseH, as indicated. Data represent mean ± SEM from three independent experiments. *, P < 0.05; **, P < 0.01; ***, P < 0.001 (unpaired t test, one-tailed). In all cases, the absence of an asterisk indicates that the difference is not significant. https://doi.org/10.1371/journal.pgen.1009260.g001
Then, we investigated the functional impact of nuclear DNA-RNA hybrid enrichment on the DDR, given that hybrids have been shown to enhance transcription-replication conflicts [37]. As can be seen in Fig 2A, γH2AX foci, as determined by IF, were significantly increased in siTDP-43 compared to siC HeLa cells. γH2AX foci significantly decreased after RNaseH1 overexpression, indicating that the damage caused by TDP-43 depletion is R-loop mediated. It has been shown that the Fanconi Anemia (FA) repair pathway is critical to resolve R loop-mediated DNA breaks resulting from transcription-replication collisions and that the FA factors work at the collisions [30,31,38-40]. Therefore, we tested whether the damage generated by TDP-43 depletion was signaled by the FA pathway, for which we used the FANCD2 component [38]. As can be seen in Fig 2B, FANCD2 foci were significantly increased in siTDP-43 HeLa cells compared to the siC control. Importantly, this increase was reduced by RNaseH1 overexpression, proving that TDP-43 depletion is responsible for an accumulation of the Fanconi Anemia repair factor caused by R-loop accumulation. The result is consistent with the idea that FANCD2 accumulates at R loop-containing sites at which the replication fork is blocked, similarly to what occurs upon inactivation of other RNA metabolic factors that leads to R-loop accumulation [38,41].

Genome-wide co-localization of TDP-43 at expressed genes
Our results show that TDP-43 depletion causes R-loop accumulation, R loop-dependent DNA damage and accumulation of the transcription-replication collision-associated FANCD2 repair foci, similar to depletion of other mRNP processing factors that function co-transcriptionally together with RNA polymerase II (RNAPII). Indeed, genome-wide ChIP-seq data obtained in K562 erythroblastoma cells (ENCODE Project; ENCSR033VAZ entry) reveal that TDP-43 colocalizes with expressed genes defined by RNA-seq (Fig 3A). 6011 out of the 6245 genes recruiting TDP-43 correspond to actively expressed genes (Fig 3B). Consistently, TDP-43 occupancy is significantly higher in expressed genes compared to genes with low or non-detectable expression, with a preference towards the 5' end of the genes (Fig 3C and 3D). Interestingly, analysis of the genes reported to be prone to accumulate R loops in K562 cells [42] reveals that 4809 of the 6245 genes are enriched in TDP-43. These results support that the TDP-43 RBP is present at expressed genes that are prone to accumulate R loops, where it might participate in mRNP biogenesis, similarly to what has been proposed for other co-transcriptional RNA binding factors [8]. A remaining question is which specific role TDP-43 may play during transcription of those genes.
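As a quick arithmetic check, the gene counts quoted above can be turned into percentages. The sketch below is illustrative only: the variable and function names are ours, and it assumes that the 4809 figure counts TDP-43-bound genes that are also R loop-prone, which is one reading of the sentence above.

```python
# Gene counts quoted in the text (K562 ChIP-seq / RNA-seq analysis).
tdp43_bound_genes = 6245        # genes recruiting TDP-43
bound_and_expressed = 6011      # of those, actively expressed
bound_and_r_loop_prone = 4809   # of those, R loop-prone (assumed reading)


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


expressed_pct = percent(bound_and_expressed, tdp43_bound_genes)
r_loop_pct = percent(bound_and_r_loop_prone, tdp43_bound_genes)

print(f"TDP-43-bound genes that are expressed: {expressed_pct}%")
print(f"TDP-43-bound genes that are R loop-prone: {r_loop_pct}%")
```

Under these assumptions, roughly 96% of TDP-43-bound genes are expressed and about 77% overlap with the R loop-prone set.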

Cytoplasmic mislocalisation of mutated TDP-43 causes R-loop accumulation and leads to activation of the DDR and the Fanconi Anemia pathway
In ALS patients harboring TDP-43 mutations, TDP-43 mislocalizes from the nucleus to the cytoplasm in detergent-resistant aggregated forms, both full-length (43 kDa) and fragmented (35 kDa, 25 kDa), which can be ubiquitinated and hyperphosphorylated [43]. We hypothesized that TDP-43 mislocalization due to missense mutations could have an impact on R-loop accumulation and DNA damage in ALS. For this we moved our studies to SH-SY5Y neuroblast-like cells, commonly used as in vitro models of neuronal function and differentiation and to assay ALS-related mutations. To determine TDP-43 cellular localization, we performed IF microscopy in basal SH-SY5Y, SH-TDP+ (overexpressing a GFP-tagged TDP-43 WT form), SH-TDP382 (expressing the GFP-tagged p.A382T TDP-43 mutant form) and SH-TDP294 (expressing the GFP-tagged p.G294V TDP-43 mutant form) cells using an anti-TDP-43 or anti-GFP antibody, able to detect the wild-type nuclear protein and the cytoplasmic full-length and fragmented forms. In all cases, TDP-43 overexpression levels, as determined by Western blot, were similar (S2A Fig), excluding the possibility that a potential different phenotype could be attributed to different TDP-43 levels rather than to the dysfunction caused by the mutation itself. Western blot analysis of nuclear and cytoplasmic fractions showed an increased level of GFP-fused proteins in the cytoplasm of SH-TDP+ and SH-TDP382, reducing the total nuclear fraction, including endogenous and GFP-fused TDP-43, with respect to the SH-SY5Y control. In the case of SH-TDP294, the ratio between nuclear and non-nuclear signal did not show the same tendency, raising the possibility that protein dysfunction can be causative of the phenotype. However, in all cases (SH-TDP+, SH-TDP382 and SH-TDP294) the cytoplasmic fraction of the GFP-overexpressed protein was clearly higher compared to the SH-SY5Y control, consistent with a cytoplasmic mislocalization. From here on we focused our study on the A382T mutant.
We confirmed by flow cytometry and IF that overexpression of both TDP+ and TDP382 occurred at similar levels (S2C and S2D Fig). In addition, in SH-TDP+, the RBP was localized preferentially in the perinuclear area compared to non-transfected SH-SY5Y cells, in which localization was predominantly nucleoplasmic. TDP-43 nuclear localization was significantly decreased in SH-TDP382 cells compared both to non-transfected SH-SY5Y and SH-TDP+ (S2E Fig). These changes could not be attributed to differences in nuclear area as these did not show any significant difference (S2F Fig), confirming that the A382T mutation is linked to TDP-43 cytoplasmic mislocalization with formation of inclusions or aggregates as previously reported for this and other mutations [44-46]. Results were the same when using the anti-TDP-43 or anti-GFP antibodies.
Next, we tested whether mislocalization of TDP-43 could impact genomic integrity and R-loop accumulation in these cells. We first assayed whether overexpression of wild-type TDP-43 and of mutated and mislocalized TDP-43 affected R-loop accumulation. Since both mutation and overexpression of TDP-43 may affect its physiological role in miRNA biogenesis, we added in this case an additional treatment with RNaseIII, which specifically degrades dsRNAs, to counteract the reported ability of S9.6 to detect dsRNAs [47,48]. A significant increase of nuclear S9.6 intensity was detected in SH-TDP+, SH-TDP382 and SH-TDP294 cells compared to SH-SY5Y (Fig 4A and S3A Fig). Furthermore, RNaseH1 overexpression, as confirmed by IF (S3B Fig), caused a decrease of the nuclear S9.6 signal in both SH-TDP382 and SH-TDP294 cells, but not in SH-TDP+ cells. Therefore, we conclude that ALS TDP-43 mutations also cause R-loop accumulation. From now on we focused our study on the TDP-43 A382T mutant protein as a representative mutant protein linked to ALS.
We performed DRIP-qPCR in the neuroblastoma cell lines at genes previously validated for R loop detection [33,35,36], to confirm the results supporting a role of TDP-43 in preventing R-loop accumulation in human cells. Consistent with the IF results, a statistically significant increase in R loops was detected by DRIP-qPCR in the RPL13A, WDR90 and EGR1 genes in SH-TDP382 cells, although not in APOE, likely due to an unknown cell type-specific effect (Fig 4B). The levels were minimal for the SNRPN negative control (S3C Fig). RNaseH treatment dramatically decreased the levels of the signal in all cases, confirming that the signal detected was specific for nuclear DNA-RNA hybrids. Results were thus consistent with those obtained in TDP-43-depleted HeLa cells. Interestingly, an increase in the immunoprecipitated material was also detected in some genes in SH-TDP+ cells (Fig 4B). In this sense, it is worth noting that overexpression of wild-type TDP-43 has been reported to be detrimental to cells [49], and that it also led to a minor but significant decrease in nuclear TDP-43 abundance (S2B Fig). This could explain the minor but significant R-loop increase observed by DRIP at some genes in SH-TDP+, consistent with high R-loop levels correlating with a low nuclear TDP-43 content, as is the case in SH-TDP382 cells.

Next, we tested whether the origin of such DNA damage was due to an increase in transcription-replication collisions enhanced by R-loops. We determined the levels of FANCD2 foci as previously reported. Notably, FANCD2 foci were significantly increased in SH-TDP382 mutant cells compared to SH-TDP+ cells, and this increase was reduced by RNaseH1 overexpression (Fig 5B). Therefore, the ALS pathogenic TDP-43 mutation in the analysed neuronal model leads to a functional effect comparable to that observed in silenced HeLa cells. The pathogenic TDP-43 mutation causes an increase in DNA breaks derived from R-loop accumulation, which promotes transcription-replication collisions that are processed by the FA pathway, as reported for other cases of recombinogenic R-loops [30,31,38-40].

Accumulation of R-loops in p.A382T TDP-43 mutated lymphoblastoid cell lines
Next, we used lymphoblastoid cell lines (LCLs) derived from a patient carrying the p.A382T TDP-43 mutation (LCL-TDP382), a sporadic ALS patient (LCL-SALS) and a healthy control (LCL-CTL) (see Materials and methods) to confirm the role of TDP-43 in R-loop removal in ALS. We performed IF microscopy for S9.6 and TDP-43 in the three cell lines with two fixation methods, methanol (Fig 6A) and paraformaldehyde (S4A Fig). R-loop quantification of the IFs confirmed a significant R-loop accumulation in LCL-TDP382 cells using both fixation methods. Concomitantly, with both fixation methods a decreased level of TDP-43 signal was detected in the nucleus of LCL-TDP382 in comparison to LCL-CTL and LCL-SALS (Fig 6A and S4A Fig). This is in accordance with the mislocalisation of the mutated protein to the cytoplasm, causing the loss of its nuclear physiological function [50]. Again, these changes in nuclear content could not be attributed to changes in nuclear area, as this remained the same (S4B and S4C Fig). Moreover, the S9.6 signal colocalized with TDP-43 in the perinuclear area of LCL-TDP382 cells, in contrast to LCL-CTL and LCL-SALS. R-loop levels were also determined in LCLs by flow cytometry, which showed an increased S9.6 intensity in the orange peak associated with LCL-TDP382 in comparison to the blue peak associated with LCL-CTL (Fig 6B). The positive signal in LCL-TDP382 represented by the orange peak was clearly suppressed by RNase H treatment of the same sample, detected as the green peak, confirming that the detected signal corresponds to DNA-RNA hybrids (Fig 6B).
Finally, we investigated the possibility that TDP-43 could have a role on DNA-RNA hybrid-enriched chromatin, in which case we should expect some kind of physical association. Therefore, we asked whether TDP-43 and genomic DNA-RNA hybrids associate by performing a co-immunoprecipitation (co-IP) in chromatin (Chr) fractions from the three cell lines (S5A Fig). At the same time, we extracted whole lysate (WL) fractions from the same samples as control (S5B Fig). In the Chr fraction, co-immunoprecipitation could be observed with the S9.6 antibody. In LCL-TDP382, the TDP-43 mutant protein showed lower levels of co-IP, while in the WL fraction of the same sample the co-IP signal was higher than in LCL-CTL (S5B Fig), which suggests that the mutant full-length TDP-43 was not able to interact with R loop-enriched chromatin due to its sequestration in the cytosolic compartment of the cell [50], as seen before (Fig 6A and S4A Fig). Interestingly, the truncated TDP-35 form detected in the Chr fraction of LCLs shows high levels of S9.6 co-IP in the WL fraction of LCL-TDP382 in comparison with LCL-CTL and LCL-SALS. This specific C-terminal mutation may predispose TDP-43 to fragmentation into CTFs, which, as reported in the literature, are transported out of the nucleus and accumulate in complexes with RNA transcripts [49]. Depletion of TDP-43, as achieved by either mislocalization to the cytoplasm or siRNA depletion in different cell types, causes a significant increase in harmful R-loops that leads to DNA breaks and FANCD2 foci. The results suggest that the TDP-43 RNA-binding protein has a key role in preventing R-loop accumulation as a safeguard of genome integrity.

Discussion
Silencing TDP-43 by siRNA in HeLa cells led to a significant increase of R-loop signal by S9.6 IF compared to the siC control. This was confirmed by the reversion of the signal upon RNaseH1 overexpression. DRIP-qPCR revealed a substantial R-loop presence at the five protein-coding genes tested. These include the RPL13A ribosomal protein gene, whose deficiency could lead to alteration of protein homeostasis and RNA metabolism [53]. In this sense, it is worth noting that TDP-43 interacting protein networks have been shown to include RPL13A [54] and that both iPSCs derived from C9orf72-mutant ALS patients and TDP-43-EGFP overexpressing iPSCs presented a set of commonly destabilized RNAs involved in the ribosomal pathway [55]. However, there is no evidence that these features have any relation to R loop accumulation at RPL13A; indeed, the phenomenon is also observed in non-ribosomal protein genes (Fig 1).
The absence of nuclear TDP-43 affects the DDR, consistent with previous reports [23]. We showed that siTDP-43 HeLa cells had a significant increase of DSBs as determined by γH2AX foci. Importantly, this DSB increase was R loop-dependent, as it could be fully reverted by RNaseH1 overexpression, and it was accompanied by an accumulation of FANCD2 foci that was also R loop-dependent. The result indicates that TDP-43 prevents the co-transcriptional accumulation of harmful R-loops that promote transcription-replication conflicts that have to be resolved by the FA pathway, consistent with the previously reported role for the FA pathway [37]. These results are particularly meaningful in the context of the putative role of TDP-43 in nuclear RNA processing, yet to be properly defined. It is well established that depletion of other factors involved in nuclear RNA metabolism, such as the THO complex, UAP56, AQR, DDX23 or SRSF1, causes phenotypes similar to those described here for TDP-43 mutations [8]. They all cause R loop accumulation and R loop-dependent genome instability that, in cases like THO- and UAP56-depleted cells, have been shown to correlate with an increase in FANCD2 foci and transcription-replication conflicts [38,42]. Indeed, a very recent report that came out while ours was under review shows by DNA combing that siTDP-43 cells undergo replication stress [56], consistent with our view. As with those other RBP factors, genome-wide TDP-43 occupancy correlates with transcriptionally active genes, in which most R loops accumulate (Fig 3). As shown for other mRNA processing factors, a paradigm of which is the THO complex, we propose that it is the suboptimal action of TDP-43 as an RBP during transcription that is linked to the appearance of R loops (Fig 6C). However, further research is required to completely understand how RBPs and a number of DDX proteins protect from R loops.
Notably, our assay of the impact of the TDP-43 pathogenic ALS mutation and overexpression in a neuroblastoma cell line, SH-SY5Y, revealed that the pathogenic A382T and G294V mutations in SH-SY5Y cells also affected the TDP-43 role controlling R-loop homeostasis, leading to a higher detection of genomic RNA-DNA hybrids, as detected by IF and DRIP-qPCR. As expected, DSBs and replication blockage detected by γH2AX and FANCD2 foci, respectively, were also increased in an R loop-dependent manner.
The effect of TDP-43 deficiency on R-loop homeostasis could be related to the accumulation of aberrant transcripts and hybrids that are trapped in persistent RNA-processing foci present in the cytoplasmic compartment of the cell, as previously reported [57]. Our analysis of ALS patient-derived LCLs, a valid cellular model to study the disease that carries typical features of degenerating MNs in ALS (i.e. protein aggregation, mitochondrial dysfunction etc.) [46], also revealed an increased level of nuclear RNA-DNA hybrids in LCL-TDP382 as well as co-localization of the S9.6 signal with a fraction of TDP-43 in the perinuclear area. It is possible that either sequestering of the misfolded and mutated form of TDP-43 in inclusions could cause a loss of its nuclear function or the formation of TDP-43 aggregates in the cytoplasm could recruit native TDP-43 or other interactor proteins, constituting a gain of toxic function [58]. The detection of a higher S9.6 signal in both LCL-SALS and LCL-TDP382 suggests that R-loops may be a general condition in ALS, potentially aggravating the RNA metabolism dysregulation and neurotoxicity that appear to be major contributors to the pathogenesis of this neurodegenerative disease.
It is worth noting that RNaseIII treatment was required to detect RNH-sensitive S9.6 reactivities (Fig 4A). Knowing that S9.6 can also detect dsRNAs [47,48], this result indicates that TDP-43 overexpression also leads to an accumulation of dsRNA molecules. Interestingly, TDP-43 has been reported to co-localize with Dicer and Ago2, but their interaction is inhibited by aggregate formation in response to cellular stressors or by overexpression of human TDP-43 [58]. There is no evidence that overexpression and mutation of TDP-43 lead to double hairpin pre-miRNA accumulation in neuronal models, so it is formally possible that the dsRNA accumulation observed in SH-SY5Y cells overexpressing TDP-43 could be associated with inhibition of the Dicer processing function, resulting in a loss of maturation of pre-miRNAs, which are present as dsRNA hairpin structures, into mature miRNAs. However, it is also possible that an excess of cellular RBPs would prevent normal RNA metabolism by binding to any RNA molecule having secondary structure segments. A negative impact of RBP overexpression on RNA metabolism has been reported in other cases [59]. These are possibilities to explore in the future.
The chromatin fraction of TDP382 mutated LCLs showed a weaker association between TDP-43 and the S9.6 signal that was not observed in whole cell extracts, consistent with a lower TDP-43 presence in the nucleus. This decrease was not observed in cells from a sporadic ALS patient either (Fig 6). It is known that in ALS pathological conditions TDP-43 can generate CTFs such as 35 kDa fragments either upon cleavage by caspases at intrinsic caspase cleavage sites [11] or via translation of an upregulated alternative transcript [20]. Due to the lack of a nuclear localization signal (NLS), TDP-35 CTFs mislocalize to the cytoplasm, where they may associate with RNA, forming cytoplasmic inclusions [60]. Indeed, biochemical analysis suggests that TDP-35 facilitates aggregate assembly, promoting inclusion formation [61], and might transport different types of RNA structures. TDP-35 can also recruit full-length TDP-43 from its functional nuclear localization to cytoplasmic deposits [62], and TDP-43 continuously shuttles between nucleus and cytoplasm in a transcription-dependent manner [63]. The higher TDP-35-S9.6 co-IP in the WL fraction of LCL-TDP382 compared to the other LCLs may reflect inclusions formed by dsRNAs.
It is becoming clear that impaired RNA regulation and processing is a central feature in ALS pathogenesis. Our study reinforces the need to understand the specific impact of ALS on RNA metabolism, in particular in cells defective in the TDP-43 RBP, beyond its effect on the formation of RNA inclusion bodies in the cytoplasm. Even though this is a common readout of ALS, our study, showing an increase in genomic R-loops, DNA damage and Fanconi Anemia foci, as expected for obstacles blocking replication, suggests that an important cause of the disease may be linked to impairment of nuclear RNA biogenesis and its impact on the DDR (Fig 6C). The application of RNA-based therapies to modulate gene and subsequent protein expression is an attractive therapeutic strategy that could be considered in the future for the treatment of ALS and other neurodegenerative diseases.

Ethics statement
The study protocol was approved by the Ethical Committee of the IRCCS Mondino Foundation (Pavia, Italy). Subjects participating in the study signed an informed consent (Protocol n˚375/04 - version 07/01/2004). The study conformed to the standards of the Declaration of Helsinki.

EBV immortalization of cells from ALS patients
ALS diagnosis was made according to the revised El Escorial Criteria [65]. A healthy volunteer (a 49-year-old male), free from any pharmacological treatment and pathology, was recruited at the Transfusion Centre of the IRCCS Policlinico S. Matteo Foundation in Pavia (Italy). Peripheral Blood Mononuclear Cells (PBMCs) from 2 ALS patients (one sporadic ALS patient not harboring mutations in the SOD1, FUS/TLS, TARDBP or C9ORF72 genes, and one ALS patient carrying a homozygous p.A382T TARDBP missense mutation) and 1 control were immortalized with EBV as previously described [46]. PBMCs were isolated from peripheral venous blood by Histopaque-1077 (Sigma-Aldrich) following the manufacturer's instructions. Briefly, 5 × 10^6 PBMCs were re-suspended in RPMI 1640 medium (Sigma-Aldrich), supplemented with 20% fetal bovine serum (FBS; Sigma-Aldrich), 0.3 mg/l L-glutamine, 5% penicillin-streptomycin and cyclosporine A (Sigma-Aldrich). EBV-mix, prepared according to Caputo and collaborators [66], plus RPMI 1640 with cyclosporine A was added to the cells. Cells were incubated at 37˚C in a humidified atmosphere with 5% CO2 for 1 week. The medium was then changed and cells were left in incubation until clusters of growing cells appeared.

Immunofluorescence microscopy
For S9.6 IF analysis in HeLa and SH-SY5Y, cells were fixed with cold methanol for 10 minutes at -20˚C according to the literature [38]. SH-SY5Y cells were treated with 40 U/ml RNase III (1 U/μl, Thermo Fisher Scientific) for 30 minutes at 37˚C in 1X RNase III Reaction Buffer. For γH2AX and FANCD2 IF analysis in HeLa and SH-SY5Y, cells were incubated with a fixation solution (4% PFA, 0.1% Triton-X) for 10 minutes at room temperature (RT) as previously described [47]. For S9.6 and TDP-43 IF microscopy analysis in LCLs, cells were fixed with two methods: one using formaldehyde and cold acetone and the other using cold methanol.
The following antibodies were used: anti-nucleolin antibody (ab50279, Abcam), S9.6 monoclonal antibody (ATCC HB-8730 hybridoma), anti-TDP-43 antibody (Clone: 6H6E12, Proteintech), anti-γ-H2AX antibody (ab2893, Abcam), anti-FANCD2 antibody (sc-20022, SantaCruz), anti-H3S10P antibody (06-570, Sigma-Aldrich). Antibody signals were detected on a Leica DM6000 microscope equipped with a DFC390 camera (Leica). Data acquisition was performed with LAS AF (Leica). The R-loop signal in the nucleoplasm of HeLa and SH-SY5Y cells was quantified with ImageJ by measuring the S9.6 integrated density in the DAPI-stained nucleus and subtracting the nucleolar contribution, detected with the nucleolin antibody. γH2AX and FANCD2 foci were scored in IF of HeLa and SH-SY5Y cells as the relative number of cells containing >10 foci in the nucleus in each condition. In each IF experiment, around 100 cells were counted to improve comparability between experiments.
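The two quantifications described above can be illustrated with a minimal Python sketch. It is not the ImageJ procedure used in the study: the function names, the representation of images as nested lists, and the binary masks for the DAPI-stained nucleus and the nucleolin-positive region are all assumptions made for illustration. Only the logic (nuclear S9.6 integrated density minus the nucleolar contribution, and the percentage of cells with >10 foci) follows the text.

```python
def nucleoplasmic_s96_density(s96, nuclear_mask, nucleolar_mask):
    """Sum S9.6 pixel intensities inside the nucleus but outside nucleoli.

    s96            -- 2D list of pixel intensities (hypothetical input format)
    nuclear_mask   -- 2D list, truthy where DAPI marks the nucleus
    nucleolar_mask -- 2D list, truthy where nucleolin marks nucleoli
    """
    total = 0.0
    for r, row in enumerate(s96):
        for c, value in enumerate(row):
            # Keep nuclear pixels, drop those that also belong to a nucleolus.
            if nuclear_mask[r][c] and not nucleolar_mask[r][c]:
                total += value
    return total


def percent_cells_with_foci(foci_per_cell, threshold=10):
    """Percentage of cells with more than `threshold` foci in the nucleus."""
    if not foci_per_cell:
        raise ValueError("no cells were counted")
    positive = sum(1 for n in foci_per_cell if n > threshold)
    return 100.0 * positive / len(foci_per_cell)
```

With per-cell foci counts from roughly 100 cells per condition, `percent_cells_with_foci` returns the value that would be compared between conditions; a cell with exactly 10 foci is not counted as positive, in line with the ">10 foci" criterion.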
Nuclear area was obtained by measuring the DAPI area using FIJI (ImageJ) [67].

DNA-RNA immunoprecipitation-qPCR
DNA-RNA immunoprecipitation (DRIP) was performed on HeLa and SH-SY5Y cells as previously described [58,68]. R-loop levels were quantified as a function of input DNA, which for each sample corresponded to 10% of the total amount. Primers used are listed in S1 Table.
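Quantification relative to a 10% input can be sketched as follows. The text does not give the exact formula, so this example assumes the commonly used percent-of-input calculation with 100% amplification efficiency (a doubling per cycle); the function name and the Ct values in the usage note are hypothetical.

```python
def percent_input(ct_ip, ct_input, input_fraction=0.10):
    """DRIP signal as a percentage of input DNA.

    ct_ip          -- qPCR Ct of the immunoprecipitated sample
    ct_input       -- qPCR Ct of the input sample
    input_fraction -- fraction of the sample kept as input (10% in the text)

    Assumes perfect doubling per cycle, which is an assumption of this sketch.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # A lower Ct in the IP than in the input means more template was recovered.
    return 100.0 * input_fraction * 2.0 ** (ct_input - ct_ip)
```

For example, an IP sample that crosses the threshold three cycles later than its 10% input (`percent_input(28.0, 25.0)`) corresponds to one eighth of the input signal, that is, 1.25% of the total.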

Flow cytometry analysis
LCLs from a control (LCL-CTL), a sporadic patient (LCL-SALS) and an A382T TDP-43 mutant patient (LCL-TDP382) were incubated with a viability dye for 15 minutes (Zombie Violet Fixable Viability Kit, BioLegend). LCLs were then incubated for 20 minutes with anti-CD19 antibody to identify B lymphocytes. Cells were fixed and permeabilized using a saponin-based kit (Fixation/Permeabilization Solution Kit, BD) as described [69]. As a negative control for R-loop presence, cells were treated with 60 U/ml of ribonuclease H (RNase H, 5,000 units/ml, NEB) in RNase H buffer at 37˚C for 1 hour. As a negative control for ssRNAs, cells were treated with 100 μg/ml ribonuclease A (RNase A, 10 mg/ml, Thermo Fisher Scientific) in 0.3 M NaCl buffer at 37˚C for 1 hour. Finally, cells were stained for 1 hour with conjugated anti-S9.6 antibody (PE/R-Phycoerythrin Conjugation Kit, Abcam) and analysed by flow cytometry (BD FACS Canto II). Logarithmic amplification was used for all channels and FACSDIVA was used for the analysis. In addition, SH-SY5Y transfection efficiency was analysed by flow cytometry, detecting the cellular GFP signal.
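The working concentrations of the RNase H and RNase A controls follow from the stock concentrations given above through a simple dilution (C1·V1 = C2·V2). The sketch below only illustrates that arithmetic; the function name and the 1 ml reaction volume are assumptions, as the reaction volume is not stated in the text.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Volume of stock (in microlitres) needed to reach `final_conc`.

    final_conc and stock_conc must share the same units (e.g. U/ml or ug/ml).
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated")
    return final_conc * final_volume_ul / stock_conc


# Hypothetical 1 ml (1000 ul) reaction volume for both controls.
rnase_h_ul = stock_volume_ul(60, 5000, 1000)      # 60 U/ml from 5,000 U/ml
rnase_a_ul = stock_volume_ul(100, 10_000, 1000)   # 100 ug/ml from 10 mg/ml
```

Under that assumed volume, the RNase H control would take 12 μl of stock and the RNase A control 10 μl per ml of reaction.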

Quantitative PCR analysis
For real-time (RT)-qPCR analysis, cDNA was synthesized using the QuantiTect Reverse Transcription Kit (Qiagen). mRNA expression values of the indicated genes were normalized to the mRNA expression of the HPRT housekeeping gene. RT-qPCR was performed with iTaq Universal SYBR Green Supermix (Bio-Rad) and analyzed on a 7500 FAST Real-Time PCR system (Applied Biosystems, Carlsbad, CA). Primers are listed in S1 Table.

Viability assays
SH-SY5Y cell viability was assessed by Trypan Blue assay. Briefly, the cell suspension was mixed with 0.4% Trypan Blue (Sigma-Aldrich) and counted in three independent experiments with the TC20 automated cell counter (Bio-Rad) to evaluate the percentage of live cells, which was about 75-80%.
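The normalization of mRNA levels to HPRT described under Quantitative PCR analysis can be illustrated with a short sketch. The text does not specify the calculation, so this example assumes the widely used 2^(-ΔCt) approach with technical replicates averaged before normalization; the function name and the Ct values in the usage note are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_hprt):
    """Expression of a target gene relative to HPRT, as 2^(-dCt).

    ct_target -- list of Ct values (technical replicates) for the target gene
    ct_hprt   -- list of Ct values (technical replicates) for HPRT

    Assumes comparable, near-100% amplification efficiency for both primer
    pairs, which is an assumption of this sketch.
    """
    delta_ct = mean(ct_target) - mean(ct_hprt)
    return 2.0 ** (-delta_ct)
```

For instance, a target gene with replicate Ct values of 24.1, 23.9 and 24.0 against HPRT at 22.0 in all replicates has a ΔCt of about 2, so `relative_expression` returns roughly 0.25, a quarter of the HPRT level.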

Genome-wide data collection
TDP43 ChIP-seq data were obtained from the ENCODE project (www.encodeproject.org), entry ENCSR033VAZ, while DRIPc-seq, DRIP-seq and RNA-seq data were gathered from the Gene Expression Omnibus under accession number GSE127979 [42].

Genome-wide data downstream analysis
TDP43 ChIP-seq optimal IDR thresholded peaks from 2 biological replicates were retrieved from ENCODE (ENCFF909RMQ). DRIP-seq and DRIPc-seq sequencing reads (GSE127979) were mapped to the human hg38 canonical reference genome using Bowtie2 [71]