Phe-Gly motifs drive fibrillization of TDP-43’s prion-like domain condensates

Transactive response DNA-binding Protein of 43 kDa (TDP-43) assembles various aggregate forms, including biomolecular condensates or functional and pathological amyloids, with roles in disparate scenarios (e.g., muscle regeneration versus neurodegeneration). The link between condensates and fibrils remains unclear, just as the factors controlling conformational transitions within these aggregate species: Salt- or RNA-induced droplets may evolve into fibrils or remain in the droplet form, suggesting distinct end point species of different aggregation pathways. Using microscopy and NMR methods, we unexpectedly observed in vitro droplet formation in the absence of salts or RNAs and provided visual evidence for fibrillization at the droplet surface/solvent interface but not the droplet interior. Our NMR analyses unambiguously uncovered a distinct amyloid conformation in which Phe-Gly motifs are key elements of the reconstituted fibril form, suggesting a pivotal role for these residues in creating the fibril core. This contrasts the minor participation of Phe-Gly motifs in initiation of the droplet form. Our results point to an intrinsic (i.e., non-induced) aggregation pathway that may exist over a broad range of conditions and illustrate structural features that distinguishes between aggregate forms.


Introduction
Transactive response DNA-binding Protein of 43 kDa (TDP-43) is an RNA-binding protein that forms aberrant aggregates associated with disease contexts, including frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS) [1], or the recently reported "LATE" dementia, which had been misdiagnosed as Alzheimer disease [2]. TDP-43 also aggregates into functional amyloids [3] and participates in the assembly of various biomolecular condensates [4,5], which highlight the need for structural studies to understand the interplay between these aggregate forms. Structurally, TDP-43 contains a well-folded N-terminal [6] and 2 RNA-Recognition Motifs (RRM) domains [7] and a low-complexity, prion-like domain (PrLD) at its carboxyl terminus, which seems pivotal for the formation of the diverse amyloid and droplet forms. The TDP-43 PrLD is large (residues 267 to 414), comprising almost half the length of TDP-43, and is intrinsically disordered, with the exception of a short hydrophobic segment interface; and (iii) that such amyloid cores are stabilized by Phe residues from GAroS regions, located outside the central region containing the hydrophobic helix.

The TDP-43 PrLD intrinsically self-aggregates at pH 4 to enable droplet formation
We observed that the PrLD of TDP-43 aggregated at pH 4 in the absence of salt or RNA, which, to the best of our knowledge, was unexpected [13,14]. Because induced aggregation at higher pH values is reported to be driven by the central region containing a helical segment [9,10], we sought to confirm whether this general mechanism is also operative under our conditions, using liquid-state NMR (LSNMR).
Considering that the TDP-43 PrLD is of low complexity (i.e., poor amino acid variability), aggregation prone, and chiefly disordered, the LSNMR 13 CO, 13 Cα, 13 Cβ, 15 N, 1 HN, 1 Hα and 1 Hβ correlations were obtained using a nonconventional strategy that affords robust and complete assignments with a minimal set of experiments. In brief, this approach is based on the unambiguous determination of sequential connectivities between all spin systems (except prolines) by direct correlation of the 1 HN and 15 N amide groups of one amino acid residue to those of the next residue in the sequence, followed by the assignment of each 13 Cα and 13 Cβ nuclei to corroborate the identity of the sequential fragments interrupted by prolines (see Methods and S1 and S2 Tables for a more detailed description). This methodology proves useful for low-complexity stretches and aggregation-prone samples and allowed us to assign all residues under the conditions of this study (Biological Magnetic Resonance Bank [BMRB] entry 50154) [26].
The 1 H- 15 N HSQC spectrum of TDP-43 PrLD displayed poor chemical shift dispersion in the 1 HN dimension, consistent with a disordered domain (Fig 1A), and the NMR assignments confirmed the presence of the α-helix in the central region of TDP-43 PrLD (at residues 320 to 340) (Fig 1B). In addition to the helical region, Li and colleagues have shown that aromatic residues within the GAroS segments outside the central region are also relevant to some extent to the assembly process [11]. In order to interrogate the role of the helical component and the GAroS regions in the intrinsic aggregation of TDP-43 PrLD at pH 4, we analyzed peak intensities of the 1 H-15 N HSQC spectrum. This strategy has been used to characterize the helix-driven aggregation pathway of TDP-43 PrLD upon droplet induction with salt at higher pH values [10,11]. The peak intensities observed for the first 1 H-15 N HSQC spectra recorded within the first hour from sample preparation at 2 different concentrations (55 and 110 μM) revealed nonhomogeneous peak intensities throughout the PrLD sequence, with reduced intensity for the central residues containing the helical segment ( Fig 1B, black bars), which is interpreted as shift from the monomeric toward the aggregate state mediated by helix-helix intermolecular interactions, in line with previous observations [9][10][11][12]. Over the course of approximately 20 hours, the greater signal intensity loss of the central region became more evident at both concentrations ( Fig 1B, red bars). A closer inspection of these data revealed that, in addition to the central region containing the helix, other signals show reductions in signal intensity. Although these are minor changes, they are consistently observed over time and at the 2 concentrations (55 and 110 μM) as lower than average intensity values and mostly map to the 2 GAroS segments that are N-terminal (residues 267 to 310) and carboxyl terminus (361 to 414) to the central 311 to 360 region (Fig 1B, dashed horizontal lines). In particular, we highlight the following stretches of hydrophobic residues: 4 Phe-Gly-rich motifs at positions 276 to 284, 288 to 290, 367 to 368, and 396 to 403; 1 Tyr-Ser-Gly motif at 373 to 376; and 2 Trp-Gly motifs at positions 384 to 386 and 411 to 413 (Fig 1B and 1C). This result correlates very well with mutational analyses that showed that, in addition to the helical region, Phe283, Phe289 (in the GAroS segment that is N-terminal to the helix) and Tyr374, Trp385, Phe401, and Trp412 (in the GAroS carboxyl terminus to the helix) are required for droplet induction, since their mutation to glycines abolish salt-induced aggregation [11]. On this basis, we reasoned that the intrinsic aggregation we observed here does not follow an alternative mechanism with respect to that firmly established in the literature [9][10][11][12].
After approximately 7 (at 55 μM) or approximately 3.5 days (at 110 μM), the signals corresponding to residues within the central region and GAroS segments disappeared (Fig 1A, red  spectrum), and we observed that the NMR samples contained aggregated material displaying extensive droplet formation ( Fig 1D). Overall, these findings evince that (i) the TDP-43 PrLD self-assembles at low pH and without salt, conditions where it was reported to not aggregate Black peaks correspond to a spectrum recorded within 1 hour from sample preparation, the time elapsed between column elution and the start of the NMR experiments (see Methods). The spectrum in red was the last recorded on this sample (S1 and S2 Tables), where signals corresponding to some but not all residues disappear due to self-association. (b) Signals whose intensity is lost over time correspond to the central region spanning residues 311-360 (delimited by blue arrows), which contain an α-helix as evinced by 13 Cα chemical shifts (top panel). In addition, several short segments within the 2 regions N-terminal and carboxyl terminus to the 311-360 core (i.e., 267-310 and 361-414) also exhibit moderate signal loss, which are consistently observed at 55 μM (middle panel) and at 110 μM (bottom panel). This is illustrated by plotting the intensity values from the first 1 H-15 N HSQC (I 0 , black bars, 1 hour from sample preparation) and those from a 1 H-15 N HSQC recorded approximately 20 hours later, plotted as I/I 0 (red bars, I and I 0 correspond to intensity values for the second and first 1 H-15 N HSQC, respectively, normalized to the most intense peak). Note that average values for the 2 regions 267-310 and 361-414 N-terminal and carboxyl terminus to the central segment are shown as dashed horizontal lines (black and red colors for the I 0 and I/I 0 data sets, respectively), which highlights that the regions with lower than the average intensities mostly map to segments containing aromatic residues: Phe (yellow boxes), Tyr (gray box), and Trp (green boxes). (c) TDP-43's PrLD sequence illustrating the central region composed of residues 311-360 (blue lines, to match blue arrows in panel b), which was the focus of some earlier studies and contains the helical segment important for assembly, as well as the aromatic residues outside this core (Phe in yellow, Tyr in gray, and Trp in green). (d) At the end of the NMR experiments, the samples were found aggregated (left panel) and displayed extensive droplet formation (right panel, scale bar: 100 μm). This was imaged on the 110 μM sample. The 15 [13,14]; and that (ii) the aggregation process follows a helix-driven, GAroS-assisted mechanism of phase separation into droplets similar to that reported under droplet induction by salt or RNAs [9][10][11][12]. We interpret these observations as an intrinsic aggregation pathway of TDP-43 PrLD that is solely encoded by its amino acid sequence. Thus, it may well manifest over a broad range of conditions in cells, which would lead to the formation of relevant aggregate species at pH values below those normally considered physiological but relevant for metabolic stress conditions [5] and lysosome interiors [27]. Considering that all the prior work that uncovered the mechanism of droplet assembly did not detect fibrils (i.e., droplets were end point species following TDP-43 PrLD aggregation) [9][10][11][12], possibly since RNA and salt may hinder liquid to solid transitions, we next interrogated whether the droplets we detect could evolve into amyloid-like assemblies.
The TDP-43 PrLD droplet environment outbursts amyloid fibrils at the droplet surface/solvent interface Following the discussion of droplet versus fibrils as end point of distinct aggregation pathways, it has been very recently proposed that droplet assembly in FUS, hnRNPA1, and potentially most archetypical PrLDs, is encoded by its primary sequence as enabled by a symmetric distribution of aromatic residues. More precisely, this refers to aromatic Phe, Trp, or Tyr "stickers" residues separated from each other by stretches of 3 to 12 "spacers" residues, such as Ser or Gly [28]. Within this framework, such symmetric distribution is not fulfilled if the aromatic stickers cluster together in the primary sequence (i.e., without interspaced residues), and this causes irreversible fibrillization instead over droplet formation [28]. We identified such a "stickerand-spacers" symmetric distribution in the GAroS regions of TDP-43 PrLD, which may explain the intrinsic aggregation that we proposed to be encoded solely by its primary sequence. More precisely, the N-terminal region to the α-helix contains 3 Phe "sticker" residues, Phe276, Phe283, and Phe289, which are separated by 2 stretches of "spacers" spanning 6 (F 276 GGPNGGF 283 ) and 5 (F 283 GNQGGF 289 ) residues, respectively ( Fig 1D). Similarly, the GAroS region that is carboxyl terminus to the α-helix contains 6 aromatic residues, which can be grouped in the following sticker-and-spacers pairs: Tyr374-Trp385 (separated by 10 spacers: Y 374 SGSNSGAAIGW 385 ), Trp385-Phe397 (separated by 11 spacers: W 385 GSAS-NAGSGSGF 397 ), Phe397-Phe401 (separated by 3 spacers: F 397 NGGF 401 ), and Phe401-Trp412 (separated by 10 spacers: F 401 GSSMDSKSSGW 412 ) ( Fig 1D). Therefore, our interpretation that the lower peak intensities in the 1 H-15 N HSQC mapping to the GAroS regions resulted from intermolecular interactions driving the monomer to droplet transition ( Fig 1B) fits well within this stickers-and-spacers model for PrLD-containing proteins and is in agreement with the conformational evolution revealed by Trp fluorescence, as the wavelength of maximum Trp emission, which is high during the first 10 hours (approximately 357 nm), red-shifts to approximately 350 nm after 1 day, indicating a transition from a high polar environment toward a somewhat less polar milieu (Fig 2A). In line with these observations, light scattering values were low at the beginning and sharply increased over the next several hours, approximately coinciding with the NMR signal loss (Fig 2A). During this period, the scattering remains steady for several hours and then increases further after 1 day, suggesting the formation of 2 distinct assembled species.
One distinctive feature between TDP-43 PrLD and other PrLDs is the helical region; its tendency to self-aggregate not only reduces the number of "stickers" required for droplet formation [11], but also drives the assembly of amyloid fibrils in distinct aggregation pathways [13,14]. Similar to the light scattering experiment, fluorescence anisotropy measurements suggested the formation of 2 distinct species, particularly, low anisotropy values characteristic of flexibility are observed that the earliest time point (approximately 0.10) and quickly increase to moderate values (approximately 0.14). After 1 day, higher values that are similar to those seen on Trp side chains in folded proteins or rigid complexes are measured (Fig 2A). Interestingly, Thioflavin T (ThT) fluorescence enhancement was only detected when the putative second species form, but not earlier, suggesting that amyloid formation occurred following droplet assembly. Direct visualization by confocal microscopy reveals a strong enhancement of ThT fluorescence in the droplets formed in the 2 samples at 55 and 110 μM (Fig 2B-2D) and, consistent with the data presented, dense networks of fibrils were imaged by transmission electron microscopy (TEM) (Fig 2E). The observation of ThT-reactive droplets in 2 distinct samples that aggregated following the same mechanism, as uncovered by LSNMR (Fig 1B), together with the imaging of the fibrils by TEM and the analysis of the conformational evolution, firmly establishes that TDP-43 PrLD fibril formation is coupled to droplets under these conditions.
The above results evoke the recent work by Gui and colleagues, who used ThT fluorescence and confocal microscopy to provide direct visual evidence that fibrils from hnRNPA1another protein with a PrLD-formed from induced protein droplets [16]. A close inspection to both hnRNPA1 in Gui and colleagues' paper and TDP-43 PrLD in the present work revealed a critical difference: Whereas ThT fluorescence shines from the droplet interior in hnRNPA1 aggregates, indicative of fibrils confined within the droplet interior, the TDP-43 PrLD fibrils seems to be present at the surface, but not inside the droplet. One distinctive advantage of confocal microscopy is the possibility of scanning the entire volume of a droplet,

PLOS BIOLOGY
TDP-43 fibrils at the condensate/solvent interface feature Phe residues slice by slice, and this feature allowed us to look in closer detail at TDP-43 PrLD droplets. In Fig 3, we provide direct visualization that fibrils are in the surface and not in the interior of the droplet. Furthermore, they seem to detach upon reaching a critical size (Fig 3A and 3B).

The SSNMR signature of TDP-43 PrLD fibrils reveals an amyloid core that builds on aromatic stickers
We have shown that, at low pH, TDP-43 PrLD has an intrinsic ability to assemble droplets by virtue of its central region and the GAroS segments with aromatic stickers and spacers ( Fig  1A-1C). We have then shown that this intrinsic aggregation pathway couples droplet formation with the accumulation of amyloid fibrils at the droplet surface/solvent interface. What structural factors drive the conversion of fibrils? During droplet assembly, Trp-Gly motifs are the most important residues in assisting the helical region [9,11]. In fibrils, previous cryo-EM and SSNMR studies with the TDP-43 PrLD (311 to 360) peptides proved that the helical region suffices to form amyloid fibrils [15,19]. However, these peptides lack the GAroS regions (residues 267 to 310 and 361 to 414) and do not inform on their participation in the fibrillar structures, as they do in droplets [9,11] (Fig 1D). Thus, we ought to explore whether the symmetrically distributed aromatic residues within the TDP-43 PrLD GAroS would form part of the fibril core, by taking advantage of their unique structural signature in the NMR spectra.
We analyzed the aggregated material that became invisible to LSNMR by means of SSNMR. Although full structure determination by SSNMR requires multiple samples with different labeling schemes and in large quantities [29], we could reconstitute a modest amount of the fibril conformation that formed at the droplet/solvent interface. Cross-Polarization Magic Angle Spinning (CPMAS)-based experiments are sensitive to molecular mobility, allowing us to scrutinize the presence of aromatic residues immobilized within the fibril core. Fig 4A shows a 13 C-13 C spectrum recorded under appropriate conditions to ensure detection of sequential residues (i.e., j and j+1) that remain static and rigid. No Trp residues could be detected in the characteristic region corresponding to aromatic 13 C nuclei (i.e., no indole side chains are observed). This result is curious because Trp residues are key to droplet formation by the TDP-43 PrLD [9,11], such that their mutation to Gly abolish droplet initiation [11]. One potential explanation could be that Trp residues are exposed in the fibril, and not fixed in the core interior, such that their indole rings are on average less rigid and do not contribute strong signals in CPMAS-based experiments. However, the most striking observation is that we identified sequential contacts that correspond to several Phe/Tyr and Gly residues (Fig 4A), which unambiguously indicate that these motifs are immobilized within a fibrillar core containing β-rich segments, as evinced by distinctive cross-peaks in the Cα/Cβ serine region ( Fig  4A). This is reminiscent of structural models of other PrLD-containing fibrils, in which the amyloid core is formed by Ser/Gly/Aromatic-rich motifs [30]. Unlike Trp residues, mutation of Phe residues to Gly did not abolish droplet initiation, as long as the Trp residues remained unchanged [11]. This observation indicates that Phe-Gly motifs have distinct roles in the droplet and the fibril forms. While a structure determination is still in progress, for which we are optimizing the production of larger amount of samples with distinct labeling schemes, the 1D slices of the 2D data set already indicate that the signal from Phe/Tyr aromatic 13 C, Gly's 13 Cα, and Ser's 13 Cβ nuclei are unambiguously detectable under our experimental conditions ( Fig   Fig 4. Fibrils from the PrLD of TDP-43 are stabilized by Phe-Gly motifs. (a) Two-dimensional 13 C-13 C DARR spectrum recorded with a 250 ms mixing time on fibrils formed by the PrLD of TDP-43. This experiment detects residues immobilized within the fibril core. Fibrils are rich in β-sheet content, as indicated by the presence of Ser residues showing differences of 13 Cβ-13 Cα > 8 ppm (black boxes). Cross-peaks between Ser and Gly (black ellipses) indicate that Ser-Gly motifs, which are located outside the helical region, are relevant in the fibril form. Signals around 130-140 ppm (yellow shaded region) correspond to aromatic 13 C from Phe/Tyr residues. Considering that there is only 1 Tyr in the PrLD sequence, most cross-peaks arise from Phe ( 13 Cδ/ 13 Cε)-Gly ( 13 Cα) (black circles). (b) Horizontal slices extracted from the 250 ms DARR shown in a, taken at δ 1 = 27.03 ppm (in black), which illustrates a crowded region that covers aliphatic, aromatic, and carbonyl peaks; δ 1 = 45.07 ppm (in green), which is distinctive of Gly 13 Cα chemical shift values, illustrating signals contributed by aromatic 13 C nuclei from Phe/Tyr, as well as by 13 Cβ of Ser, as identified by their distinctive chemical shift values. This supports the presence of immobilized Gly, Ser, and Phe residues in the fibril core; and δ 1 = 49.69 ppm (in blue), which lacks strong resonances thus facilitating an appreciation of the overall signal-to-noise of the SSNMR data. (c) A short mixing time (20 ms) 2D 13 C-13 C DARR spectrum (in cyan) overlapped onto the 250-ms spectrum from panel a (in red). At this short mixing, only intraresidue contacts are detected, which supports the assignment of sequential contacts that build up at larger mixing times. All these spectra are available from Mendeley

PLOS BIOLOGY
TDP-43 fibrils at the condensate/solvent interface feature Phe residues 4B). When the same experiment is recorded with a shorter mixing time, only intra-residual correlations are observed, supporting the build-up of sequential connectivities as the mixing time is increased (Fig 4C). The GAroS regions of TDP-43 PrLD contains 6 Phe-Gly motifs, 3 of which exist as Phe-Gly-Ser triads (Phe367, Phe397, and Phe401) (Fig 1C). There is also just 1 Tyr residue (Tyr374), which appears as Tyr-Ser-Gly (Fig 1C). Tentatively, it seems reasonable to speculate that these motifs may be part of the structural core. Along this line, a recent cryo-EM structural model of fibrils from the entire PrLD of TDP-43 formed under conditions akin to those of the present work was posted in bioRxiv [31] after the submission of this manuscript. Interestingly enough, this cryo-EM study provides direct visual evidence that all Phe residues from the GAroS segment are present in the structure, as proposed here based on the unambiguous observation of various cross-peaks corresponding to Phe residues in the aromatic region of the 13 C-13 C Dipolar Assisted Rotational Resonance (DARR) spectra (Fig 4). To further test the role of Phe residues within and outside the central region spanning residues 311 to 360, 2 TDP-43 PrLD variants were generated. In 1 variant, the 2 central Phe residues were substituted to Ala residues, that is F312A+F316A. In the other, all 6 Phe in the GAroS were mutated to Ala, that is F276A+F283A+F289A+F367A+F397A+F401A. When incubated in the presence of seeds from the SSNMR samples, the F313A+F316A variant formed aggregates, while the hexa-mutated F276A+F283A+F289A+F367A+F397A+F401A variant did not (S1 Fig). This result unambiguously indicates that whereas the double mutation can be tolerated and incorporates into the fibril, the hexa-mutated variant cannot. This observation corroborates the creation of hydrophobic cores by Phe residues in the GAroS segments, highlighting the relevance of these motifs beyond the central helix in forming amyloid fibrils by the PrLD of TDP-43.

Discussion
To date, several groups have presented studies of intrinsically disordered protein domains undergoing phase separation, wherein the microdroplets formed eventually evolve into amyloids [14][15][16][17][18]. Until now, it has been generally assumed that the residues responsible for forming the microdroplet are also those that become amyloid [13,15,32]. Here, we have combined LSNMR and SSNMR to obtain atomic level information on both processes. In contrast to previous expectations, we have uncovered that certain residues are responsible for droplet formation, whereas others are key for amyloidogenesis, including aromatic "sticker" residues proposed to enable reversible droplet formation and not fibrillization [28]. This discovery has important implications for our fundamental understanding of these processes. Excitingly, it means that we should be able to develop inhibitors or modulators that target the residues that are key for one process (i.e., the Phe residues which promote amyloidogenesis), without affecting the other residues which promote physiological droplet formation to confront the stress situation. This is also supported by the observation that TDP-43 PrLD also forms droplets that do not evolve into fibrils [9][10][11][12]. Importantly, these features could not be recruited in model fibrils from the 311 to 360 segments that exclude the GAroS segments harboring the Phe-Gly motifs (residues 267 to 310 and 361 to 414).
We postulate that the fibril conformation that forms at pH 4 may represent a pro-pathological structure resulting from the persistence of a droplet state and that it may exist over a broad range of conditions based on the low content of titratable residues in the TPD-43 PrLD. This hypothesis is sustained by an intrinsic mechanism of droplet assembly, which remains essentially identical at pH 7 and in the presence of salts or RNA, and at pH4 without salts or RNA. Recently, Shenoy and colleagues carried out SSNMR measurements on TDP-43 PrLD fibrils assembled at pH 7.5 [33] and, interestingly, they did not report the Ser, Gly, and Phe/Tyr signals that we detect under similar mixing time conditions in our 2D 13 C-13 C spectra from fibrils assembled at pH 4 (Fig 2A). This suggests the reconstitution of alternative structures in the 2 studies, which we explain based on the droplet stage reported, rather than different pH conditions. The droplet environment enabled fibrillization at the droplet surface/solvent interface, and such fibril form was amplified by seeding soluble protein, as this strategy is commonly used to propagate selected amyloid conformations [34][35][36]. Shenoy and colleagues used a 4-mm rotor in their SSNMR study, for which they successfully prepared a large amount of fibrillar material, but did not inform the formation of droplets [33]. This is reminiscent of the different outcomes observed in cryo-EM and SSNMR studies on fibrils from the TDP-43 PrLD (311 to 360) segment, in which the fibrils that form from droplets revealed spectral signatures not compatible with the cryo-EM structures [15,19]. These observations in aggregate support the view that TDP-43 PrLD participates in distinct aggregation pathways, with different end point species that include (i) droplets without fibrils [9][10][11][12]19]; (ii) fibrils without droplets [13,33]; and (iii) droplets and fibrils [14,15], as interpreted with the data informed to date. The droplet to fibril transitions are of potential relevance, as cytoplasmic TDP-43 droplets were shown to sequestrate components of the nuclear pore [37] and co-aggregate formation between TDP-43 and nuclear pore components emerged as a hallmark of ALS/FTD [38,39], and they may well be hybrid amyloids similar to the RIPK1-RIPK3 necrosome core [29]. Amid these diverse findings, ours are the first to directly link droplets to amyloid formation by combining LSNMR and SSNMR. Deciphering the structural features of the various TDP-43 aggregate species in diverse contexts [3,21], such as the Phe-Gly motifs shown here to create a fibril core, is an important step toward understanding physiological versus pathological aggregation.

Protein expression and purification
The N-terminally hexa-His-tagged PrLD construct corresponding to TDP-43(267-414) was a gift from Professor Nicolas Fawzi (Addgene plasmid # 98669). Proteins were overexpressed in Escherichia coli BL21 Star (DE3) (Life Technologies Carlsbad, California, United States of America). Uniform 15 N and 15 N/ 13 C labeling of TDP-43(267-414) was achieved by overexpression in M9 minimal media supplemented with 15 N ammonium chloride and 13 C glucose as the sole sources of nitrogen and carbon, respectively. E. coli BL21 Star (DE3) carrying TDP-43(267-414) were pre-inoculated overnight at 37˚C. The day after, 200 ml of M9 minimal media were added to 4 mL of pre-inoculum, and the cell growth was monitored using Biophotometer D30 (Eppendorf, Hamburg, Germany) until the growth rate was 0.8 (OD at 600 nm). Then, the protein expression was induced overnight at 30˚C by adding 1 mM IPTG.
Cells were collected by centrifugation at 7,000 × g for 1 hour 4˚C, and the resulting pellet was resuspended in 30 mL lysis buffer (50 mM NaH 2 PO 4 , 300 mM NaCl, 10 mM Imidazole pH 8.0) supplemented with 1 tablet EDTA-free protease inhibitor cocktail (Roche Diagnostics Basel, Switzerland) and 1 mM PMSF (Sigma-Aldrich, St Louis, Missouri, USA). Cell lysis were performed using EmulsiFlex-C3 (Avestin Europe GmbH, Mannheim, Germany) and Bioruptor UCD-200 (Diagenode, Belgium). Inclusion bodies were recovered by centrifugation of cell lysate at 7,000 x g for 1 hour at 4˚C and were resuspended in 20 mL denaturing binding buffer (20 mM Tris-Cl, 500 mM NaCl, 10 mM Imidazole, 1 mM DTT, 8 M urea, pH 8.0). Further centrifugation at 7,000 × g for 1 hour at 4˚C was performed in order to pellet any remaining insoluble cell debris.
Ni-NTA Agarose beads (Qiagen, Gaithersburg, Maryland, USA) were used to bind the target protein and then were washed 3 times in 15 mL of denaturing binding buffer, and each washing step was interspersed with centrifugation at 500 × g for 5 minutes, room temperature (RT). Beads were then incubated with cell lysate for 4 hours rotating and successively collected by centrifugation at 500 × g for 5 minutes, RT. Finally, beads were washed 3 times in 2 ml of denaturing binding buffer, and protein was eluted with imidazole gradient buffers, as described below: Each elution step was interspersed with centrifugation at 500 ×g for 5 minutes, RT, and from the total elution pool, approximately 3 mg of protein are obtained.
The pH of the samples from the TDP-43 PrLD constructs was then lowered to 4.0, and applied to a PD-10 gel filtration column (GE Healthcare, Little Chalfont, UK), which had been previously pre-equilibrated with the LSNMR buffer: 1 mM CD 3 COOD in 85/15 H 2 O/D 2 O. The eluted fractions containing the PrLD construct, as identified by UV absorbance, were then concentrated to approximately 550 μL using an ultrafiltration device (Amicon, Danvers, Massachusetts, USA). The final urea concentration was less than 12 mM as determined by refractive index.

Choice of the best LSNMR assignment strategy
The TDP-43 PrLD is intrinsically disordered and low complexity (i.e., high redundancy in the amino acid sequence), and therefore poorly dispersed amide 1 H peaks and clustering of the 13 Cα and 13 Cβ chemical shifts around the random coil values are anticipated to complicate the assignment process. On the other hand, methods based on 13 C detection exploiting that 15 N and 13 CO signals remain well dispersed in disordered proteins cannot be applied to the TDP-43 PrLD, as these methods require high protein concentration conditions to overcome the low 13 CO sensitivity, under which TDP-43 PrLD quickly aggregates. With these limitations ahead, we resorted to 1 H-detected methods that mainly consist in connecting 2 consecutive NH groups through their correlations with 1 or more of the 13 C spins located between them: 13 Cα, 13 Cβ, and 13 CO. Due to the limitations mentioned earlier, we reasoned that six 3D experiments would be required to unambiguously obtain backbone chemical shifts, namely (i) HNCO; (ii) HN(CA)CO; (iii) HNCA; (iv) HN(CO)CA; (v) CBCA(CO)NH; and (vi) HNCACB. Recording this set of experiments with an acceptable resolution entails an average time of approximately 90 hours, as illustrated in S1 Table, which is by far too long for the TDP-43 PrLD to remain in solution due to its high aggregation tendency.
We then reasoned what would be the best strategy to accomplish the NMR assignments of our protein, ruling out the use of non-uniform sampling (NUS) schemes as the low concentration conditions the percentage of sampling that would be reduced may not represent a great time gain. With all these drawbacks, we proposed using a different strategy in which just three 3D experiments enabled unambiguous assignments of all signals in the 1 H-15 N HSQC spectrum of TDP-43 PrLD (S2 Table). The approach consists of 2 experiments, 1 H(NCOCA)HN and 1 (H)N(COCA)NH, that allowed us to directly connect consecutive amide groups from all signals in the 1 H-15 N HSQC. Using the information from a CBCA(CO)NH experiment, the spin systems are well defined through the 13 Cα and 13 Cβ chemical shifts. With this strategy, experiments are acquired in a time of approximately 55 hours, that is to say, 37.5% less than the conventional triple resonance strategy described above. These experiments were recorded on the 110 μM sample, along with an additional HNCA to corroborate the assignments and to obtain the 13 CA chemical shifts for the 4 residues preceding prolines (P280, P320, P349, and P363) and for the last residue M414. The experiment time of the HNCA was 5.5 hours (S2 Table), which still represent a reduction of 31% with respect to the conventional approach listed in S1 Table. Finally, a CC(CO)NH experiment was recorded on this sample (experiment time 22.5 hours) to assign side chain 13 C nuclei from Gln, Arg, Lys, Pro, Ile, and Leu residues. After these approximately 3.5 days of NMR measurements, some signals in the 1 H-15 N HSQC are missing, and the sample was aggregated. To report the full set of backbone assignments, the 13 CO and 1 Hα chemical shifts were obtained by recording a HNCO (5.5 hours) and a HBHA(CO)NH (33.2 hours) on the 55 μM sample. The latter experiment also provides the assignments for 1 Hβ nuclei.

LSNMR experiments
The above LSNMR experiments were collected at pH 4.0 and 298 K on a Bruker AV-800 US2 (800 MHz 1 H frequency) spectrometer equipped with a cryoprobe and Z-gradients. The 2 samples (55 and 110 μM) were prepared by concentration of the eluted fractions to approximately 550 μL, from which 300 μL of were transferred into D 2 O-matched, 5 mm Shigemi tubes. To obtain the 13 CO, 13 Cα, 13 Cβ, 15 N, 1 HN, 1 Hα, and 1 Hβ NMR assignments following the strategy presented in the previous section, we started from a root 1 H-15 N HSQC spectrum, which served as the first spectrum to carry out the peak intensity analyses in Fig 1. This experiment was measured again after approximately 20 hours with the following parameters: 4 scans, 12 and 20 ppm as spectral widths for 1 H and 15 N, respectively, and transmitter frequency offsets of 4.75 and 116.5 ppm for 1 H and 15 N, respectively. The remaining experiments to obtain chemical shifts to be deposited in the BMRB were recorded with the following parameters (see also S1 and S2 Tables Proton chemical shifts were directly referenced using DSS on a TDP-43 PrLD sample prepared for this purpose, and 13 C and 15 N chemical shifts chemical shifts were referenced indirectly. Conformational chemical shifts, Δ( 13 Cα), where calculated as δ 13 Cα(exp)-δ 13 Cα(ref), with δ 13 Cα(exp) being our measured chemical shifts and δ 13 Cα(ref) reference values obtained using the sequence of the TDP-43 PrLD construct and the data compiled by Poulsen, Dyson, and colleagues, at 25˚C and pH 4, as implemented in the chemical shift calculator (http://spin. niddk.nih.gov/bax/nmrserver/Poulsen_rc_CS) [40][41][42].
All spectra were processed using either NMRPipe [43] or Topspin 4.0.8 (Bruker Biospin, Germany), and peak-picking and spectral assignment was conducted using NMRFAM-Sparky [44]. The NMR chemical shifts are deposited in the BMRB under accession code 50154.

SSNMR experiments
SSNMR experiments were recorded on the aggregated material that formed following droplet formation during our LSNMR studies, which was amplified by seeding soluble protein prepared as described above. As a reference, from the most concentrated LSNMR samples (110 μM), the 300 μL from the Shigemi tube contains approximately 600 μg of material, assuming that 100% of the protein converted into fibrils when no signal is detected. After 3 weeks of incubation, the supernatant was carefully separated from the aggregated material settled on the bottom of the Eppendorf, which was lyophilized and transferred into a rotor using Bruker MAS rotor tools. The experiments were conducted on a Bruker 17.6 T spectrometer (750 MHz 1 H frequency) using an HCN 1.3 mm MAS probe. Moreover, 2D 13 C-13 C DARR spectrum [45] were recorded with mixing times of 20 and 250 ms, spinning the rotor containing the sample at a MAS rate of 17 and 20 kHz, respectively. The longer, 250-ms mixing time seemed appropriate to observe cross-peaks corresponding to sequential (j to j+1) residues, to detect correlations between Gly and Phe, Trp or Tyr, based on the characteristic chemical shifts of the Gly 13 Cα and 13 Caro chemical shift values distinctive for each type of aromatic residue. The spectra were recorded with the acquisition parameters listed in S3 Table, 64 scans, using a spectral width of 220.9 ppm in the direct and indirect dimensions, with acquisition times of 20 and 2.5 ms, respectively, and setting the transmitter frequency offset to 100 ppm. The DARR spectrum was processed using Topspin 4.0.8, with chemical shifts referenced to DSS.

Visualization of amyloid fibrils and protein droplets
Liquid droplets were directly visualized by spotting aliquots of the 2 (55 and 110 μM) samples onto glass coverslips using a Leica TCS SP2 inverted confocal microscope (Wetzlar, Germany) equipped with 7 laser lines, by both transmitted light ( Fig 1D) and fluorescence imaging ( Fig  2B). The latter were obtained by addition of 1 μL of 1mM ThT and laser excitation at 457 nm, following the protocol by Gui and colleagues [16]. To confirm that the ThT-reactive species were amyloids, the samples were directly adsorbed onto carbon-coated 300-mesh copper grids and negatively stained by incubation with 2% uranyl-acetate for direct visualization by TEM on a JEOL JEM-1011 electron microscope equipped with a TVIPS TemCam CMOS. Images acquired at a magnification of 30,000x and an accelerating voltage of 1,000 kV.

Fluorescence measurements
Fluorescence measurements were performed using a photon-counting Jobin-Yvon Fluoromax-4 spectrofluorimeter equipped with emission and excitation polarizers and a Peltier temperature control device. Fluorescence spectra of TDP-43 PrLD were measured at 25˚C using a 280-nm excitation wavelength and recording emission from 270 nm to 400 nm with a scan speed of one half second per nm and excitation and emission slit widths of 2 nm. Using this approach, the 90˚light scattering can be obtained as the apparent emission at 280 nm, while the Trp maximum emission wavelength and intensity provides information on how polar the indole fluorophore's environment is. Steady state anisotropy (A), measured at the wavelength of maximal emission, was calculated as where VV is the steady state emission where both the excitation and emission polarizers are set to vertical (0˚), but the emission polarizer is set to horizontal (90˚). Values of anisotropy typically range from <0.10 for mobile protein Trp indole groups in statistical coils to >0. 20 for Trp fixed in the rigid core of a folded protein [46].

Cross-seeding experiments
The wild-type TDP-43 PrLD (residues 267 to 414), as well as the 2 mutant variants were desalted from the 8 M urea-containing solution as explained above using a PD-10 column. A total of 5 μM of each protein variant in 1 mM CD 3 COOD (NMR buffer) with 40 μM ThT was assayed for their ability to cross-seed with the fibrillar aggregates studied by SSNMR, by adding 2% of seeds to the reaction (0.12 μM). The seeding reactions were measured at 25˚C with excitation and emission wavelengths of 450 nm and 480 nm, respectively.

S1 Fig. Cross-seeding experiments.
Cross-seeding in the polymerization into amyloid fibrils of the WT with respect to a double mutant, F313A+F316A, and a hexa-mutated construct, F276A+F283A+F289A+F367A+F397A+F401A, was studied by ThT binding assays. The distinct columns represent the following samples: column 1 is a blank, consisting of ThT (40 μM) in 1 mM CD 3 COOD (blank), and columns 2 and 3 are 2 independent samples consisting of 2% (0.12 μM) of seeds (amyloid fibrils packed in the SSNMR rotor). The next 3 columns (4, 5, and 6) correspond to 3 independent samples from the WT TDP-43 PrLD (1 without and 2 with 2% seeds). In columns 7, 8, and 9, the results for the hexa-mutant (F276A+F283A+F289A +F367A+F397A+F401A, denoted as 6F > A) are presented (1 without and 2 with 2% seeds). Finally, columns 10, 11, and 12 correspond to the double mutant F313A+F316A (1 without and 2 with 2% seeds). ThT is at 40 μM in all instances, and the final protein concentration is always 6 μM. Seeds from the WT cannot induce amyloid formation in the 6F > A mutant protein, as revealed by the negligible ThT enhancement (p-values between "a" and "b" or between "b" and "c" <0.0001). In contrast, F313A+F316A is efficiently cross-seeded in the presence of 2% of seeds (p-value between "a" and "c" of 0.8224). Each independent sample is measured 5 times, and the corresponding numerical data can be found in S1 Data. PrLD, prion-like domain; TDP-43, Transactive response DNA-binding Protein of 43 kDa; ThT, Thioflavin T; WT, wild-type. (TIFF) S1 Table. Typical experiments for the unambiguous assignment of low-complexity, disordered proteins. Mompeán.