Structural and biochemical studies of an NB-ARC domain from a plant NLR immune receptor

Plant NLRs are modular immune receptors that trigger rapid cell death in response to attempted infection by pathogens. A highly conserved nucleotide-binding domain shared with APAF-1, various R-proteins and CED-4 (NB-ARC domain) is proposed to act as a molecular switch, cycling between ADP (repressed) and ATP (active) bound forms. Studies of plant NLR NB-ARC domains have revealed functional similarities to mammalian homologues, and provided insight into potential mechanisms of regulation. However, further advances have been limited by difficulties in obtaining sufficient yields of protein suitable for structural and biochemical techniques. From protein expression screens in Escherichia coli and Sf9 insect cells, we defined suitable conditions to produce the NB-ARC domain from the tomato NLR NRC1. Biophysical analyses of this domain showed it is a folded, soluble protein. Structural studies revealed the NRC1 NB-ARC domain had co-purified with ADP, and confirmed predicted structural similarities between plant NLR NB-ARC domains and their mammalian homologues.


Introduction
Plants mount sophisticated responses to attack by pathogens following the detection of specific microbe-derived signals. General responses including callose deposition [1], production of reactive oxygen species [2], and upregulation of defence-specific genes [3] can be triggered by the detection of conserved microbe-derived pathogen-associated molecular patterns (PAMPs) such as fungal chitin [4,5] or bacterial flagellin [6,7] at the cell surface.
Such responses are generally effective at preventing or reducing colonisation of plants by non-adapted microbes, however they are less effective against specialised plant pathogens. Various sub-populations of plant pathogens have evolved to infect specific cultivars of a given host plant in a gene-for-gene manner [8,9]. A key feature of biotrophic or hemi-biotrophic plant pathogens are translocated effector proteins that enter the host cell, typically to subvert cell surface-based immune responses [10][11][12][13]. Plants have in turn evolved intracellular receptors capable of detecting the presence [14][15][16] or activity [17,18]

Defining the boundaries of the NRC1 NB-ARC domain
We used several bioinformatics resources to help define the boundaries of the NB-ARC domain from the canonical CNL NRC1 [50], with the aim of producing soluble, stable protein (Fig 1). A putative NB-ARC domain spanning residues 165 to 441 was identified by Pfam [51,52]. However, this region did not include the conserved MHD motif (valine-histidineaspartate, VHD, in NRC1) towards the C-terminus of the domain, a region known to be functionally relevant [34,35,49,50,53]. Therefore, we extended our analyses to include a combination of LRR-specific bioinformatic searches (LRR Finder [54] and LRRsearch [55]), secondary structure prediction (Phyre2 [56]), and disorder predictions (RONN [57]) to define a C-terminal boundary of the NRC1 NB-ARC domain. Based on the results of these searches, we assigned the NRC1 NB-ARC domain C-terminus at residue 494. The secondary structure predictions also suggested that truncating the protein at position 165 (putative N-terminus) could disrupt an α-helix. With this information, we re-assigned the NRC1 NB-ARC domain N-terminus at position 150, within a predicted disordered region between the CC and NB-ARC domains.

Expression and purification of wild-type NRC1 NB-ARC domain
Based on our bioinformatic analysis, a construct spanning NRC1 residues 150 to 494 was cloned into both the pOPIN-F and pOPIN-S3C expression vectors [60], which allow for protein expression with a cleavable N-terminal 6xHis tag or N-terminal 6xHis-SUMO (small ubiquitin-like modifier protein) tag respectively. We used the high-throughput capacity of Oxford Protein Production Facility, UK, to screen for expression/purification of the NRC1 NB-ARC domain in various E. coli strains/media and in insect cells. We found the best conditions for soluble expression used pOPIN-S3C in E. coli strain Lemo21(DE3), grown in PowerBroth media, and baculovirus-mediated expression in Sf9 cells using a pOPIN-F construct. Using a combination of immobilised metal ion chromatography (IMAC) followed by size-exclusion chromatography (SEC) we were able to purify a protein of molecular weight consistent with the 6xHis SUMO-tagged NRC1 NB-ARC domain from E. coli lysates (Fig 2A). Removal of the N-terminal tag (with 3C protease) followed by subtractive IMAC and a second round of SEC resulted in protein at high purity ( Fig 2B). A similar purification procedure for 6xHis-tagged protein expressed in Sf9 cells also yielded pure NRC1 NB-ARC protein (Fig 2B).
Having identified that baculovirus-mediated expression in Sf9 cells was able to generate soluble wild-type protein, we returned to insect-cell expression to screen NB-ARC mutant constructs using the pOPIN-F vector. With these screens, we could produce the mutant D268N (Walker B motif), and purify it to homogeneity in yields similar to the wild-type protein ( Fig  2B). The other mutants were recalcitrant to soluble expression/purification.

Biophysical characterisation of the NRC1 NB-ARC domain
To investigate the biophysical properties of wild-type NRC1 NB-ARC domain, and the NB-ARC D268N mutant, we used three approaches: analytical gel filtration, circular dichroism, and thermal denaturation.
Firstly, we used analytical gel filtration and showed that both wild-type protein (produced in E. coli or insect cells) and the D268N mutant behave as monomers in solution, and elute at a retention volume consistent with their predicted molecular mass of 40kDa (Fig 3A and 3B).
Secondly, we assessed NRC1 NB-ARC domain folding by circular dichroism (CD). Each wild-type sample gave a spectrum consistent with a well-folded protein, possessing mixed secondary structure ( Fig 3C, Table 1). Disruption of the Walker B motif (with the NRC1 NB-ARC D268N mutant) had little effect on protein secondary structure as measured by CD ( Fig 3C, Table 1).
Finally, we used differential scanning fluorimetry (DSF) to assess the thermal stability of the three proteins. In this assay, an increase in Sypro orange fluorescence is associated with a transition from a folded protein to denatured state as internal hydrophobic surfaces are exposed. For all three NB-ARC proteins, a single inflection point for change in fluorescence was observed at~50˚C, consistent with a single unfolding event ( Fig 3D, Table 1).
Taken together, these data are consistent with the wild-type NRC1 NB-ARC domain, and D268N mutant, adopting a stable, folded state in solution.

NRC1 NB-ARC domain purifies in an ADP-bound form
The full length flax NLR protein M, and the barley NLR MLA27, were both shown to co-purify with ADP as a bound nucleotide, with a small fraction of MLA27 bound to ATP [29,40]. Individual NB-ARC domains from animal systems have also been shown to co-purify with nucleotides, but these vary between ADP and ATP bound states [42,43,65]. We employed a luciferase-based assay ([40], also as described in the Materials and Methods) to quantify the relative occupancy of these nucleotides in the purified NRC1 NB-ARC domain. Using this assay, we were unable to measure any ATP co-purifying with the protein. We then converted any ADP present to ATP using pyruvate kinase, and used this as a proxy for bound ADP. As no ATP was bound to the protein, we express ADP occupancy as a percentage, calculated from relative protein/nucleotide concentrations. These data show that all three NRC1 NB-ARC proteins co-purified with ADP with high occupancy (Fig 4). Based on the higher affinity of APAF-1 for ATP over ADP [66], and ATPase activity displayed by other NB-ARC proteins, we expect this high ADP occupancy is due to intrinsic ATPase activity of NRC1 NB-ARC domain. Surprisingly, NRC1 NB-ARC D268N , a mutant designed to abolish ATPase activity, also co-purified with ADP suggesting either preferential binding of ADP over ATP, or residual ATPase activity.

NRC1 NB-ARC domain shows structural similarities to APAF-1
We generated a selenomethionine (SeMet) derivative of the NRC1 NB-ARC domain, which readily crystallised to form long, tapered rods. Crystals were flash frozen in liquid nitrogen and X-ray data collected at the Diamond Light Source (UK). The resulting data were processed to  3.03 Å resolution. Isomorphous native crystals of the NRC1 NB-ARC domain were generated, and X-ray data collected to 2.5 Å resolution (Table 2). Structure solution (by single wavelength anomalous dispersion (SAD)) and auto-building using the Phenix autosol pipeline, with subsequent phase extension to the high-resolution dataset, yielded an interpretable electron density map. The initial autobuilding placed 171 residues in 15 fragments. Iterative manual rebuilding and refinement resulted in a final structure consisting of 252 of a total 346 residues present in the expressed protein construct. Several regions of the protein could not be modelled due to discontinuous electron density, including much of the ARC1 subdomain, and the solvent-exposed α-helices of the NB domain. Submission of this model to the DALI server [67] returns APAF-1 as the major structural homologue, despite sharing little (24%) sequence identity ( Fig 5C). Side-by-side comparison of the NRC1 NB-ARC domain structure with the corresponding region from APAF-1 highlights the overall structural similarity in subdomain organisation (Fig 5A and 5B). Structural and biochemical studies of an NB-ARC domain from a plant NLR immune receptor

ADP is bound in the nucleotide binding pocket of NRC1 NB-ARC domain
The electron density within the nucleotide-binding pocket of the NRC1 NB-ARC domain, and the surrounding regions, was well resolved and allowed us to unambiguously position a molecule of ADP, and the side-chains of interacting residues. These included lysine 191 of the Walker A (P-loop) motif, aspartate 267 and 268 of the Walker B motifs, and histidine 480 of the MHD motif ( Fig 6A).
Using PDBe PISA [69], we defined a network of side-chain and main-chain interactions between the bound ADP and the protein (Fig 6B, Table 3). This network includes lysine 191 (P-loop motif) and histidine 480 (MHD motif), as well as hydrophobic interactions from the GLPL motif, and other surrounding residues, resulting in a ligand-binding pocket with over 90% of the ADP surface area buried within the protein. The result of this extensive interaction surface is a binding pocket partially occluded by the positioning of the ARC2 subdomain, suggesting structural rearrangements will be required to allow nucleotide release ( Fig 6C). Despite some regions of the refined electron density being well-resolved, the crystallographic R-factors for the final model remained high compared to other structures determined at similar resolutions. To help validate our model, we used the X-ray data set collected from SeMet-labelled protein crystals to generate an unbiased, experimentally-derived electron density map into which we docked the final model. We observed excellent correspondence of the docked model with this map (Fig 7). We were also able to position methionine residues at the experimentally determined heavy atom sites. Finally, electron density in this map confirmed the position of the bound ADP.  The expression, purification, and initial biophysical characterisation of the NRC1 NB-ARC domain demonstrated it was suitable for further biochemical and structural experiments. Using this purified NRC1 NB-ARC domain we have confirmed the hypothesised structural homology between a plant and animal NB-ARC domains [74]. The similarity of the NRC1 NB-ARC domain model to the equivalent region of APAF-1 validates the previous use of APAF-1 crystal structures to model and explain plant NLR mutant phenotypes [39, 49,53].

Discussion
Analysis of the nucleotide binding pocket of the NRC1 NB-ARC domain shows the presence of many residues from conserved NTPase motifs (Walker A, Walker B), as well as conserved NLR motifs that have established roles in protein regulation (GLPL, hhGRExE [49], and MHD). Further, the structure confirms that the NRC1 NB-ARC domain co-purifies with ADP suggesting that, like their animal counterparts, plant NLRs have high affinity for adenosine ligands and likely have either intrinsic ATPase activity, or have a binding preference for di-phosphate over tri-phosphate ligands. Low rates of ATP hydrolysis, and difficulties detecting ATPase activity, have previously been observed with studies of other NB-ARC proteins  Structural and biochemical studies of an NB-ARC domain from a plant NLR immune receptor [40, 66,75], and thus may constitute a regulatory mechanism for this family of proteins. As the NRC1 NBARC domain co-purifies with bound ADP, and not as apo-protein, it has not been possible to quantify ligand binding affinities. Most likely NB-ARC domains will be inherently unstable as apo-proteins, or will require additional factors (such as the presence of additional NLR domains, or external factors) to trigger ADP release.
The NRC1 NB-ARC domain structure in the vicinity of the bound ADP is unambiguously defined by the electron density, as are the relative orientations of the NB, ARC1 and ARC2  domains. However, due to the presence of other unresolved regions in the model, we suggest caution in interpreting additional details of the structure. Nevertheless, this limitation does not prevent analysis of the oligomerisation state of the protein in the crystal. In animal NLRs, NB-ARC domains form key interfaces with each other, contributing to the formation of inflammasomes [76,77]. Here, the NRC1 NB-ARC domain behaves as a monomer in solution and packing in the crystal lattice does not suggest a functionally relevant oligomerisation state. It is likely that other NLR domains, or structural rearrangements on activation, are necessary for NB-ARC domain oligomerisation. We have identified methods for producing a soluble, stable plant NLR NB-ARC domain to allow in vitro investigation of its properties. This resource, and our data, demonstrates how structural and biochemical studies can improve our knowledge of NB-ARC domain functions. Ultimately, in vitro studies of full-length plant NLRs will be required to place these new findings in a more biological context, and give a more global understanding of plant NLR regulation.

Construct design
Initial bioinformatics analysis to identify a likely NB-ARC domain was performed by submitting the NRC1 ORF to the Pfam server [51,52]. The extent of the LRR and coiled-coil domains were predicted using LRR Finder [54] and LRRsearch [55], and CCHMM [58] and MAR-COILS [59]. Prediction of secondary structure elements used PHYRE2 [56] and was supplemented with disorder predictions using RONN [57]. Sequence comparison between the NRC1 NB-ARC domain and structurally analogous region of APAF-1 was performed using ESPript [88], with the structure of APAF-1 (PDB 1z6t) manually trimmed using COOT [78].

Cloning
Wild-type NRC1 NB-ARC domain was amplified by PCR to include compatible ends for Infusion cloning using forward primer: AAGTTCTGTTTCAGGGCCCGCCTGTGGTTGAGGAAGAT GATGTGG and reverse primer: ATGGTCTAGAAAGCTTTACTCTGTCGTAGCCTCTTGCCAGC. The 1078 bp region was then cloned into pOPIN-S3C, which encodes a cleavable N-terminal 6xHis-SUMO tag, and pOPIN-F, which encodes a cleavable N-terminal 6xHis tag, as per manufacturer's instructions (Clontech). Mutant sequences were synthesised by Genscript and cloned into pOPIN-S3C and pOPIN-F as above.

Protein expression
Lemo21(DE3) (NEB) E. coli cells transformed with the NRC1 NB-ARC pOPIN-S3C plasmid were grown in LB media with 100 μg/mL carbenicillin at 37˚C for 16 hrs with constant shaking. 20 mL of culture was used to inoculate one litre Powerbroth (Molecular Dimensions). Cultures were grown at 37˚C until an OD 600nm 0.5-0.7 was reached and then rapidly chilled on ice prior to induction of expression with 1mM isopropyl β-D-1-thiogalactopyranoside (IPTG). Following induction, cultures were grown at 18˚C for 16 hrs, and harvested by centrifugation at 3500g for 10 mins. Pellets were then stored at -80˚C until use.
Insect cell-derived pellets expressing pOPIN-F:NRC1 NB-ARC were produced by baculovirus-mediated transformation of Sf9 cells by the Oxford Protein Production Facility and stored at -80˚C until use.

Protein purification, insect cells
Cell pellets were resuspended in 50mM Tris-HCl pH 7.5, 500 mM NaCl, 30 mM imidazole, 0.2% v/v TWEEN20 and EDTA-free protease inhibitor tablets and lysed using a cell disruptor (Constant Systems Ltd) at 14,000 psi. Following lysis, 10-50 μL Benzonase was added, and incubated at room temperature for 15 mins. Cell debris was removed by centrifugation at 30,000g for 30-40 mins at 4˚C. The supernatant was passed through 0.45 μm filters (Sartorius), with additional incubation with Benzonase if required, before loading onto a 5 mL His-Trap FF column pre-equilibrated with equilibration buffer (50 mM Tris-HCl pH 7.5, 500 mM NaCl, and 30 mM imidazole). Bound proteins were eluted with equilibration buffer including 500 mM imidazole. The eluate was then loaded directly onto a Superdex S75 gel filtration column pre-equilibrated with 30 mM Tris-HCl pH 7.5, 200 mM NaCl, and 1 mM TCEP. Buffer was then exchanged for 50 mM HEPES pH 7.5 with 150 mM NaCl either by gel filtration following N-terminal tag cleavage, or by concentration and dilution.

N-terminal tag removal
N-terminal 6xHis-SUMO (pOPIN-S3C) and N-terminal 6xHis (pOPIN-F) tags were removed by incubation with 3C protease overnight at 4˚C. Protease and cleaved tag were then removed by subtractive IMAC prior to gel-filtration.

Protein quantification
For crystallisation, analytical gel filtration, and CD, protein concentrations were determined using absorbance at 280 nm with a Nanodrop spectrophotometer. For ATPase assays and ligand quantification, concentrations were determined using infrared absorbance using a Direct Detect.

Crystallisation
Sitting-drop vapour-diffusion experiments identified suitable crystallisation conditions for native protein from the PACT screen (QIAGEN), with optimisation identifying the best conditions for crystal formation as 6 mg/mL protein combined 1:1 with 8% PEG 1500, 0.1 M MMT (MES, malic acid and Tris,) pH 4.5. Crystals typically formed in under 48 hrs. Selenomethionine-derivative protein was crystallised in 0.1 M MMT buffer pH 5.0 with 15% PEG 1500. Crystals were cryoprotected in mother liquor with the addition of 20% ethylene glycol.

Analytical gel filtration
Analytical gel filtration was performed using a Superdex S75 10/300 GL column equilibrated in 20 mM HEPES pH 7.5 and 150 mM NaCl. A standard curve was produced using a lowmolecular weight calibration kit (GE Healthcare) for molecular weight estimation. Protein was diluted to 50 μM in equilibration buffer prior to loading 100 μL onto the column.

Differential scanning fluorimetry
DSF was performed as described in Walden et al. 2014 [87]. Briefly, assays used a Bio-Rad CFX96 real-time thermal cycler with a thermal range from 20˚C to 90˚C, increasing by 1˚C per minute. Fluorescence readings were taken every 0.2˚C interval. Protein was prepared by dilution to 1 mg/mL in 20 mM sodium phosphate buffer pH 7.5, with 5 μL of dilution added to 45 μL reaction buffer (40 μL 20 mM sodium phosphate buffer pH 7.5 plus 5 μL 25x Sypro Orange (Invitrogen)). Assays were performed four times in 96-well plates, with five technical replicates per plate. Analysis was performed as described by Walden et al. 2014.

Circular dichroism
Circular dichroism was performed using a Chirascan Plus instrument (Applied Photophysics). Protein was diluted in 20 mM sodium phosphate buffer pH 7.5 to a concentration of 0.1 mg/ mL. Spectra were taken from 180 to 260 nm at 20˚C in 0.5nm steps with four readings taken per step, averaged and buffer subtracted. Spectra at 20˚C were taken from two independent purifications. Table 1 shows estimated secondary structure content from representative spectrum as analysed using CONTIN on the DichroWeb server [63,64].

Ligand quantification
ADP and ATP quantification assays were modified from Maekawa et al. [29] and Williams et al. [40]. Bound nucleotide was released from the protein by denaturation followed by centrifugation (which pellets unfolded protein). The resulting supernatant was then divided equally and either directly incubated in the luciferase assay (to determine the ATP bound fraction), or first treated with pyruvate kinase to convert ADP to ATP prior to incubation (ADP bound fraction was determined by subtracting the ATP calculated in the untreated sample from that of the treated sample). Here, 15-20 μM protein was boiled in triplicate for 7 mins and centrifuged at 16000g for 12 mins at 4˚C to precipitate denatured protein. The supernatant was removed and flash-frozen in liquid nitrogen until used. To convert released ADP to ATP for quantification, supernatant was diluted 1 in 10 in pyruvate kinase reaction buffer (85 mM Tris-HCl pH 7.4, 1.7 mM MgSO 4 3.5 mM phosphoenolpyruvate, 23U pyruvate kinase (Sigma-Aldrich) for 30 mins at room temperature. To quantify co-purifying ATP, protein extract was treated as above, with pyruvate kinase replaced by an equal volume of 50% glycerol. Reactions were stopped by heating to 95˚C for 7 mins and centrifuged at 16000g for 12 mins. 100 μL of supernatant was added to an equal volume of ATP bioluminescent assay buffer (Sigma Aldrich cat. FLAA) and photon emissions read immediately using a Clariostar plate reader (BMG). Each assay included an internal control of ADP, which underwent identical treatment to allow reference to a standard curve of ATP. Assays were performed three times and normalised by buffer subtraction and calibration with the internal standard. Data from all the experiments are presented as box plots, generated using R v3.4.3 (https://www.r-project.org/) and the graphic package ggplot2 [88]. The centre line represents the median, the box top and bottom limits are the upper and lower quartiles, the whiskers are the 1.5× interquartile range and all data points are represented as dots.