Innate immunity represents an important system with a variety of vital processes at the core of many diseases. In recent years, the central role of the Nod-like receptor (NLR) protein family has become increasingly appreciated in innate immune responses. NLRs are classified as part of the signal transduction ATPases with numerous domains (STAND) clade within the AAA+ ATPase family. They typically feature an N-terminal effector domain, a central nucleotide-binding domain (NACHT) and a C-terminal ligand-binding region that is composed of several leucine-rich repeats (LRRs). NLRs are believed to initiate or regulate host defense pathways through formation of signaling platforms that subsequently trigger the activation of inflammatory caspases and NF-κB. Despite their fundamental role in orchestrating key pathways in innate immunity, their mode of action in molecular terms remains largely unknown. Here we present the first comprehensive sequence and structure modeling analysis of NLR proteins, revealing that NLRs possess a domain architecture similar to the apoptotic initiator protein Apaf-1. Apaf-1 performs its cellular function by the formation of a heptameric platform, dubbed the apoptosome, ultimately triggering the controlled demise of the affected cell. The mechanism of apoptosome formation by Apaf-1 potentially offers insight into the activation mechanisms of NLR proteins. Multiple sequence alignment analysis and homology modeling revealed Apaf-1-like structural features in most members of the NLR family, suggesting a similar biochemical behaviour in catalytic activity and oligomerization. Evolutionary tree comparisons substantiate the conservation of characteristic functional regions within the NLR family and are in good agreement with domain distributions found in distinct NLRs. Importantly, the analysis of LRR domains reveals surprisingly low conservation levels among putative ligand-binding motifs. The same is true for the effector domains, which exhibit distinct interfaces that ensure specific interactions with downstream target proteins. Taken together, these factors suggest specific biological functions for individual NLRs.
Citation: Proell M, Riedl SJ, Fritz JH, Rojas AM, Schwarzenbacher R (2008) The Nod-Like Receptor (NLR) Family: A Tale of Similarities and Differences. PLoS ONE 3(4): e2119. doi:10.1371/journal.pone.0002119
Editor: Nick Gay, University of Cambridge, United Kingdom
Received: March 13, 2008; Accepted: April 1, 2008; Published: April 30, 2008
Copyright: © 2008 Proell et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by MCEXT-033534 to RS and MIRG-CT-2005-016499 to AMR. MP was supported by a DOC-FFORTE-fellowship and JHF by an APART-fellowship of the Austrian Academy of Sciences. SJR is a fellow of the V-foundation.
Competing interests: The authors have declared that no competing interests exist.
Eukaryotes have evolved complex systems to detect microbial infection and other potential threats to the host. Recognition of microbes relies on the sensing of microbe-associated molecular patterns (MAMPs) by germline-encoded host pattern recognition molecules (PRMs), which include various families of leucine-rich repeat (LRR) bearing proteins in plants and animals. While Toll-like receptors (TLRs) constitute the main sensors for detection of extracellular microbes, recent findings suggest that two distinct protein families, the RIG-like helicases (RLHs) and the Nod-like receptors (NLRs), act as intracellular surveillance molecules. Several proteins of the highly conserved NLR family have been shown to function as intracellular PRMs for the initiation of innate and adaptive immune responses upon pattern-specific sensing of microbes.
Like TLRs, NLRs are thought to recognize microbial products, as well as other intracellular danger signals, thereby initiating host defense pathways through the activation of the NF-κB response and inflammatory caspases. Moreover, the NLR family has gained increased attention, since polymorphisms in certain NLR genes are linked to inflammatory disorders such as Blau syndrome, Crohn's disease or early-onset sarcoidosis.
Structurally, NLRs are large multi-domain proteins with a tripartite architecture. NLR proteins typically contain a central nucleotide-binding domain termed the NACHT domain (often also referred to as the NOD domain) and N-terminal effector domains (PYRIN, caspase recruitment domain CARD, or baculovirus inhibitor of apoptosis protein repeat BIR domain) for binding downstream signaling molecules, while the C-terminal part consists of a receptor domain, which is characterized by a series of leucine-rich repeats (LRRs). It is hypothesized that the crucial step in NLR activation lies in the oligomerization of the NACHT domain, thereby forming an active signaling platform (e.g. the inflammasome or nodosome, respectively), which allows binding of adaptor molecules and effector proteins, ultimately leading to an inflammatory response.
To date, 22 members of the human NLR protein family have been reported, which can be distinguished by the presence of a PYRIN, CARD, BIR, or as yet unclassified effector domain (Table 1).
According to the current general paradigm, NLR signaling is believed to be initiated by the C-terminal LRR region through the recognition of molecules triggering NLR activation. However, the actual molecular switch, namely the oligomerization of the NLR, is then thought to be mediated by the NACHT domain in a nucleotide-dependent manner. Recent studies show that Ipaf and NALP3 selectively bind ATP/dATP and that nucleotide binding is essential for their function in downstream signaling. Once the switch has occurred, the signal is transferred to the effector proteins, such as inflammatory caspases or adaptor molecules, via their effector domains. Thus, CARD-containing NLRs such as NOD1 and NOD2 are thought to interact with the CARD-containing kinase RICK (RIP2), leading to the activation of CARD9 and NF-κB pathways. In contrast, several PYRIN domain-containing NALP proteins were found to form a signaling platform, dubbed the inflammasome, and drive caspase activation by binding to the adaptor protein ASC.
Despite the growing amount of research data, little is known about the precise molecular mechanism of NLR activation and the initiation of subsequent signaling cascades. Moreover, the structural and mechanistic data on NLR proteins are scarce and mainly limited to single effector domains. Recent studies by Albrecht et al. discussed models of the NACHT and LRR domains of NOD2 and NALP3 in relation to disease-associated SNPs and protein function. Here, we provide further insights into structural and functional relationships of NLRs based on detailed sequence and modeling analyses of the whole NLR family. We show that although Apaf-1 shares less than 15% sequence identity with most NLRs and contains a different receptor domain (WD40 repeats), its structure, mode of action, and mechanistic principles can serve as a valuable working model for NLR signaling. In addition, we investigated the N-terminal effector domains (CARD and PYRIN) of the NLR protein family to construct a prediction for potential interfaces and interacting partners. Furthermore, we analyzed sequences of the LRR domains for conserved regions that may play a role in ligand binding and/or interaction with the NACHT and effector domains. Finally, we created a homology model of NOD2 based on Apaf-1 and the ribonuclease inhibitor (pdb id: 1dfj), which we used to depict disease-related polymorphisms and mutations.
Results and Discussion
1.1 NLR domain structure
To elucidate relations of NLR proteins, we used NLR sequences (see Table 1) as separate queries for FFAS searches and secondary structure prediction with the PredictProtein Server. Furthermore, a domain profile search using Interpro and SMART was performed to verify the domain structure of individual NLRs. Comparative sequence analyses (Figure 1, Table 2) revealed that all NLR proteins belong to the AAA+ ATPase superfamily, where they are further classified as signal transduction ATPases with numerous domains (STAND). The STAND proteins are distinguished from other P-loop NTPases by the presence of unique sequence motifs associated with the N-terminal helix and the core β-strand-4, as well as a C-terminal helical bundle that is fused to the NTPase domain.
Degree of conservation is shown as blue shading. The secondary structure of Apaf-1 is shown above the Apaf-1 sequence. Arrows underneath the alignment indicate domain boundaries. Conserved sequence features important for catalytic activity are shown in black boxes. Orange and magenta boxes depict interface residues: the orange residues contribute to interactions with the left partner, and the magenta residues are thought to interact with the right partner in the oligomer. Green boxes indicate additional motifs as described in the text.
A sequence profile-based search using FFAS identifies Apaf-1 as a distantly related homologue of NLRs within the human genome. The main difference between Apaf-1 and NLRs is that Apaf-1 lacks an LRR domain. Instead, Apaf-1 utilizes two sets of WD40 repeats as a receptor domain for sensing cytochrome c as the specific trigger of apoptosis. However, despite the different receptor domain, the remainder of Apaf-1 (residues 1–581) aligns to members of the NLR family with a sequence identity of 10–15% and a FFAS score of −16. The significance of this alignment is highlighted by the FFAS score, which indicates high structural similarity despite the low sequence identity. Apaf-1 (residues 1–581) is also the closest hit among structurally characterized homologues, and was therefore chosen for homology modelling to decipher the mechanistic and structural features defining organization and function of NLR family members. Despite the existence of small differences in the ATPase domain of Apaf-1 and NLRs, our detailed secondary structure comparison and alignment analysis show that they share a common domain structure (Table 1), consisting of an N-terminal effector domain, a central NACHT domain, a winged helix (WH) domain, and a superhelical (SH) domain, followed by an LRR receptor domain of variable length (Figure 2A).
A Apaf-1 structure and domain organization. Apaf-1ΔWD40 (PDB id: 1z6t) is shown in ribbon representation color-coded according to domain boundaries (CARD aa 1–101, NACHT-GxP aa108–365, WH aa366–450, and SH aa451–586). The ADP molecule bound to the active site is shown in sphere representation. B Evolutionary tree of the human NLR NACHT-WH domain generated for NLR proteins and orthologues. Labels correspond to the accession numbers (Uniprot) Mm_1: Q3TAU8, Mm_2: Q2LKU8, Mm_3: Q2LKU9, Mm_4: Q2LKV8, Mm_5: A1Z198; NALP5_m: NALP5_MOUSE, Mm_6: Q4PLSO, Mm_7: Q66JP4; Mm_8: NAL4C_MOUSE, Mm_9: Q08EE9, Mm_10: Q8R4B8. Dr_1: A5PEZ1; Dr_2: A3KQD4, Dr_3: Q1AMZ9, Xl_1: Q28DS5, Xl_2: Q6GNU6. Black lines indicate a probability of more than 50%, blue lines indicate a 95% confidence in the grouping, and red lines indicate a confidence below 40%. Green fonts indicate human proteins, black fonts other organisms. Mm and _m is Mus musculus (mouse), Dr and _Dr is Danio rerio (fish), Xl and _Xl is Xenopus laevis (frog). Orange fonts indicate an “S” in the Walker A motif in human sequences. Circles and modified ovals over the clades indicate the type of domain present at the N-terminal region of the NACHT domain. PYD is Pyrin domain, CARD is Card domain, Card? indicates CARD-like and BIR are BIR repeats. Apaf-1 is used to root the tree.
Variations thereof lie mainly within the effector domains, additional domains and the length of the linker from the effector domain to the NACHT domain. Good examples are NOD2, which contains two N-terminal CARD domains, and NOD5 and CIITA, which show an undefined N-terminal region. Other examples are the atypical secondary structure prediction for the CARD domain of Ipaf, or the partial sequence of the type 1 isoform of CIITA (accession number: AF000002), which contains an alternative 5′ region that encodes a CARD domain.
With respect to the C-terminal LRR region, NALP10 is the only NLR member that has no LRRs or only a very short stretch of them. NALP1, on the other hand, shows a typical LRR region but contains two additional domains C-terminal to those LRRs: a FIIND domain of as yet unknown function and a C-terminal CARD domain (Table 1) that displays the typical secondary structure found in the N-terminal effector CARD domains of other NLRs. As outlined above, and despite the low sequence identity, the structure of Apaf-1 can be used for homology modeling to obtain insight into the mechanism of NLR function and, furthermore, to produce approximate models of NLR structures.
1.2 NLR evolutionary profiles
Detailed sequence comparisons of 22 human NLR members reveal an overall sequence identity in the range of 10–30% amongst pairs. Since domain shuffling is a eukaryotic hallmark and has created a large complexity of functions in proteins, it hampers evolutionary analysis of full length proteins. In particular effector domains are subject to domain accretion, and/or domain shuffling or duplication for the acquisition of new domain architecture. Taking this into account, we have chosen the NACHT domain to conduct a phylogenetic analysis, addressing the possible evolutionary history of NLR proteins.
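The pairwise identity range quoted above can be illustrated with a minimal sketch. It assumes the sequences have already been aligned (gaps written as "-") and counts identity only over columns where neither sequence has a gap, which is one common convention and not necessarily the one used in this study. The function names and the toy fragments are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def pairwise_identities(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Percent identity for every unordered pair of sequences in the alignment."""
    return {
        (n1, n2): percent_identity(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }


# Toy aligned fragments (hypothetical, not real NLR sequences).
toy_alignment = {"seqA": "MKT-LLGA", "seqB": "MRTALL-A", "seqC": "MKTALLGA"}
identities = pairwise_identities(toy_alignment)
```

Applied to a real alignment of the 22 human NLRs, the minimum and maximum of the returned values would give the identity range among pairs.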
By comparing the NACHT region of the NLR family members, we observed that the phylogenetic distribution clearly correlates with their respective effector domain composition (Figure 2B). For instance, all the PYD-NACHT containing proteins cluster together at the highest part of the tree, well separated from other domain combinations such as CARD-NACHT. In humans, these PYD-NACHT containing proteins have been expanded by several duplication events. Similar results were obtained by including other NLR sequences of non-human origin, demonstrating a clear distribution in agreement with the effector domain content. Moreover, 9 of the 14 NALP proteins (NALP2, 4, 5, 7, 8, 9, 11, 12, 13) are located on chromosome 19 and clustered very closely together, which indicates a major expansion of this genomic region. Three other members (NALP6, NALP10 and NALP14) are located on chromosome 11, whereas NALP1 and NALP3 are located on chromosomes 17 and 1, respectively. We therefore analyzed whether these proteins have corresponding orthologues in closely related organisms. Clear orthologues were found for all proteins with the exception of NALP8, NALP11, NALP13 and NOD4. In addition, we observed that NALP2 and NALP7 are recent duplicons within the human genome. For other organisms, however, the expansion of this family originated from different members (data not shown), suggesting that NLRs of non-human origin have been lost during evolution. Moreover, these observations point to the possibility that the development of human paralogues reflects a way to accommodate novel functions to match the complexity of innate immunity in highly developed organisms.
1.3 The NLR NACHT-WH-SH region shows distinct adaptations in NLR function
Members of the AAA+ superfamily feature a so-called ATPase, P-loop or Rossmann fold, which adopts a three-layered α-β sandwich configuration. This fold contains recurring regulatory units with the β-strands forming a central, parallel β-sheet, which is embedded between α-helices on both sides (scop id C.37.1.20). The parallel β-sheet forming the core of the ATPase domain assumes a 51432 topology. This fold contains several characteristic motifs, namely the Walker A/P-loop and Walker B motifs, and the Sensor 1 and Sensor 2 motifs. These motifs are involved in ATP binding and hydrolysis of the β-γ phosphoanhydride bond, leading to specific conformational changes.
To decipher the structure and mechanism of the NLR protein family, we performed a multiple sequence alignment of NLR proteins and Apaf-1 (Figure 1) focusing on the NACHT-WH-SH domains using the program MUSCLE. The outcome was then compared with FFAS search results. Subsequently, we utilized secondary structure prediction and homology modelling (Figure 3) to identify critical ATPase motifs such as Walker A/P-loop, Walker B, Sensor 1 and Sensor 2, and to deduce putative functional features unique to NLRs. As shown in the multiple sequence alignment (Figure 1), the overall secondary structure features of the NACHT domain are conserved among NLR proteins and Apaf-1 (Table 2). We observed that the only major difference is a deletion before β-strand 3 and a 20-residue insertion after β-strand 3 in the NACHT domain.
The Walker A motif is composed of the characteristic consensus pattern GxxxxGKT/S (x represents any amino acid), where the lysine residue directly interacts with a phosphate moiety of ATP. We observed that, based on the presence of a threonine or serine residue in the GKT/S sequence motif, members of the NLR protein family can be subdivided into two groups (Table 2). This separation is reflected in the evolutionary tree (Figure 2B, where orange fonts indicate the presence of S instead of T), in which the phylogenetic distribution follows the T/S signature. On this basis, NALP1, NALP5 and NALP12 represent the “primordial” repertoire of proteins which yielded several duplicons in humans. Although both residues, T and S, have been found in active ATPases, the detailed catalytic consequences of their preference in most NLR proteins remain undefined.
The Walker B motif of ATPases, located in the nucleotide-binding site, is characterized by the conserved sequence pattern hhhhDD/E (h represents a hydrophobic amino acid). The proximal aspartate residue is crucial for coordinating binding of the Mg2+ cation, which has been shown to be required for nucleotide hydrolysis. The second acidic residue, usually glutamate, primes a water molecule for the hydrolysis of ATP.
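As an illustration of the T/S subdivision, the consensus GxxxxGKT/S can be written as a regular expression and used to sort sequences into the two groups. This is a minimal sketch, not the procedure used for Table 2; the function name and the toy fragments are hypothetical.

```python
import re

# GxxxxGK[TS]: "x" is any residue; the captured final position is T or S.
WALKER_A = re.compile(r"G.{4}GK([TS])")


def classify_walker_a(seq: str):
    """Return (start index, matched motif, 'T' or 'S') for the first match, else None."""
    match = WALKER_A.search(seq)
    if match is None:
        return None
    return match.start(), match.group(0), match.group(1)


# Hypothetical toy fragments, not real NLR sequences.
threonine_type = classify_walker_a("AAGPAGSGKTLLA")
serine_type = classify_walker_a("MGIAGSGKSVV")
no_motif = classify_walker_a("MKKLLAAV")
```

Grouping sequences by the third element of the returned tuple reproduces the kind of two-way split described above.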
Generally, the Walker A/P-loop and Walker B motifs are well conserved amongst ATPases (e.g. Apaf-1 and CED-4). However, our multiple sequence alignment revealed that all NLRs, with the exception of NAIP and NALP11, contain a modified Walker B motif, where the second acidic residue is missing. Thus, it remains unclear whether NLRs harbouring these substituting amino acids (glycine, alanine or serine) within the Walker B motif, hhhhD[GAS]hDE, are still capable of nucleotide hydrolysis (Table 2, Figure 1). Interestingly, a recent publication by Ting et al. reports that NALP12, which also contains a modified Walker B box, is capable of both ATP hydrolysis and oligomerization. Consequently, it is reasonable to assume that NLRs use diverse mechanisms to prime the water molecule for ATP hydrolysis, one of which might be the replacement of the missing conserved acidic amino acid by what we propose to term an extended Walker B box. In many NLRs the extended Walker B box is composed of a conserved DE tandem motif that is located three residues downstream of the first D in the Walker B motif. Exceptions are CIITA, NALP4, NALP8 and NALP13 with an EE, NALP5 with a DD, NOD5 with an EH, Ipaf with an NE and NALP11 with a DN sequence (Table 2). These data show that, although ATP hydrolysis has been observed for NALP12, the ability and capacity to hydrolyse ATP may vary amongst NLR proteins based on their individual extended Walker B motifs. This is in line with findings that, in this respect, Apaf-1, CED4 and DARK are extremely different, too. Therefore, the extended Walker B box defined here represents a key element for the further investigation of distinct NLRs, their function and the involvement of ATP hydrolysis in their specific signaling pathways.
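The distinction between the canonical pattern hhhhDD/E and the modified pattern hhhhD[GAS]h followed by a two-residue tandem can be sketched with two regular expressions. The set of residues counted as hydrophobic is an assumption made for this example, since the text only specifies "h"; the function name and toy fragments are likewise hypothetical.

```python
import re

# Assumed hydrophobic set for "h"; the text does not enumerate these residues.
H = "[AVLIMFWYC]"

# Canonical Walker B: hhhhD followed directly by D or E.
CANONICAL_WALKER_B = re.compile(rf"{H}{{4}}D[DE]")

# Modified Walker B: hhhhD[GAS]h, then a captured two-residue tandem
# (DE in many NLRs; EE, DD, EH, NE or DN in the listed exceptions).
EXTENDED_WALKER_B = re.compile(rf"{H}{{4}}D[GAS]{H}([A-Z]{{2}})")


def classify_walker_b(seq: str):
    """Return ('canonical', motif), ('extended', tandem) or None."""
    match = CANONICAL_WALKER_B.search(seq)
    if match:
        return "canonical", match.group(0)
    match = EXTENDED_WALKER_B.search(seq)
    if match:
        return "extended", match.group(1)
    return None


# Hypothetical toy fragments, not real NLR sequences.
canonical = classify_walker_b("VILLDEA")
extended_de = classify_walker_b("LLLFDGFDEL")
extended_ee = classify_walker_b("LLLFDSVEEK")
```

Tabulating the captured tandem per sequence would give a quick overview of which extended Walker B variant each family member carries.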
The Sensor 1 motif is typically found adjacent to the Walker A and B motifs and interacts with or “senses” the γ-phosphate of ATP (Figure 3). In AAA+ family members this motif consists of a conserved arginine located right after β-strand 4, joined by two serine or threonine residues and further upstream by three hydrophobic residues. Within this sequence context, it has been suggested that arginine coordinates nucleotide hydrolysis and conformational changes between subunits. Our analyses revealed that the Sensor 1 motif of Apaf-1 and all NLRs, with the exception of Ipaf, NALP4, NALP9 and NALP13, contains this conserved arginine (Table 2). Moreover, we observed that except for NALP4 (AI), NALP8 (MI), and NALP9 (AL), the first two threonines are conserved in most NLRs.
The Sensor 2 motif is a feature of AAA+ ATPases and is typically located in the region right after Sensor 1 before β-strand 5. This motif is usually characterized by a conserved arginine or lysine residue involved in nucleotide binding and hydrolysis. We observed that this specific feature is generally missing in proteins belonging to the STAND class, or at least could not be functionally assigned based on their primary sequence. However, analysis of the structure of Apaf-1 in its closed form brings a unique feature to light. In contrast to other AAA+ ATPases, Apaf-1 involves the WH domain in the coordination of ADP in place of the missing Sensor 2 motif, with H438 and S422 contributing two hydrogen bonds to the coordination of the phosphate groups. Of particular interest in this case is H438, which can be regarded as a replacement of the Sensor 2 motif when compared to the structures of other AAA+ superfamily members. Our structural alignments of NLR proteins with Apaf-1 reveal that Sensor 2 is also replaced by a conserved histidine in the WH domain of NLR proteins (Table 2, Figure 3). Importantly, we observed that the conserved histidine is part of a highly conserved sequence patch among NLR family members (Figure 1). This patch is characterized by the consensus sequence FxHxxQEhxA, which has been described as a unique feature of the NAIP-like subfamily among the STAND clade and now points to a common involvement of this patch in NLRs acting in a Sensor 2-like manner. We observed that almost all NLRs harbour this conserved sequence with slight variations concerning the glutamate residue. Exceptions are NALP6, NALP8, NOD5, CIITA, and NAIP, where the conserved histidine is not present (Figure 1). It is not clear whether these NLRs replace the histidine by another feature or are incapable of ATP hydrolysis.
As mentioned above, the Sensor 2 motif in AAA+ ATPases is composed of a conserved arginine residue that completes the active site of the neighbor molecule in the oligomer, where it is supposed to be involved in nucleotide binding. In fact, some NLR proteins such as NAIP, NALP2, NALP4, and NOD1 display an arginine residue downstream of the Sensor 1 motif that could function as a Sensor 2 motif. However, the conserved histidine residue present in the WH domain of NALP2, NALP4, and NOD1 may still be able to substitute for the function of Sensor 2.
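A search for the FxHxxQEhxA patch, and for the position of its histidine, can be sketched as follows. The hydrophobic residue set and the relaxed variant (any residue at the glutamate position, to mirror the slight variations noted above) are assumptions made for this example; the function name and the toy fragments are hypothetical.

```python
import re

# Assumed hydrophobic set for "h"; the text does not enumerate these residues.
H = "[AVLIMFWYC]"

STRICT_PATCH = re.compile(rf"F.H..QE{H}.A")   # FxHxxQEhxA as written
RELAXED_PATCH = re.compile(rf"F.H..Q.{H}.A")  # any residue at the glutamate position


def find_sensor2_like_patch(seq: str, strict: bool = True):
    """Return (patch start, matched patch, histidine index) or None."""
    pattern = STRICT_PATCH if strict else RELAXED_PATCH
    match = pattern.search(seq)
    if match is None:
        return None
    # The histidine is the third residue of the patch.
    return match.start(), match.group(0), match.start() + 2


# Hypothetical toy fragments, not real NLR sequences.
strict_hit = find_sensor2_like_patch("KKFAHTLQELRAGG")
strict_miss = find_sensor2_like_patch("KKFAHTLQDLRAGG")
relaxed_hit = find_sensor2_like_patch("KKFAHTLQDLRAGG", strict=False)
```

A sequence with no match under either pattern would correspond to the kind of exception listed above, where the conserved histidine is absent.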
1.4 Additional domains and motifs
The GxP signature is a conserved motif located in the small helical subdomain (C-domain) at the C-terminal region of the NACHT domain (Figure 1). Interestingly, the conserved proline interacts with the adenine moiety of the bound ATP molecule (Figure 3). Our alignment analyses revealed that most NLRs display this highly conserved proline residue (with the exception of NAIP (T) and NALP11 (A)), but lack the conserved glycine residue (Figure 1), suggesting a key role for the proline among NLRs. As described, additional domains following the NACHT domain are the WH domain, also referred to as the HETHS domain, which contains the conserved histidine motif, and the SH domain, which consists of eight alpha helices in a superhelical arrangement and whose function is as yet unknown.
Additional NLR sequence motifs include the cysteine-rich region in the NACHT domain, which contains a VCWxVCT motif located adjacent to the nucleotide-binding site (Figure 1) and plays a role in nucleotide recognition. Another feature is a highly conserved patch located in the WH domain of Apaf-1. This feature displays the sequence METEEV (Figure 1, Table 2), where the second glutamate is part of the interface to the adjoining CARD domain and forms hydrogen bonds to backbone atoms in the loop connecting helices 3 and 4 of the CARD domain. This interaction, which may lead to the stabilization of the dormant form, seems to be conserved in the whole NLR family. In NLR proteins, a highly conserved phenylalanine residue takes the place of the methionine (Figure 1, Table 2). Only NALP2, NALP8, and CIITA contain a leucine instead of the phenylalanine residue. The glutamate residues are also conserved to a certain degree or substituted by an aspartate residue within NLR proteins. These conserved motifs are most likely involved in intra- and intermolecular interactions required for stabilization of the closed form and formation of the active signaling platform.
1.5 Features important for intermolecular interactions and oligomerization
Since our detailed sequence analyses revealed that most NLRs and Apaf-1 share the same domain architecture and many secondary structure features, the availability of structural and mechanistic data for Apaf-1 provides the opportunity to link conserved sequence features of NLRs to functional aspects of NLR signaling. Cytochrome c-activated Apaf-1 has been shown to undergo an ATP-hydrolysis-dependent conformational rearrangement in order to form heptamers through an interaction of its NACHT domains. Interestingly, the heptamers were proposed to arrange in a ring-like structure, which is usually found in AAA+ ATPases such as RuvB or NtrC1. We consequently generated homology models for the NACHT-WH-SH regions and analyzed the distribution of conserved motifs and residues in order to deduce a putative mechanism for NLR oligomerization (Figure 4A). Although an alternative ring formation has also been proposed, we used the typical AAA+-like arrangement in which the interface is formed by surface residues in the NACHT domain (see orange and magenta boxes in Figure 1).
A NLR oligomerization interface: Apaf-1 oligomer modeled on the basis of the NtrC1 heptamer crystal structure. For clarity only three NACHT domains are shown in ribbon representation with an ADP molecule depicted in sticks to highlight the nucleotide-binding site in each domain. Side-chains of residues in the oligomerization interfaces are shown in sticks color-coded according to the alignment in Figure 1. The two interfaces form across the nucleotide-binding site of the NACHT domain including the GxP domain. B Model of NLR activation and inflammasome formation based on the Apaf-1 apoptosome.
Based on the fact that all structural features required for oligomerization are present (Figure 4B), we hypothesize that NLRs are in principle capable of building signaling platforms like Apaf-1. This suggests that NLRs also use the ring-like arrangement of effector domains to recognize and activate signaling partners.
2. NLR effector domains and their corresponding binding partners
Structural and mutational studies of the CARD domains of Apaf-1 and procaspase-9 have identified the essential motifs for procaspase-9 activation by Apaf-1. The interface of these two proteins has been shown to be mainly constituted by electrostatic interactions between an acidic and convex surface patch (helices 1 and 4) within the CARD domain of Apaf-1 and a basic and concave surface patch (helices 2 and 3) within the CARD of procaspase-9. Within this homophilic CARD-CARD interaction, it has been shown that the crucial residues D27, E39, E40, and E41 are localized within the acidic region of Apaf-1. Furthermore, in the NLR protein NOD1, residues D42, D48, E53, D54, and E56 of the NOD1 CARD were suggested to mediate its interaction with its effector protein RICK. Complementary residues R444, K480, R483, and R488 on the CARD domain of RICK were found in the putative interaction surface.
Based on these findings, we examined whether these residues, which are necessary for homophilic CARD-CARD interactions, are conserved among NLR CARDs and the CARDs of their effector proteins, respectively, by means of multiple sequence alignment (Figure 5A and 5B). Although the primary sequence conservation between CARD domains is generally low, we observed that the domains display a high degree of structural homology. Importantly the known interface residues of the homophilic CARD-CARD interactions of NOD1/RICK and Apaf-1/C9 are to a high degree conserved among NLR effector domains, caspases, and adaptor proteins (Table 3). Notably, the first and last residues of the acidic as well as the basic patch are highly conserved among the analyzed CARD domains, suggesting a pattern of interaction similar to the one described for Apaf-1/C9 or NOD1/RICK. These observations imply that the main principle of CARD-CARD interactions is based on the engagement of an acidic patch built of helices 1 and 4, with a basic patch composed of helices 2 and 3. However, the surrounding residues within this interface most likely define the specificity for interactions between CARD domains thereby ensuring the selectivity for the right interaction partner.
A Multiple sequence alignment of NLR and Apaf-1 CARD domains. Acidic key residues participating in the CARD-CARD interface are indicated by red borders. Residues that belong to the basic patch of the CARD-CARD interface are indicated by blue borders. Nod2.1 and Nod2.2 refer to NOD2 CARD domain 1 and 2, respectively. B Multiple sequence alignment of NLR PYRIN domains. Patch of negatively charged residues from ASC2 in helices 1 and 4 and their corresponding residues in the PYRIN domain containing NLR proteins (red box). Patch of positively charged residues from ASC2 in helices 2 and 3 and their corresponding residues in the PYRIN containing NLR proteins (blue box).
To date, neither crystal structures nor mutational analyses of PYRIN domains or PYRIN-PYRIN interactions have been reported. However, utilizing NMR, a recent report observed a highly bipolar organization of the human ASC and ASC2 PYRIN domains, revealing that they resemble the molecular surface properties of CARD domains. These tertiary structure similarities between PYRIN and CARD domains indicate that, as for CARD-CARD interactions, an electrostatic interface may play an important role in the biochemical properties and the interaction behavior of PYD-containing molecules. Based on this hypothesis, we propose that the already described interaction between the CARDs of Apaf-1 and caspase-9 can be utilized as a working model for PYRIN-PYRIN interactions as well. Following this suggestion, one would expect that the residues in helices 2 and 3 of one PYD build an interface with the residues in helices 1 and 4 of a complementary PYD.
By utilizing multiple sequence alignments (Figure 5A and 5B) of both CARD and PYD domains, we observed that the residues involved in the homophilic domain interfaces are conserved among the NLR family. However, mutational studies showed that residues that are important in one CARD-CARD interaction can be dispensable in the homotypic interaction of other proteins (e.g. the D42 mutant in Nod1 does not impair binding to Rick, but its corresponding residue in Apaf-1 is essential for its interaction with procaspase-9).
The proposed model of CARD-CARD and PYD-PYD interactions is that the acidic patch of one domain interacts with the basic patch of the other protein. Hydrophobic residues of adjacent regions are also suggested to be important in this interaction. Nevertheless, it is not clear so far, if there is a limited repertoire of structurally conserved motifs that may mediate interactions among death domain superfamily members. Therefore, more structural studies and mutational analysis of complexes built of those domains are necessary to define the motifs and interacting residues involved.
3. The LRR receptor domain
Similar to the WD-40 repeats in Apaf-1, leucine-rich repeats are the ligand-sensing motif of NLR proteins, a property they share with members of the TLR and RLR (RIG-I-like receptor) families. LRR domains in general consist of 2–45 motifs, each 20–30 amino acids in length, and exhibit a typical curved horseshoe-like structure with a parallel beta sheet on the concave side and helical elements on the convex side.
In NLRs the C-terminal LRR domain is thought to act as a sensor of bacterial products. Yet little is known about how the PAMP interacts with the LRR, or even how the LRR region interacts with the remainder of the NLR, since no structural data are available on these questions. Recently, some insight into the possible mechanism of ligand-receptor binding was provided by the two LRR-ligand complex structures of TLR1:TLR2 and TLR4:MD2. Within the proposed LRR-ligand complex, the ligand-binding site is located at the concave surface of the LRR domain.
To augment our understanding of the molecular mechanism of ligand recognition, we generated a homology model of the NOD2 LRR domains using the structure of the ribonuclease inhibitor (aa1–413, PDB id: 1bnh, seqID 33%) as a template. Additionally, we utilized the ConSurf Server to identify functional regions in NLRs by surface mapping of phylogenetic information. Figure 6A shows the modeled LRR domains of NOD2 with residues that are highly conserved in the human NLR family colored in green and non-conserved residues shown in white. The figure shows an extensive patch of conserved residues spanning the surface, hinting at a function of these residues in signal sensing or in the activation mechanism.
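As an illustration of how individual repeats can be located in a sequence, the sketch below scans for the widely used textbook LRR core consensus LxxLxLxxNxL. This is an assumption made for illustration, not the procedure used in this study: the pattern allows I, V, F, or M in place of L and C in place of N, and real LRRs deviate from it often enough that a simple pattern will miss repeats. The test sequence in the usage note is invented.

```python
import re

# Core LRR consensus LxxLxLxxNxL; hydrophobic positions accept L/I/V/F/M and
# the asparagine position accepts N or C. The lookahead reports overlapping hits.
LRR_CORE = re.compile(r"(?=([LIVFM]..[LIVFM].[LIVFM]..[NC].[LIVFM]))")


def find_lrr_cores(sequence: str) -> list:
    """Return (1-based start, matched segment) for every core-motif hit."""
    return [(m.start() + 1, m.group(1)) for m in LRR_CORE.finditer(sequence.upper())]
```

Calling `find_lrr_cores("AAALEELDLSGNPLGAAA")` on this invented sequence reports a single hit starting at residue 4. Profile-based searches are generally more sensitive than a regular expression for divergent repeats.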
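The notion of per-residue conservation can be made concrete with a simple column score. The sketch below is only an approximation for illustration: it computes one minus the normalized Shannon entropy of each alignment column, whereas ConSurf derives its scores from phylogeny-aware evolutionary rates. The three-sequence alignment in the usage note is a toy example.

```python
import math
from collections import Counter


def column_conservation(alignment: list) -> list:
    """Per-column conservation in [0, 1]: 1 - entropy / log2(20), gaps ignored."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [aa for aa in column if aa != "-"]
        if not residues:
            scores.append(0.0)  # an all-gap column carries no information
            continue
        total = len(residues)
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in Counter(residues).values()
        )
        scores.append(1.0 - entropy / math.log2(20))
    return scores
```

For the toy alignment `["LKA", "LRA", "LE-"]`, the first and third columns are invariant (score 1.0), while the second column, with three different residues, scores lower. Columns scoring near 1.0 would correspond to the residues highlighted as conserved on a structural model.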
Figure 6. (A) Homology model of the NOD2 LRR domain based on the ribonuclease inhibitor (PDB id: 1bnh). Predicted structure of the NOD2 LRR domains with conserved residues shown in green and non-conserved residues in white. (B) Positions of loss-of-function mutations are shown in dark blue. Mutations found in CD patients are depicted in red. Core-forming residues that do not contribute to the ligand-binding patch are highlighted in light blue. (C) NOD2 homology model based on the templates Apaf-1 (aa1–581, PDB id: 1z6t, seqID 11%) and ribonuclease inhibitor (aa1–413, PDB id: 1bnh, seqID 33%) (33). The NOD2 model is depicted as a cartoon, color-coded according to domain structure (blue: CARD2, aa95–182; green: linker, aa183–240; purple: NACHT, aa241–465; orange: winged helix and superhelical domain, aa466–734; green-yellow: LRR, aa735–1040). An ADP molecule bound in the ATPase active site is depicted as sticks. The positions of the SNP mutations P268S, R702W, and G908R are shown as red spheres. The truncation due to mutation L1007fsinsC is colored red.
Our findings are in accordance with recently published work by Tanabe and colleagues, which showed loss-of-function mutations in the LRR domain to be located on the convex surface, with additional residues in the concave region. However, only those residues that are predicted to contribute to the convex surface are conserved in the corresponding regions of LRR proteins, whereas the residues on the concave surface are not (Figure 6A and 6B). On the other hand, the loss-of-function mutations on the outer surface of the LRR domain do not form a continuous patch. They are scattered all over the molecule and are therefore unlikely to form the ligand-binding site.
Our homology model suggests a putative ligand-binding pocket situated in the concave surface and supports earlier observations in which the predicted loss-of-function mutations W907L, V935M, E959K, C961Y, K989E, and S991F, as well as the Crohn's disease-related mutation G908R, were mapped to the same area. These amino acid residues form a contiguous patch and may therefore point to the putative ligand-binding site (Figure 6B). This is supported by the fact that the location of this particular surface patch corresponds to ligand-binding sites in other LRR proteins. Taken together, these results point to a common putative binding pocket located at the concave surface of the LRR, which, however, differs from protein to protein. Whether the patches on the convex surface contribute to ligand binding or to locking the NLR proteins in the dormant form remains to be investigated.
Disease-derived mutations of NLRs: implications for NLR function
Several diseases were found to arise from aberrant NLR function. More precisely, they are caused by SNPs (single nucleotide polymorphisms) leading to point mutations in NLR genes. One particularly intriguing SNP is SNP5 in NOD2, which is associated with Crohn's disease. To analyze the position of the SNP5 mutation P268S within NOD2, a NOD2 homology model was created based on the templates Apaf-1 (aa1–581, PDB id: 1z6t, seqID 11%) and ribonuclease inhibitor (aa1–413, PDB id: 1bnh, seqID 33%). The SNP5 mutation P268S resides in the linker region before the first helix of the NACHT domain (Figure 6C). P268 constitutes part of the nucleotide-binding interface, where it interacts with the adenine moiety of ADP. P268S disturbs the backbone conformation of the linker and thus interferes with nucleotide binding; it may alter the affinity and hydrolysis rate of the nucleotide-binding domain. Hence, SNP5 impairs the finely tuned balance between the active and inactive conformational states of the NOD2 receptor and therefore most likely has a direct impact on its signaling properties.
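Assigning mutations to domains is a simple bookkeeping step that can be sketched as follows. The domain boundaries are those given in the Figure 6 legend and the mutation labels are those listed above; the function names and the parsing pattern are assumptions introduced here for illustration. The lookup only places a residue within the linear domain layout and says nothing about its position on the three-dimensional surface.

```python
import re

# NOD2 domain boundaries (inclusive residue numbers) as given in the Figure 6 legend.
NOD2_DOMAINS = [
    ("CARD2", 95, 182),
    ("linker", 183, 240),
    ("NACHT", 241, 465),
    ("WH/superhelical", 466, 734),
    ("LRR", 735, 1040),
]

# Simple substitution label: wild-type residue, position, mutant residue.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> tuple:
    """Split a label such as 'W907L' into ('W', 907, 'L')."""
    match = MUTATION.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def domain_of(position: int):
    """Return the domain containing a residue position, or None if unassigned."""
    for name, start, end in NOD2_DOMAINS:
        if start <= position <= end:
            return name
    return None


def domains_for(labels: list) -> dict:
    """Map each mutation label to the domain containing its position."""
    return {label: domain_of(parse_mutation(label)[1]) for label in labels}
```

Applied to `["W907L", "V935M", "E959K", "C961Y", "K989E", "S991F", "G908R"]`, `domains_for` places every mutation in the LRR range, consistent with the mapping discussed above. Frameshift labels such as L1007fsinsC are deliberately rejected by this simple parser.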
In summary, our study shows that the overall architecture and secondary structure features of most NLRs resemble those of Apaf-1. From a structural point of view, most NLR family members are therefore Apaf-1-like, with deviations including NOD2 (two CARD domains), NALP1 (additional FIIND and CARD domains), NALP10 (missing LRR region), and NOD5 and CIITA (undefined N-terminal region). Analyses of multiple sequence alignments revealed that all NLRs contain the crucial features for ATP binding. In comparison to Apaf-1, most NLRs display a modified Walker B box. Since no NLR except NAIP and NALP11 contains the crucial Walker B glutamate or aspartate required to activate the water molecule, they seem to have developed a new motif, the extended Walker B box, to retain ATP hydrolysis activity. This is supported by the observation that NALP12 is able to bind and hydrolyze ATP. Thus, our sequence analysis provides the basis for further studies to elucidate whether the modified/extended Walker B box is functional.
Additionally, we found one of the most intriguing features, the histidine in the WH domain, to be conserved among members of the NLR family. NLRs displaying this feature most likely assemble similarly to Apaf-1 and activate their targets by oligomerization. Interestingly, NOD5, CIITA, NALP6, and NALP8 do not contain the conserved histidine in their WH domain. Whether their oligomerization mechanism and ATP hydrolysis capacity differ remains an open issue.
Our analyses of the effector domains of NLRs, as well as those of their adaptors and target caspases or kinases, reveal a common interface composed of charged surface patches. The presence of acidic and basic surface patches theoretically renders all CARD and PYD domains compatible for interaction with each other. Yet their distinct profiles, and those of the surrounding residues found in the described interfaces, ensure the specificity of each interaction. This selectivity allows a well-balanced fine-tuning of the elicited immune response.
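A scan for the canonical nucleotide-binding motifs can be sketched as below. The patterns are the commonly cited textbook consensus forms, Walker A as GxxxxGK[ST] and Walker B as four hydrophobic residues followed by D[DE]; they are assumptions for illustration and do not encode the modified or extended Walker B box discussed here, whose exact definition is not restated in this section. The example sequence in the usage note is invented.

```python
import re

# Textbook consensus patterns (assumed here, not taken from this study).
# Lookaheads are used so that overlapping candidates are all reported.
WALKER_A = re.compile(r"(?=(G.{4}GK[ST]))")
WALKER_B = re.compile(r"(?=([LIVMFYAW]{4}D[DE]))")


def scan_walker_motifs(sequence: str) -> dict:
    """Return 1-based start positions and matched segments for both motifs."""
    seq = sequence.upper()
    return {
        "walker_a": [(m.start() + 1, m.group(1)) for m in WALKER_A.finditer(seq)],
        "walker_b": [(m.start() + 1, m.group(1)) for m in WALKER_B.finditer(seq)],
    }
```

On the invented sequence `"MSGPAGSGKSTAAALLILDDVWGG"`, the scan reports a Walker A hit at residue 3 and a Walker B hit at residue 15. A sequence lacking the acidic residues after the hydrophobic stretch returns an empty Walker B list, which is the situation in which a modified motif would need to be examined by alignment rather than by pattern matching.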
Finally, sequence comparison of LRRs in human NLRs does not reveal one particular region that serves as the general ligand-binding site. This suggests that individual NLRs evolved highly specialized modes to recognize specific ligands. However, conserved residues found within this domain may contribute to the intramolecular interaction or backfolding of the LRR region in order to regulate NLR activation. Our results serve as a basis for further mutational and functional analyses required to more precisely define the role of LRRs in ligand recognition and NLR activation.
NLR protein sequences (see Table 1) were submitted to profile-sequence searches with the FFAS server (http://ffas.ljcrf.edu). Secondary structure prediction was performed using the PredictProtein server (http://www.predictprotein.org). We analyzed the human sequences for NACHT domain paralogues (about 410 residues). Multiple alignments of the aforementioned sequences and the Apaf-1 sequence were created using MUSCLE and M-Coffee with default options. The alignment was manually adjusted according to the secondary structure prediction.
The alignments were used to run phylogenetic probabilistic analyses using the parallel implementation of MrBayes. The sequence of Apaf-1 was used to root the tree in all cases. A total of 200000 generations was run in 4 independent chains. The model used to set the priors for amino acid data was an average of all the available models, and a sample was obtained every 10 generations. Once convergence was reached, a total of 6973 credible trees were sampled and clade credibility values (probabilities) calculated. To check how the paralogues arrange in a larger tree, homologous sequences from closely related organisms were retrieved from the UniProt databases. The new sequences (31) were re-aligned to the original multiple alignment using T-Coffee. To preserve the clarity of the tree, we used a final number of 54 sequences. As in the previous cases, 200000 generations were run, with sampling every 10 generations. A total of 2777 credible trees were then sampled.
The protein structure with the highest-scoring alignment from the FFAS search was used as a template for modeling the structures with the SCWRL server (http://www1.jcsg.org/scripts/prod/scwrl). Models were evaluated using PROSA (https://prosa.services.came.sbg.ac.at/prosa.php) and were manually inspected and analyzed; figures were prepared using PyMOL (http://pymol.sourceforge.net). The degree of conservation was calculated using the ConSurf Server (http://consurf.tau.ac.il).
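The sampling figures above can be put in perspective with a short calculation. This is a sketch under explicit assumptions: it counts the samples a single run would record when saving one tree every 10 generations, and it expresses the reported tree counts as a fraction of that number. Whether the reported counts refer to one chain or to pooled chains is not stated here, so the fractions are illustrative only.

```python
def samples_recorded(generations: int, sample_every: int, include_start: bool = False) -> int:
    """Number of trees saved by one run that samples every `sample_every` generations."""
    if generations <= 0 or sample_every <= 0:
        raise ValueError("generations and sample_every must be positive")
    return generations // sample_every + (1 if include_start else 0)


def retained_fraction(retained_trees: int, recorded: int) -> float:
    """Fraction of recorded samples represented by the retained tree set."""
    if recorded <= 0:
        raise ValueError("recorded must be positive")
    return retained_trees / recorded


recorded = samples_recorded(200000, 10)                # 200000 generations, every 10th saved
paralogue_fraction = retained_fraction(6973, recorded)  # first analysis
extended_fraction = retained_fraction(2777, recorded)   # analysis with 54 sequences
```

Under these assumptions a single run records 20000 trees, so the 6973 and 2777 trees correspond to roughly 35% and 14% of one run's samples, respectively.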
Conceived and designed the experiments: RS MP. Performed the experiments: MP. Analyzed the data: RS AR. Contributed reagents/materials/analysis tools: SR JF AR. Wrote the paper: RS MP. Other: Revised manuscript: JF SR.
- 1. Meylan E, Tschopp J, Karin M (2006) Intracellular pattern recognition receptors in the host response. Nature 442: 39–44.
- 2. Fritz JH, Ferrero RL, Philpott DJ, Girardin SE (2006) Nod-like proteins in immunity, inflammation and disease. Nat Immunol 7: 1250–1257.
- 3. Werts C, Girardin SE, Philpott DJ (2006) TIR, CARD and PYRIN: three domains for an antimicrobial triad. Cell Death Differ 13: 798–815.
- 4. Inohara N, Nunez G (2003) NODs: intracellular proteins involved in inflammation and apoptosis. Nat Rev Immunol 3: 371–382.
- 5. Mariathasan S, Newton K, Monack DM, Vucic D, French DM, et al. (2004) Differential activation of the inflammasome by caspase-1 adaptors ASC and Ipaf. Nature 430: 213–218.
- 6. Inohara N, Chamaillard M, McDonald C, Nunez G (2005) NOD-LRR proteins: role in host-microbial interactions and inflammatory disease. Annu Rev Biochem 74: 355–383.
- 7. Faustin B, Reed JC (2008) Sunburned skin activates inflammasomes. Trends Cell Biol 18: 4–8.
- 8. Tattoli I, Travassos LH, Carneiro LA, Magalhaes JG, Girardin SE (2007) The Nodosome: Nod1 and Nod2 control bacterial infections and inflammation. Semin Immunopathol 29: 289–301.
- 9. Lu C, Wang A, Wang L, Dorsch M, Ocain TD, et al. (2005) Nucleotide binding to CARD12 and its role in CARD12-mediated caspase-1 activation. Biochem Biophys Res Commun 331: 1114–1119.
- 10. Duncan JA, Bergstralh DT, Wang Y, Willingham SB, Ye Z, et al. (2007) Cryopyrin/NALP3 binds ATP/dATP, is an ATPase, and requires ATP binding to mediate inflammatory signaling. Proc Natl Acad Sci U S A 104: 8041–8046.
- 11. Colonna M (2007) All roads lead to CARD9. Nat Immunol 8: 554–555.
- 12. Martinon F (2007) Orchestration of pathogen recognition by inflammasome diversity: Variations on a common theme. Eur J Immunol 37: 3003–3006.
- 13. Martinon F, Gaide O, Petrilli V, Mayor A, Tschopp J (2007) NALP inflammasomes: a central role in innate immunity. Semin Immunopathol 29: 213–229.
- 14. Albrecht M, Takken FL (2006) Update on the domain architectures of NLRs and R proteins. Biochem Biophys Res Commun 339: 459–462.
- 15. Jaroszewski L, Rychlewski L, Li Z, Li W, Godzik A (2005) FFAS03: a server for profile-profile sequence alignments. Nucleic Acids Res 33: W284–288.
- 16. Rost B, Yachdav G, Liu J (2004) The PredictProtein server. Nucleic Acids Res 32: W321–326.
- 17. Schultz J, Milpetz F, Bork P, Ponting CP (1998) SMART, a simple modular architecture research tool: identification of signaling domains. Proc Natl Acad Sci U S A 95: 5857–5864.
- 18. Ammelburg M, Frickey T, Lupas AN (2006) Classification of AAA+ proteins. J Struct Biol 156: 2–11.
- 19. Leipe DD, Koonin EV, Aravind L (2004) STAND, a class of P-loop NTPases including animal and plant regulators of programmed cell death: multiple, complex domain architectures, unusual phyletic patterns, and evolution by horizontal gene transfer. J Mol Biol 343: 1–28.
- 20. Hu Y, Ding L, Spencer DM, Nunez G (1998) WD-40 repeat region regulates Apaf-1 self-association and procaspase-9 activation. J Biol Chem 273: 33489–33494.
- 21. Muhlethaler-Mottet A, Otten LA, Steimle V, Mach B (1997) Expression of MHC class II molecules in different cellular and functional compartments is controlled by differential usage of multiple promoters of the transactivator CIITA. Embo J 16: 2851–2860.
- 22. Iyer LM, Leipe DD, Koonin EV, Aravind L (2004) Evolutionary history and higher order classification of AAA+ ATPases. J Struct Biol 146: 11–31.
- 23. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792–1797.
- 24. Hanson PI, Whiteheart SW (2005) AAA+ proteins: have engine, will work. Nat Rev Mol Cell Biol 6: 519–529.
- 25. Ye Z, Lich JD, Moore CB, Duncan JA, Williams KL, et al. (2007) ATP binding by Monarch-1/NLRP12 is critical for its inhibitory function. Mol Cell Biol.
- 26. Yan N, Chai J, Lee ES, Gu L, Liu Q, et al. (2005) Structure of the CED-4-CED-9 complex provides insights into programmed cell death in Caenorhabditis elegans. Nature 437: 831–837.
- 27. Rodriguez A, Oliver H, Zou H, Chen P, Wang X, et al. (1999) Dark is a Drosophila homologue of Apaf-1/CED-4 and functions in an evolutionarily conserved death pathway. Nat Cell Biol 1: 272–279.
- 28. Ogura T, Whiteheart SW, Wilkinson AJ (2004) Conserved arginine residues implicated in ATP hydrolysis, nucleotide-sensing, and inter-subunit interactions in AAA and AAA+ ATPases. J Struct Biol 146: 106–112.
- 29. Riedl SJ, Li W, Chao Y, Schwarzenbacher R, Shi Y (2005) Structure of the apoptotic protease-activating factor 1 bound to ADP. Nature 434: 926–933.
- 30. Yamada K, Kunishima N, Mayanagi K, Ohnishi T, Nishino T, et al. (2001) Crystal structure of the Holliday junction migration motor protein RuvB from Thermus thermophilus HB8. Proc Natl Acad Sci U S A 98: 1442–1447.
- 31. Lee SY, De La Torre A, Yan D, Kustu S, Nixon BT, et al. (2003) Regulation of the transcriptional activator NtrC1: structural studies of the regulatory and AAA+ ATPase domains. Genes Dev 17: 2552–2563.
- 32. Diemand AV, Lupas AN (2006) Modeling AAA+ ring complexes from monomeric structures. J Struct Biol 156: 230–243.
- 33. Yu X, Wang L, Acehan D, Wang X, Akey CW (2006) Three-dimensional structure of a double apoptosome formed by the Drosophila Apaf-1 related killer. J Mol Biol 355: 577–589.
- 34. Qin H, Srinivasula SM, Wu G, Fernandes-Alnemri T, Alnemri ES, et al. (1999) Structural basis of procaspase-9 recruitment by the apoptotic protease-activating factor 1. Nature 399: 549–557.
- 35. Manon F, Favier A, Nunez G, Simorre JP, Cusack S (2007) Solution structure of NOD1 CARD and mutational analysis of its interaction with the CARD of downstream kinase RICK. J Mol Biol 365: 160–174.
- 36. Natarajan A, Ghose R, Hill JM (2006) Structure and dynamics of ASC2, a pyrin domain-only protein that regulates inflammatory signaling. J Biol Chem 281: 31863–31875.
- 37. Liepinsh E, Barbals R, Dahl E, Sharipo A, Staub E, et al. (2003) The death-domain fold of the ASC PYRIN domain, presenting a basis for PYRIN/PYRIN recognition. J Mol Biol 332: 1155–1163.
- 38. Enkhbayar P, Kamiya M, Osaki M, Matsumoto T, Matsushima N (2004) Structural principles of leucine-rich repeat (LRR) proteins. Proteins 54: 394–403.
- 39. Jin MS, Kim SE, Heo JY, Lee ME, Kim HM, et al. (2007) Crystal structure of the TLR1-TLR2 heterodimer induced by binding of a tri-acylated lipopeptide. Cell 130: 1071–1082.
- 40. Kim HE, Du F, Fang M, Wang X (2005) Formation of apoptosome is initiated by cytochrome c-induced dATP hydrolysis and subsequent nucleotide exchange on Apaf-1. Proc Natl Acad Sci U S A 102: 17545–17550.
- 41. Tanabe T, Chamaillard M, Ogura Y, Zhu L, Qiu S, et al. (2004) Regulatory regions and critical residues of NOD2 involved in muramyl dipeptide recognition. Embo J 23: 1587–1597.
- 42. Kobe B, Deisenhofer J (1995) A structural basis of the interactions between leucine-rich repeats and protein ligands. Nature 374: 183–186.
- 43. Papageorgiou AC, Shapiro R, Acharya KR (1997) Molecular recognition of human angiogenin by placental ribonuclease inhibitor: an X-ray crystallographic study at 2.0 A resolution. Embo J 16: 5162–5177.
- 44. Price SR, Evans PR, Nagai K (1998) Crystal structure of the spliceosomal U2B''-U2A' protein complex bound to a fragment of U2 small nuclear RNA. Nature 394: 645–650.
- 45. Miceli-Richard C, Lesage S, Rybojad M, Prieur AM, Manouvrier-Hanu S, et al. (2001) CARD15 mutations in Blau syndrome. Nat Genet 29: 19–20.
- 46. Hugot JP, Chamaillard M, Zouali H, Lesage S, Cezard JP, et al. (2001) Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn's disease. Nature 411: 599–603.
- 47. Hysi P, Kabesch M, Moffatt MF, Schedel M, Carr D, et al. (2005) NOD1 variation, immunoglobulin E and asthma. Hum Mol Genet 14: 935–941.
- 48. Hoffman HM, Mueller JL, Broide DH, Wanderer AA, Kolodner RD (2001) Mutation of a new gene encoding a putative pyrin-like protein causes familial cold autoinflammatory syndrome and Muckle-Wells syndrome. Nat Genet 29: 301–305.
- 49. Feldmann J, Prieur AM, Quartier P, Berquin P, Certain S, et al. (2002) Chronic infantile neurological cutaneous and articular syndrome is caused by mutations in CIAS1, a gene highly expressed in polymorphonuclear cells and chondrocytes. Am J Hum Genet 71: 198–203.
- 50. Wallace IM, O'Sullivan O, Higgins DG, Notredame C (2006) M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res 34: 1692–1699.
- 51. Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17: 754–755.
- 52. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403–410.