Figure 1.
(a) Schematic view of the domains and motifs in cereblon. The Tyr and Trp residues of the CULT domain whose mutation abolishes thalidomide binding (Ito et al., 2010) are marked. (b) Annotated sequence of human cereblon. The two LON domains and the CULT domain are colored as in panel (a). Highly conserved sequence motifs outside these domains are colored red. The two regions identified by deletion analysis as responsible for DDB1 binding (mid-protein) and thalidomide binding (C-terminal) are highlighted in grey. Sequences with high helical propensity in the putative DDB1-binding region are underlined. (c) Model of cereblon. The LON domain was modeled on PDB:1ZBO. The N-terminal extension and the large connector between the LON subdomains are marked by dotted lines. The CULT domain was modeled on the structure of a bacterial CULT domain which we have determined [18]. Note that the CULT domain of human cereblon can also be modeled with good accuracy on the homologous structures of MsrB and RIG-I (Fig. S1). After submission of this manuscript, two cereblon-DDB1 complex structures [15], [16] have confirmed the domain arrangement proposed in this model.
Figure 2.
Cluster map of CULT domain proteins.
The main clusters are named as described in the text and the domain architecture of the proteins in the respective cluster is shown. For this map, we searched the nr database at NCBI with PSI-Blast, using the CULT domain of human cereblon as a query. After convergence, we extracted all proteins above the cutoff of E = 0.005 and clustered them in CLANS using their all-against-all pairwise similarities as measured by BLAST P-values. Clustering was done to equilibrium in 2D at a P-value cutoff of 1e-10 using default settings.
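The role of the P-value cutoff can be illustrated with a minimal sketch. It is not the CLANS algorithm, which lays sequences out by force-directed placement; it only shows, under that stated simplification, how a cutoff decides which pairwise similarities are considered at all, using connected components as a crude stand-in for clusters. The function name `threshold_components` and the toy sequence names are hypothetical.

```python
from collections import defaultdict

def threshold_components(pairs, cutoff=1e-10):
    """Group sequences linked by pairwise P-values at or below `cutoff`."""
    # Keep only sufficiently significant pairs as undirected edges.
    graph = defaultdict(set)
    names = set()
    for a, b, p in pairs:
        names.update((a, b))
        if p <= cutoff:
            graph[a].add(b)
            graph[b].add(a)
    # Depth-first search to collect connected components.
    seen, groups = set(), []
    for start in sorted(names):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups

# Hypothetical toy input: (sequence A, sequence B, BLAST P-value).
toy_pairs = [
    ("seqA", "seqB", 1e-30),
    ("seqB", "seqC", 1e-12),
    ("seqC", "seqD", 1e-4),  # too weak to count at the 1e-10 cutoff
]
groups = threshold_components(toy_pairs)
```

A stricter cutoff (as in this 2D map) separates groups that a more permissive one (such as the 1e-5 used for the 3D map in Figure 5) would keep connected.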
Figure 3.
CULT domain sequence and structure.
(a) Multiple alignment of CULT domains from representative members of the groups in Fig. 2. The alignment is based on the results of the PSI-Blast search with the CULT domain of human cereblon (first sequence in the alignment). Invariant residues of the three core groups (cereblon, secreted eukaryotic, bacterial) are underscored in black, residues conserved in at least two thirds of the sequences in the alignment are highlighted in dark grey and residues conserved in at least one third of the sequences in light grey. The three tryptophan residues forming the thalidomide-binding site are marked by arrowheads and the two cysteine motifs coordinating the Zn ion, as well as a highly conserved motif at the tip of the inserted β-hairpin, are written out. The secondary structure above the alignment (S = β-strand) is the experimentally determined structure of the CULT domain from MGR_0879 of Magnetospirillum gryphiswaldense (first bacterial sequence in the alignment; [18]). The β-strands of the two main β-sheets are numbered according to the consensus structure of the β-tent fold and colored by whether they belong to the N-terminal β-sheet (purple) or the C-terminal one (gold); β3 is shown in brackets as it has lost its β-strand character in the CULT domain. The two β-strands of the inserted hairpin (teal) are labeled βI1 and βI2. The sequences are: (cereblon) HS - Homo sapiens NP_057386.2, DR - Danio rerio NP_001003996.1, DM - Drosophila mojavensis XP_001999319.1, CE - Caenorhabditis elegans NP_502300.2, AT - Arabidopsis thaliana NP_850069.1, EH - Entamoeba histolytica XP_657530.1; (secreted eukaryotic) DR - Danio rerio NP_001121712.1, CB - Caenorhabditis brenneri EGT59438.1, TA - Trichoplax adhaerens XP_002115135.1, NV - Nasonia vitripennis XP_003427162.1; (bacterial) MG - Magnetospirillum gryphiswaldense CAM74667.1, GP - gamma proteobacterium BDW918 WP_008249149.1, HO - Haliangium ochraceum WP_012831591.1, DT - Desulfonatronospira thiodismutans WP_008870657.1, LS - Leptospira sp. B5-022 WP_020769190.1; (oomycete) PI - Phytophthora infestans XP_002999235.1, AL - Albugo laibachii CCA16326.1; (kinetoplastid) LM - Leishmania major strain Friedlin XP_001681231.1, TC - Trypanosoma cruzi EKG02463.1; (other) BP - Bathycoccus prasinos XP_007508760.1. (b) Superimposition of bacterial and eukaryotic CULT domain structures from M. gryphiswaldense (red), H. sapiens (blue; PDB ID: 4TZ4), M. musculus (orange; PDB ID: 4TZC), and G. gallus (green; PDB ID: 4CI2). The r.m.s. deviations in Cα positions for the pairwise comparisons ranged between 0.4 and 0.9 Å and are listed in S1 Table. (c) Superimposition of the thalidomide binding site for the structures in panel (b). The ligand and the residues of the aromatic cage are shown in stick representation and colored as in panel (b). Residue numbering is for the human protein.
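The r.m.s. deviation quoted for the pairwise comparisons can be sketched as follows. This is only an illustration of the quantity itself; it assumes that the two structures have already been superimposed and that the Cα atoms are paired one to one, and it does not perform the superposition. The function name `ca_rmsd` and the toy coordinates are hypothetical.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, pre-superimposed points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Sum of squared distances between equivalent atoms.
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical toy coordinates in Å: every atom is displaced by 1 Å along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd = ca_rmsd(toy_a, toy_b)
```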
Figure 4.
Sequence relationships between proteins of the β-tent fold.
The map shows the probabilities obtained for HMM-to-HMM comparisons, as implemented in HHpred. Queries are in rows, targets in columns. The proteins are: cereblon - CULT domain of human cereblon; Yippee - Drosophila melanogaster yippee isoform A (AAF48266.1); Mis18 - Schizosaccharomyces pombe Mis18 (CAB72327.2), res. 1–125; RIG-I - PDB: 4A2V; MsrB - PDB: 3HCG; GFA - PDB: 1X6M; MSS4 - PDB: 2FU5; TCTP - PDB: 1YZ1; DUF427 - PDB: 3DJM.
Figure 5.
Cluster map of CULT domain homologs.
The image is a cross-eyed stereo view and shows that the five domain families occupy the vertices of an approximately equilateral pyramid. For this map, we searched the nr database at NCBI with PSI-Blast, using the CULT domain protein MGR_0879 of M. gryphiswaldense as a query. After six iterations, we extracted all proteins above the cutoff of E = 0.005 and clustered them in CLANS using their all-against-all pairwise similarities as measured by BLAST P-values. Clustering was done to equilibrium in 3D at a P-value cutoff of 1e-5 using default settings.
Figure 6.
Structure gallery and evolutionary inference for proteins with a β-tent fold.
Only the core fold is shown; in structures where additional parts of the polypeptide chain obscured the view of the core fold, these were omitted. The β-strands of the insertion between β2 and β3 of the N-terminal β-meander are colored orange. The top image shows a superposition of the two halves of the MsrB structure 3HCJ (r.m.s.d. of 1 Å over the Cα carbons of the 30 superimposable residues) and the central image shows 3HCJ itself, as the most symmetrical of the β-tent structures. The images around the circumference show the seven domains of known structure discussed in this article (see also Fig. S2). Of these, DUF427 and TCTP systematically lack a zinc binding site, and MsrB homologs have occasionally lost it. In the other domains, the zinc binding site is essentially always present, although the cysteine pattern is slightly modified in GFA relative to all other domains, the first cysteine tandem being CxCxxx, rather than xxCxxC. The arrows in the figure show our inference for a possible evolutionary path. The fold could thus have originated by duplication of a four-stranded β-meander and subsequently diverged into the domains seen today. Where homologous relationships are supported by sequence similarity, the arrows are black; otherwise they are grey.
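The two cysteine tandem patterns named above can be told apart with a simple pattern match. The sketch below assumes that "x" stands for any residue and that a six-residue window covering the tandem has already been cut out of the alignment; the function name `classify_tandem` and the example windows are hypothetical and are not taken from the sequences in this article.

```python
import re

# "x" in the legend is read here as "any residue" (an assumption).
CANONICAL = re.compile(r"..C..C")  # xxCxxC, as in most β-tent domains
GFA_LIKE = re.compile(r"C.C...")   # CxCxxx, as described for GFA

def classify_tandem(window):
    """Label a six-residue window by the cysteine pattern it matches."""
    if len(window) != 6:
        raise ValueError("expected a six-residue window")
    if CANONICAL.fullmatch(window):
        return "xxCxxC"
    if GFA_LIKE.fullmatch(window):
        return "CxCxxx"
    return "other"

# Hypothetical windows, for illustration only.
labels = [classify_tandem(w) for w in ("GSCAKC", "CACGHT", "GSAAKT")]
```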
Figure 7.
Gallery of β-tent domain binding sites.
The panels show the β-hairpin inserted between β2 and β3, and the C-terminal β-meander (β4–β7), colored as in Fig. 6. Residues involved in ligand binding are colored magenta and blue, and the ligands cyan. Yippee is the molecular model from Fig. S1 and the ligand-binding residues are predicted based on conservation and location in the fold. For GFA, only the residues involved in binding the glutathione cofactor are known (magenta). Six highly conserved histidine residues (blue) may participate in catalysis, but the exact location and geometry of the formaldehyde-binding site are unknown. For RIG-I, the residues involved in coordinating the RNA 3′ end are shown in magenta and the 5′ end in blue. Residue numbers are: Cereblon: P51, W79, W85, W99, Y101; Yippee: Y43, F45, W82, Y84; MsrB: W73, R97, H111, F113; GFA: C54, T57, L58, C56, R98 (magenta) and H32, H50, H52, H107, H117, H126 (blue); RIG-I: E573, H576, W604, K602 (magenta) and V632, L621, V595, I597, F601, W604 (blue).
Figure 8.
Comparison of the substrate-binding sites in CULT and MsrB.
The superimposition shows the thalidomide binding sites in the CULT domain structures (M. gryphiswaldense MGR_0879, G. gallus 4CI2, M. musculus 4TZC) in gray and the methionine sulfoxide binding site of MsrB (X. campestris 3HCI) in red. The ligands are colored cyan (thalidomide) and marine blue ((2S)-2-(acetylamino)-N-methyl-4-[(R)-methylsulfinyl]butanamide). Residue numbers are provided for the M. gryphiswaldense structure in gray, and for MsrB in red.
Figure 9.
Structural gallery of aromatic cage-like conformations and associated ligands.
Aromatic residues forming cage-like conformations are depicted in gray, ligands in cyan. Oxygen: red; nitrogen: blue; sulphur: yellow. Panels b–h show examples from the five main categories (S2 Table); their PDB accession codes are shown at the bottom-right corner of each panel. Cage residues are labeled in red by their three-letter names and residue numbers from the PDB entry. Ligands are marked by their three-letter identifiers taken from PDB. (a) Geometric criteria for detecting aromatic cage-like conformations (see Method section for a detailed description). (b) Heterocyclic ring (uracil) bound to the pyrimidine-sensing transcriptional regulator RutR. (c) Hydrocarbon ring (tyrosine) bound to the periplasmic leucine-binding protein of E. coli. (d) Hydrocarbon chain (leucine) bound to the same binding site (1a99). (e) Hydrocarbon chain containing N/O/S (putrescine) bound to the periplasmic putrescine-binding protein of E. coli. (f) Hydrocarbon ring (2-(cyclohexylamino)benzoic acid) bound to the phenazine biosynthesis protein PhzA/B from Burkholderia cepacia R18194. (g) Ammonium cation (methyllysine 9 of histone H3) bound to the Drosophila HP1 chromodomain. (h) Ammonium cation (dimethylarginine) bound to the Tudor domain of human TDRD3.