Structural Analysis of the Rubisco-Assembly Chaperone RbcX-II from Chlamydomonas reinhardtii

The most prevalent form of the Rubisco enzyme is a complex of eight catalytic large subunits (RbcL) and eight regulatory small subunits (RbcS). Rubisco biogenesis depends on the assistance of specific molecular chaperones. The assembly chaperone RbcX stabilizes the RbcL subunits after folding by chaperonin and mediates their assembly into the RbcL8 core complex, from which RbcX is displaced by RbcS to form the active holoenzyme. Two isoforms of RbcX are found in eukaryotes: RbcX-I, which is more closely related to cyanobacterial RbcX, and the more distant RbcX-II. The green alga Chlamydomonas reinhardtii contains only RbcX-II isoforms, CrRbcX-IIa and CrRbcX-IIb. Here we solved the crystal structure of CrRbcX-IIa and show that it forms an arc-shaped dimer with a central hydrophobic cleft for binding the C-terminal sequence of RbcL. Like other RbcX proteins, CrRbcX-IIa supports the assembly of cyanobacterial Rubisco in vitro, albeit with reduced activity relative to cyanobacterial RbcX-I. Structural analysis of a fusion protein of CrRbcX-IIa and the C-terminal peptide of RbcL suggests that the peptide binding mode of RbcX-II may differ from that of cyanobacterial RbcX. RbcX homologs appear to have adapted to their cognate Rubisco clients as a result of co-evolution.


Introduction
Life on earth depends on fixation of atmospheric CO2 into organic compounds by bacteria, algae and plants. The key enzyme for this process, ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco), catalyzes the carboxylation of the five-carbon sugar ribulose-1,5-bisphosphate (RuBP), which is converted into two molecules of 3-phosphoglycerate. The other enzymes of the Calvin-Benson-Bassham cycle subsequently use reducing equivalents and ATP produced in the light reaction of photosynthesis to regenerate RuBP and produce triose phosphate to fuel anabolic pathways. The most prevalent form of Rubisco (form I) consists of a complex of eight catalytic large subunits (RbcL), forming a D4-symmetric core, and eight regulatory small subunits (RbcS).

[...] resulting in the following constructs: pHueCrRbcX-IIa; pHueCrRbcX-IIa; pHueCrRbcX-IIa; pHueCrRbcL(462-474)-RbcX-IIa. The cleavage site for the chloroplast transit peptide of CrRbcX-IIa was predicted based on homology with AtRbcX-II (see Fig 1). The Quik-Change protocol (Stratagene) was used to produce the mutant pHueCrRbcX-IIa(33-189)(R118A). All plasmid inserts were verified by DNA sequencing.

Expression and Purification of CrRbcX-IIa
RbcX proteins were expressed as N-terminal His6-ubiquitin (His6-Ub) fusion proteins in E. coli BL21 (DE3) cells from pHue expression plasmids. Cells were grown to an OD600 of 0.5 at 37°C in LB medium, followed by induction for 16 h with 0.5 mM isopropyl-D-thiogalactoside (IPTG) at 23°C. Cells were lysed in 50 mM Tris-HCl pH 8.0, 20 mM NaCl, 1 mM EDTA, 0.5 mg/ml lysozyme and 5 mM phenylmethylsulfonyl fluoride (PMSF) for 30 min on ice, followed by ultrasonication (Misonix Sonicator 3000). The supernatant obtained after high-speed centrifugation (48,000 × g, 40 min, 4°C) was applied to a Ni-IMAC column (GE Biotech) to capture the His6-Ub protein, followed by overnight cleavage of the His6-Ub moiety at 23°C using the deubiquitinating enzyme Usp2 [21]. All subsequent steps were performed at 4°C. The supernatant was dialyzed against buffer A (20 mM Tris-HCl pH 8.0, 50 mM NaCl) and applied to a pre-equilibrated MonoQ column (GE Biotech). Proteins were eluted with a linear salt gradient from 50 mM to 1 M NaCl. Fractions containing RbcX were combined, concentrated and supplemented with 5% glycerol, then flash-frozen in liquid nitrogen and stored at -80°C.
RbcX for crystallographic studies was purified further by Superdex200 (GE Biotech) size exclusion chromatography in buffer A. Protein concentration was determined spectrophotometrically at 280 nm using calculated extinction coefficients.
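The concentration determination above can be illustrated with a minimal sketch of the Beer-Lambert calculation. The text does not state how the extinction coefficients were calculated; the per-residue values below (Trp 5500, Tyr 1490, cystine 125 M^-1 cm^-1 at 280 nm) are an assumption based on the widely used sequence-based estimate, and the sequence shown is a hypothetical placeholder, not CrRbcX-IIa.

```python
# Sequence-based estimate of the molar extinction coefficient at 280 nm,
# followed by the Beer-Lambert law. The residue values are an assumed,
# commonly used approximation; the source does not name the method.
EPS_TRP = 5500.0      # M^-1 cm^-1 per tryptophan
EPS_TYR = 1490.0      # M^-1 cm^-1 per tyrosine
EPS_CYSTINE = 125.0   # M^-1 cm^-1 per disulfide-bonded cystine


def extinction_coefficient(sequence, n_cystines=0):
    """Estimate epsilon(280 nm) in M^-1 cm^-1 from a one-letter sequence."""
    seq = sequence.upper()
    return (seq.count("W") * EPS_TRP
            + seq.count("Y") * EPS_TYR
            + n_cystines * EPS_CYSTINE)


def concentration_molar(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


if __name__ == "__main__":
    toy_sequence = "MWKEIKFEFYDTIDY"  # hypothetical sequence for illustration only
    eps = extinction_coefficient(toy_sequence)
    conc = concentration_molar(a280=0.50, epsilon=eps)
    print(f"epsilon = {eps:.0f} M^-1 cm^-1, c = {conc * 1e6:.1f} uM")
```

A protein without Trp or Tyr gives an estimated coefficient of zero, in which case the function raises an error rather than dividing by zero; such proteins need a different quantification method.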
For selenomethionine (SeMet) labeling by the catabolite repression method [23], the bacteria were grown to mid-log phase at 37°C in M9 medium containing 100 mg/L ampicillin. Methionine biosynthesis repression was induced by addition of amino acids as follows: 125 mg/L L-Lys, 100 mg/L L-Phe, 100 mg/L L-Tyr, 50 mg/L L-Ile, 50 mg/L L-Leu, 50 mg/L L-Val and 60 mg/L L-Se-Met. After 15 min the temperature was reduced to 23°C and protein synthesis was induced with 0.5 mM IPTG for 20 h. Cells were harvested and re-suspended in lysis buffer (50 mM Na-phosphate pH 9.0, 300 mM NaCl, 10 mM imidazole and 1 mM β-mercaptoethanol) containing Complete protease inhibitor cocktail (Roche Biotech). The cells were disrupted by ultrasonication and SeMet-labeled His6-Ub RbcX was purified essentially as described above. The protein solution was dialyzed against buffer A containing 1 mM β-mercaptoethanol (β-ME) and applied to a pre-equilibrated MonoQ column. Proteins were eluted with a linear salt gradient from 50 to 400 mM NaCl. Fractions containing SeMet-labeled CrRbcX-IIa(34-156) were subsequently dialyzed against buffer A/β-ME and concentrated. After flash-freezing in liquid N2, the protein was stored at -80°C.
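As a convenience, the amino acid supplement listed above can be scaled to a given culture volume. The following is a small sketch only; the concentrations are those stated in the protocol, while the function name and the example volume are hypothetical.

```python
# Final supplement concentrations in mg per litre of culture, as listed
# in the SeMet labeling protocol above.
SUPPLEMENT_MG_PER_L = {
    "L-Lys": 125.0,
    "L-Phe": 100.0,
    "L-Tyr": 100.0,
    "L-Ile": 50.0,
    "L-Leu": 50.0,
    "L-Val": 50.0,
    "L-Se-Met": 60.0,
}


def amounts_mg(culture_volume_l):
    """Return the mass (mg) of each supplement for the given culture volume (L)."""
    if culture_volume_l <= 0:
        raise ValueError("culture volume must be positive")
    return {name: conc * culture_volume_l
            for name, conc in SUPPLEMENT_MG_PER_L.items()}


if __name__ == "__main__":
    # Example volume of 2 L is an arbitrary illustration.
    for name, mass in amounts_mg(2.0).items():
        print(f"{name}: {mass:.0f} mg")
```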

Structure Solution and Refinement
The diffraction data were collected at beamline X10SA of the Swiss Light Source (SLS) in Villigen, Switzerland. Diffraction data were integrated and scaled with XDS [25]. Pointless [26], Scala [27] and Truncate [28] were used to convert the data to CCP4 format, as implemented in the CCP4i interface [29].
The structure of CrRbcX-IIa(34-156) was solved by Se-SAD using crystals from SeMet-labeled protein at 2.0 Å resolution. 36 selenium sites were found by direct methods using SHELXD as implemented in HKL2MAP [30,31]. SHELXE was used for density modification and auto-building of a poly-alanine model. The resulting map was readily interpretable and the sequence was docked using Coot [32]. The final model was created by iterative Coot model building and Refmac5 refinement cycles [33]. The structure of the fusion protein CrRbcL(462-474)-RbcX-IIa(37-156) was solved by molecular replacement using Molrep [34], and the models modified and refined as above. Residues facing solvent channels with disordered side chains were modeled as alanines. Coordinates were aligned with Lsqkab and Lsqman [35]. Figures were generated with the program PyMOL [36] and ESPript [37]. Coordinates and structure factor amplitudes were deposited to Protein Data Bank under accession codes 5BS1 and 5BS2.

Fig 1 (caption, partially recovered). [...] residues are shown in red and identical residues in white using bold lettering on red background. Blue frames indicate homologous regions. The consensus sequence is shown at the bottom. The forward arrow designates the beginning of the mature RbcX-II proteins. The diamond symbol at the end of the CrRbcX-IIb sequence indicates that the sequence continues with 130 amino acids not displayed. Asterisks denote residues known to be essential for RbcX function.
doi:10.1371/journal.pone.0135448.g001

Structural Analysis of Chlamydomonas reinhardtii RbcX
The genome of C. reinhardtii contains no RbcX-I, but instead has two RbcX-II genes, g688.t1 (locus Cre01.g030350) and g7885.t1 (locus Cre07.g339000). We refer to the gene products as CrRbcX-IIa and CrRbcX-IIb, respectively. Note that in the most recent genome annotation CrRbcX-IIa would start at amino acid residue 34 and lacks the sequence encoding the transit peptide. CrRbcX-IIb, on the other hand, has a putative transit peptide, but the annotated gene codes for a protein twice the length of other RbcX homologs (~290 residues), with only the first 160 amino acids displaying homology to RbcX proteins (Fig 1). The additional sequence in CrRbcX-IIb probably represents an intron, and thus the sequence for CrRbcX-IIb is apparently incorrectly annotated. We focused our analysis on CrRbcX-IIa, which was previously annotated with a putative transit peptide. Based on sequence alignment with the mature form of A. thaliana RbcX-II (also known as AtRbcX1), which begins with Lys46 [19], we cloned CrRbcX-IIa starting at Arg33 (Fig 1), generating a protein of ~17 kDa. CrRbcX-IIa(33-189) was recombinantly expressed and purified from the soluble fraction. Analysis by native-MS showed that CrRbcX-IIa is a dimer in solution, as expected (Fig 2A).
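The stated size of ~17 kDa for the 33-189 construct can be sanity-checked from the residue range alone. This is a rough sketch that assumes an average residue mass of 110 Da, a common rule of thumb rather than a value taken from the text; an exact mass would require the actual sequence.

```python
# Rule-of-thumb average mass per amino acid residue (assumption, not from the source).
AVERAGE_RESIDUE_MASS_DA = 110.0


def residue_count(first, last):
    """Number of residues in an inclusive residue range first..last."""
    if last < first:
        raise ValueError("last residue must not precede first residue")
    return last - first + 1


def approx_mass_kda(first, last):
    """Approximate polypeptide mass in kDa from an inclusive residue range."""
    return residue_count(first, last) * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    n = residue_count(33, 189)
    print(f"CrRbcX-IIa(33-189): {n} residues, ~{approx_mass_kda(33, 189):.1f} kDa")
```

With 157 residues the estimate comes to about 17.3 kDa per chain, in line with the ~17 kDa quoted above.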
Full-length CrRbcX-IIa failed to crystallize. A stable fragment comprising residues 34-156, lacking the flexible C-terminal 33 residues, was produced by subtilisin treatment, as determined by mass spectrometry (MS). An unstructured C-terminus was also found to be present in the cyanobacterial Syn7002-RbcX and was not required for function in Rubisco assembly [12]. We recombinantly expressed and purified the truncated CrRbcX-IIa(34-156) protein for further structural analysis. The structure of the selenomethionine (SeMet)-labeled CrRbcX-IIa(34-156) protein was solved by selenium single-wavelength anomalous dispersion (Se-SAD) at 2.0 Å resolution. The experimental electron density was readily interpretable (Fig 3A). The structural model was built against data to 1.6 Å resolution and refined to final R and Rfree values of 0.177 and 0.206, respectively (see Table 1 for data collection and refinement statistics). The asymmetric unit of the monoclinic unit cell contains four copies of CrRbcX-IIa(34-156) in a two-fold symmetric topology (Fig 3B). Each chain consists of a succession of five α-helices. In three of the subunits the insertion after helix α1, residues 73-77, is disordered. This insertion is typical for RbcX-II sequences from green algae (Fig 1). Apart from the N-terminal 10 residues (see below), the backbones of the CrRbcX-IIa(34-156) subunits are closely similar (r.m.s.d. of Cα positions of 0.267 to 0.577 Å). The subunits form arc-shaped, two-fold symmetric dimers with a hydrophobic cleft in the center (Fig 4A), similar to other known RbcX structures [12,17,18]. In each subunit helices α1-α4 form a four-helix bundle, which associates with helix α5 of the opposing subunit in the dimer (Fig 4A). The N-terminal sequence of one subunit binds into the central cleft, with residues Met34 and Ile36 reaching into hydrophobic pockets located between the anti-parallel helices α1 and α1' at the bottom of the cleft (Fig 4B).
The N-terminal ammonium group of Met34 engages in a tight salt bridge (lengths 2.53 and 2.58 Å) with Asp90 from the opposing dimer, which presumably stabilizes the tetramer arrangement. The other N-terminal peptide inserts into a cleft between neighboring tetramers in the crystal lattice. The dimers in the asymmetric unit interact substantially (1370 Å² of accessible surface area buried on each dimer). Indeed, CrRbcX-IIa(34-156) formed mainly tetramers in solution as detected by native-MS (Fig 2B). However, this interaction is unlikely to be functionally relevant since full-length CrRbcX-IIa behaved as a dimer in solution (Fig 2A).

Comparison with Other RbcX Structures
The crystal structure of the dimer of CrRbcX-IIa(34-156) is closely similar to that of the plant ortholog AtRbcX-II (AtRbcX1) [18] (Fig 5A). 175 Cα positions could be superposed with an r.m.s.d. of 1.239 Å. In contrast, the structure of CrRbcX-IIa(34-156) differs more substantially from the structures of cyanobacterial RbcX and AtRbcX-I. For example, while one four-helix bundle and the associated C-terminal helix from the other subunit of the dimer of AtRbcX-I are reasonably well superimposable with CrRbcX-IIa(34-156) (r.m.s.d. 1.414 Å for 120 matching Cα atom positions), the other helical bundle is markedly shifted (Fig 5B). The situation is closely similar when comparing with the cyanobacterial Anabaena sp. CA RbcX (Ana-CA-RbcX), with an r.m.s.d. of 1.453 Å for 134 matching Cα atom positions (Fig 5C). The rearrangement displaces helices α1 and α1' in the protomers longitudinally, which moves the symmetry-related pairs of hydrophobic pockets apart by ~5 Å. This becomes apparent from comparing the positions of residues Leu57 and Phe62, which line the hydrophobic pockets (spheres in Fig 5). Consequently, a pseudo-symmetrical binding of the FEF motif in the RbcL C-terminal peptide across the dyad axis is not possible in CrRbcX, in contrast to the binding mode of the FEF motif to cyanobacterial RbcX [10,12]. The helices α2 of CrRbcX-IIa(34-156), which form the "walls" of the hydrophobic cleft, are rotated outwards in comparison to cyanobacterial RbcX (Fig 5C), widening the cleft.
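The r.m.s.d. values quoted above are root-mean-square deviations over matched Cα positions after superposition. The sketch below shows only the final statistic for coordinates that are already superposed and paired; it does not perform the least-squares fitting itself (done in this work with Lsqkab and Lsqman), and the coordinates in the example are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, already superposed positions.

    Each argument is a sequence of (x, y, z) tuples in the same units
    (for example Å); the result is in those units.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical Cα coordinates, not taken from any deposited structure.
    model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    model_b = [(0.1, 0.0, 0.0), (3.8, 0.5, 0.0), (7.6, 0.0, 1.0)]
    print(f"r.m.s.d. = {rmsd(model_a, model_b):.3f} Å over {len(model_a)} positions")
```

Because the value depends on which atom pairs are included, an r.m.s.d. is only meaningful together with the number of matched positions, which is why both are reported in the comparisons above.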

Structural Basis of RbcL Peptide Recognition
Attempts to obtain a co-crystal between CrRbcX-IIa(34-159) and the C-terminal RbcL peptide failed, presumably due to low peptide binding affinity. Taking advantage of the finding that the N-terminus of RbcX binds into the central cleft (Fig 4), we therefore generated a fusion construct between CrRbcX-IIa and the C-terminal recognition motif in CrRbcL. In this construct, residues 462-473 of CrRbcL (sequence WKEIKFEFDTID) are directly linked to residue Pro37 at the N-terminus of CrRbcX-IIa(37-156), with the new N-terminus of the fusion protein starting with Trp462 of the RbcL sequence. This fusion protein readily crystallized and the structure was solved at 1.97 Å resolution (Table 1). The structural core of CrRbcX-IIa in the fusion protein is virtually identical to that obtained for CrRbcX-IIa(34-156) (r.m.s.d. 0.425 Å for 211 matching Cα positions). Thus it is unlikely that the contact area with the RbcL peptide is distorted by crystal packing. Difference electron density along the hydrophobic cleft could be assigned to the RbcL residues 462-467 (WKEIKF). Residues 468-473 (EFDTID) of RbcL as well as residues 37-43 of CrRbcX-IIa were disordered (Fig 6A). Notably, Phe469 was among the disordered residues, consistent with the finding that the corresponding Phe464 in Syn7002-RbcL is functionally less important for RbcX binding than Phe462 (Phe467 in CrRbcL) [12]. The sidechains of Ile465 and Phe467 point into hydrophobic pockets surrounded by Phe60/Arg64/Leu67/Leu92 and Leu57/Phe60/Met96, respectively (Fig 6B). The sidechain of Lys463 points towards the C-terminal end of helix α2 and Asp90. The indole moiety of Trp462 interacts with Tyr85 and Met89, but also with a neighboring CrRbcX-IIa molecule (not shown), and thus these interactions seem to be influenced by crystal packing.
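The residue numbering used in this paragraph follows directly from the peptide sequence and its starting position. The short sketch below maps the CrRbcL peptide WKEIKFEFDTID onto residue numbers 462-473 as given in the text; the helper function names are hypothetical.

```python
# CrRbcL C-terminal recognition motif and its first residue number,
# both taken from the text above.
PEPTIDE = "WKEIKFEFDTID"
FIRST_RESIDUE = 462


def numbered(sequence, first):
    """Map residue number -> one-letter code for a contiguous peptide."""
    return {first + offset: aa for offset, aa in enumerate(sequence)}


def segment(numbering, start, end):
    """Return the sequence of the inclusive residue range start..end."""
    return "".join(numbering[i] for i in range(start, end + 1))


if __name__ == "__main__":
    peptide = numbered(PEPTIDE, FIRST_RESIDUE)
    print("ordered    462-467:", segment(peptide, 462, 467))
    print("disordered 468-473:", segment(peptide, 468, 473))
    for position in (462, 463, 465, 467, 469):
        print(position, peptide[position])
```

The mapping places Ile at 465 and Phe at 467 and 469, so the two Phe residues of the FEF motif fall on either side of the ordered/disordered boundary described above.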
Superposition with the structure of the heterologous cyanobacterial Syn6301-RbcL8/Ana-CA-RbcX8 assembly intermediate [10] shows that Ile465 and Phe467 of CrRbcL are recognized by similar sites on CrRbcX-IIa (Fig 6C). The peptide is oriented more towards helix α2 in the cyanobacterial structure, whereas it assumes a deeper and more central position in the hydrophobic cleft of CrRbcX-IIa (Fig 6C). The indole ring of Trp462 is at roughly the same place in the superposition, but the backbone conformations differ strongly at this segment. We note that in the context of the RbcL subunit this residue would be connected, whereas it forms the N-terminal residue in the fusion construct. This difference in sequence topology may influence the binding mode.
The superposed CrRbcX-IIa is compatible with the surface of the RbcL anti-parallel dimer in the context of the RbcL8 core complex (Fig 6D), in a topology similar to that observed for the cyanobacterial RbcX [10]. The C-terminal sequence of one RbcL subunit reaches into the central cleft of CrRbcX-IIa and the functionally critical, conserved residues Gln69 and Arg118 (Fig 1) are positioned correctly for interaction with the second RbcL subunit (Fig 6D). The loop insertion between helices α1 and α2 of CrRbcX-IIa, which is ordered in the structure of the CrRbcL(462-474)-RbcX-IIa(37-156) fusion protein, would extend into a shallow groove of the RbcL dimer surface (Fig 6D). We speculate that this loop insertion found in RbcX sequences of green algae might modulate the interaction with RbcL.

Functional Characterization of CrRbcX
We used the previously reconstituted Rubisco from S. elongatus PCC6301 [9] and the bacterial chaperonin system GroEL/ES to assess the functionality of CrRbcX-IIa in Rubisco assembly. Unfolded RbcL was bound to GroEL upon dilution from denaturant. Assembly was initiated by adding GroES, ATP and RbcX for 60 min at 25°C, followed by addition of RbcS for the Rubisco activity assay. The formation of holoenzyme was dependent on RbcX as shown previously [9], reaching a yield of ~20% with Ana-CA-RbcX (Fig 7A). Addition of the C-terminal RbcL peptide prior to RbcS doubled the yield to ~40% by facilitating the displacement of RbcX from the RbcL8RbcX8 assembly intermediate by RbcS [9,10]. A lower yield of enzyme activity of ~7% was obtained with full-length CrRbcX-IIa(33-189) protein, but only when present at a high molar excess (30 μM dimer) over RbcL. Again the activity doubled in the presence of the C-terminal RbcL peptide (Fig 7A). The mutant CrRbcX-IIa(R118A) did not support assembly, consistent with this conserved residue being involved in the stabilization of the RbcL dimer [9,10,12]. Notably, CrRbcX-IIa(34-189), lacking the N-terminal residue Arg33 of the full-length protein, was inactive (Fig 7A). Arg or Lys is conserved at this position among most RbcX-II homologs (Fig 1). Arg33 is also missing in the C-terminally truncated, crystallized CrRbcX-IIa protein. In the crystal structure, the amino group of the N-terminal Met34 forms a short salt bridge (2.53-2.58 Å) with Asp90 from the other RbcX dimer, which appears to stabilize the tetramer. In addition, Arg33 would clash with the other dimer, consistent with the MS data showing that deletion of Arg33 favors tetramer formation in solution (Fig 2). We suggest that in the absence of Arg33, the N-terminus of RbcX may bind into the central cleft, rendering the protein non-functional in Rubisco assembly (Fig 7A), consistent with the formation of non-functional tetramers (Fig 7B).

Discussion
Our data demonstrate that RbcX-II from the green alga C. reinhardtii functions as a bona fide Rubisco assembly chaperone, despite its considerable evolutionary distance from cyanobacterial and eukaryotic RbcX-I proteins. Like all other known RbcX proteins, CrRbcX-IIa is an arc-shaped dimer with a central hydrophobic cleft that binds the C-terminal sequence of the RbcL subunit. Conserved polar residues at the corners of RbcX make critical contacts to the N-domain of an adjacent RbcL, thereby stabilizing the RbcL anti-parallel dimer in a state competent for assembly to the RbcL8 core complex of Rubisco.
The crystal structure of CrRbcX-IIa differs from the structures of cyanobacterial RbcX homologs in several aspects. The adjacent helices α1, which form the floor of the hydrophobic cleft, are shifted with respect to each other, moving the binding pockets for the two Phe sidechains in the C-terminal RbcL binding motif apart. Consistently, density for the bound peptide sequence is only discernible until the first Phe residue (Phe467) in the complex structure. There are additional hydrophobic cavities between the helices close to the symmetry axis, resulting from the conserved substitution of Thr10 in cyanobacterial RbcX by Ala in RbcX-II (residue 50 in CrRbcX-IIa sequence numbering), but these volumes are not occupied in the complex with peptide. In the apo-structure, the sidechains of the conserved residues Met34 and Ile36 point into these pockets, but the functional significance of this interaction, if any, is unclear. Interestingly, in the structure of the A. thaliana ortholog, which has essentially the same backbone conformation, the pockets are smaller and intra-molecular binding of the N-terminus into the central cleft is not observed.
Besides the RbcX homologs, a recent screen of a maize mutant library identified several additional Rubisco accumulation factors, including Bsd2, Raf1 and Raf2 [38-42]. RbcX and Raf1 are generally conserved in photosynthetic organisms containing form IB Rubisco [2,3], but mediate assembly by different mechanisms [43]. Whether RbcX and Raf1 cooperate in a coherent assembly pathway or act in parallel pathways is still unknown.