A Compact Viral Processing Proteinase/Ubiquitin Hydrolase from the OTU Family

Turnip yellow mosaic virus (TYMV) - a member of the alphavirus-like supergroup of viruses - serves as a model system for positive-stranded RNA virus membrane-bound replication. TYMV encodes a precursor replication polyprotein that is processed by the endoproteolytic activity of its internal cysteine proteinase domain (PRO). We recently reported that PRO is actually a multifunctional enzyme with a specific ubiquitin hydrolase (DUB) activity that contributes to viral infectivity. Here, we report the crystal structure of the 150-residue PRO. Strikingly, PRO displays no homology to other processing proteinases from positive-stranded RNA viruses, including that of alphaviruses. Instead, the closest structural homologs of PRO are DUBs from the Ovarian tumor (OTU) family. In the crystal, one molecule's C-terminus inserts into the catalytic cleft of the next, providing a view of the N-terminal product complex in replication polyprotein processing. This allows us to locate the specificity determinants of PRO for its proteinase substrates. In addition to the catalytic cleft, at the exit of which the active site is unusually pared down and solvent-exposed, a key element in molecular recognition by PRO is a lobe N-terminal to the catalytic domain. Docking models and the activities of PRO and PRO mutants in a deubiquitylating assay suggest that this N-terminal lobe is also likely involved in PRO's DUB function. Our data thus establish that DUBs can evolve to specifically hydrolyze both iso- and endopeptide bonds with different sequences. This is achieved by the use of multiple specificity determinants, as recognition of substrate patches distant from the cleavage sites allows a relaxed specificity of PRO at the sites themselves. Our results thus shed light on how such a compact protein achieves a diversity of key functions in viral genome replication and host-pathogen interaction.

Expression plasmids were transformed into Escherichia coli BL21 Rosetta (DE3). For each construct, an overnight culture was used to inoculate 1 L of LB medium containing 50 µg/L carbenicillin and 25 µg/L chloramphenicol. This culture was grown at 37°C to an optical density (OD600) of 0.6. Expression was induced by the addition of 0.5 mM isopropyl-β-D-thiogalactopyranoside (IPTG) and the cells were grown for 4 hours at 30°C. The cell pellet was harvested, frozen and stored at -20°C.
The disrupted cell lysate was centrifuged at 8000 × g for 30 min and the supernatant was loaded onto a 1 mL Ni²⁺-NTA agarose column (Qiagen) pre-equilibrated with buffer A (100 mM Tris-HCl pH 7.5, 350 mM NaCl, 25 mM imidazole, 1 mM DTT). The column was washed with 50 mL of buffer A, followed by 10 mL of washing buffer A² (100 mM Tris-HCl pH 6.0, 350 mM NaCl, 25 mM imidazole, 1 mM DTT). The protein was eluted with elution buffer B (100 mM Tris-HCl pH 7.5, 350 mM NaCl, 500 mM imidazole, 1 mM DTT). The eluted PRO was further purified on a high-resolution Superdex S-75 gel filtration column (Amersham) in buffer C (10 mM Tris-HCl pH 8, 350 mM ammonium acetate, 1 mM DTT). For DUB activity assays, we pooled only those fractions that were not contaminated by the bacterial protein S15, yielding electrophoretically pure PRO as judged by Coomassie-stained SDS-PAGE (Fig. S6).

Crystallization.
A pool of all fractions from the gel filtration step in buffer C was concentrated to 39 mg/ml as judged by OD280 nm. This sample was thus contaminated with bacterial protein S15 [2]. Showers of needles and hexagonal crystals of up to 50 × 50 × 40 µm³ grew in a single vapor diffusion drop in which 1 µl protein solution plus 1 µl well solution (0.1 M Hepes pH 7.5, 2.5 M ammonium formate) was equilibrated against a 0.5 ml reservoir volume. Similar needles also appeared with pure PRO samples, but the hexagons were obtained only from this contaminated preparation and turned out to contain one PRO and one S15 molecule per asymmetric unit [2]. Prior to testing, crystals were transferred for ~30 s into 0.1 M Hepes pH 7.5, 4 M ammonium formate, 16% glycerol and flash-frozen by plunging into liquid nitrogen. The hexagons diffracted to close to 2 Å resolution on synchrotron beamlines. We note that S15 is a much better Coomassie binder than PRO, as it comprises 12.4% Arg residues (4.4% in PRO) [3]. The S15 contamination was thus far from stoichiometric and amounted to a few percent in mass (see Fig. 1 in [2]), possibly some 10% mol/mol. Since, on the other hand, S15 is less well detected at OD280 nm (calculated absorbance of 0.29 cm⁻¹ (mg/ml)⁻¹) than PRO (calculated absorbance of 0.54 cm⁻¹ (mg/ml)⁻¹), the actual PRO concentration was close to that calculated from OD280 nm without taking S15 into account.
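The concentration argument above can be checked with a short calculation. The sketch below is illustrative only: it uses the two calculated absorbances quoted in the text, while the 5% S15 mass fraction is an assumed stand-in for "a few percent in mass" and the function name is hypothetical.

```python
# Calculated absorbances quoted in the text, in cm^-1 (mg/ml)^-1.
EPS_PRO = 0.54
EPS_S15 = 0.29


def pro_concentration(apparent_mg_ml, s15_mass_fraction):
    """Estimate the true PRO concentration (mg/ml) in a PRO/S15 mixture.

    apparent_mg_ml: concentration derived from OD280 as if the sample
        were pure PRO (39 mg/ml in the text).
    s15_mass_fraction: fraction of total protein mass that is S15
        (an assumed value; the text only says "a few percent").
    """
    if not 0.0 <= s15_mass_fraction < 1.0:
        raise ValueError("s15_mass_fraction must be in [0, 1)")
    f = s15_mass_fraction
    # Absorbance that produced the apparent concentration (1 cm path).
    od280 = apparent_mg_ml * EPS_PRO
    # The mixture absorbs as a mass-weighted average of both proteins.
    total_mg_ml = od280 / ((1.0 - f) * EPS_PRO + f * EPS_S15)
    return (1.0 - f) * total_mg_ml


if __name__ == "__main__":
    # Assumed 5% S15 by mass; prints the estimated PRO concentration.
    print(round(pro_concentration(39.0, 0.05), 2))
```

Under this assumed mass fraction the estimate comes out at about 37.9 mg/ml, within 3% of the apparent 39 mg/ml. This is consistent with the statement that neglecting S15 changes the PRO concentration only slightly.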

Protocol S2
Docking of ubiquitin onto the PRO structure.
Detailed procedure.
2,000 monomeric structures were generated starting from the Ub monomer extracted from the vOTU/Ub structure, maintaining the structure for residues 1-70 and sampling C-terminal tail conformations using the Rosetta 3.4 FloppyTail application [4] with standard parameters. These conformations were clustered at 0.2 Å backbone RMSD. For each of the resulting 98 clusters, a representative conformation was picked.
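As an illustration of the clustering step, the following sketch groups conformations with a simple greedy "leader" scheme at a fixed RMSD threshold. The text does not state which clustering algorithm or tool was used, so this is an assumed stand-in rather than the authors' procedure. RMSD is computed without superposition, on the assumption that residues 1-70 are held fixed so that all conformations already share one frame. Function names are hypothetical.

```python
import numpy as np


def rmsd(a, b):
    """RMSD between two (N, 3) coordinate arrays, without superposition."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff * diff).sum() / len(diff)))


def leader_cluster(conformations, threshold):
    """Greedy clustering at a fixed RMSD threshold.

    Each conformation joins the first cluster whose representative lies
    within `threshold`; otherwise it founds a new cluster and becomes
    that cluster's representative.
    Returns (representative indices, list of member-index lists).
    """
    representatives = []
    members = []
    for idx, coords in enumerate(conformations):
        for c, rep in enumerate(representatives):
            if rmsd(coords, conformations[rep]) <= threshold:
                members[c].append(idx)
                break
        else:
            representatives.append(idx)
            members.append([idx])
    return representatives, members


if __name__ == "__main__":
    # Toy data: three 4-atom "tails"; the second is shifted by 0.1 Å,
    # the third by 1.0 Å relative to the first.
    base = np.zeros((4, 3))
    shifted_small = base + np.array([0.1, 0.0, 0.0])
    shifted_large = base + np.array([1.0, 0.0, 0.0])
    reps, groups = leader_cluster([base, shifted_small, shifted_large], 0.2)
    print(reps, groups)
```

With the 0.2 Å threshold used in the text, the first two toy conformations fall into one cluster and the third founds its own. Picking the founding member as representative is a simplification; a cluster center or lowest-energy member could be chosen instead.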

Other docking simulations.
Other docking simulations were performed in order to test the robustness of the binding mode obtained with the HADDOCK method. In particular, we ran docking simulations without applying any prior restraints between the putative binding regions.
First, 54,000 rigid-body conformations for the PRO/Ub interface were generated using ZDOCK [7]. ZRANK [8] was used to rerank those structures, but no likely cluster of solutions emerged. Scoring these decoys using InterEvScore, a novel scoring function combining a multi-body statistical potential with evolutionary information [9], led to the identification of a well-ranked solution with 1.73 Å interface backbone RMSD to the model chosen from the HADDOCK simulations above.
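Docking models are compared here by interface backbone RMSD. A minimal sketch of such a comparison follows, using the Kabsch algorithm for optimal superposition. It assumes that the caller has already extracted matched backbone atoms of the interface residues from both models as (N, 3) arrays; the exact atom selection used by the authors is not given in the text, and the function name is hypothetical.

```python
import numpy as np


def kabsch_rmsd(p, q):
    """RMSD between matched coordinate sets after optimal superposition.

    p, q: (N, 3) arrays with atoms in the same order (for example the
    backbone atoms of interface residues in two docking models).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape or p.ndim != 2 or p.shape[1] != 3:
        raise ValueError("p and q must both have shape (N, 3)")
    # Remove translation by centering both sets on their centroids.
    p_c = p - p.mean(axis=0)
    q_c = q - q.mean(axis=0)
    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    u, _, vt = np.linalg.svd(p_c.T @ q_c)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    p_rot = p_c @ rot.T
    return float(np.sqrt(((p_rot - q_c) ** 2).sum() / len(p)))


if __name__ == "__main__":
    # Toy check: a rotated and translated copy superposes onto the original.
    pts = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0],
                    [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
    t = np.radians(30.0)
    rz = np.array([[np.cos(t), -np.sin(t), 0.0],
                   [np.sin(t), np.cos(t), 0.0],
                   [0.0, 0.0, 1.0]])
    moved = pts @ rz.T + np.array([5.0, -2.0, 1.0])
    print(kabsch_rmsd(pts, moved))
```

Because the superposition removes rigid-body differences, the value reflects only how the interface geometry itself differs between two models, which is what the 1.73 Å figure above reports.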
We also tested the extent to which the extended shape of the C-terminal tail of Ub constrained the docking solutions. A shortened version of Ub lacking the C-terminal flexible tail (Ub1-71) was docked onto PRO using HADDOCK with a set of ambiguous restraints between the active residues (E825, I847 and F849 on PRO; L8, I44 and V70 on Ub) and the passive residues (positions 758, 759, 760 . Generation of 1,000 conformations through rigid-body docking was followed by refinement of the 200 best structures, which were then clustered. The largest cluster had an orientation of Ub precluding the insertion of the C-terminal tail into PRO's catalytic cleft. In contrast, the second largest cluster is compatible with such an insertion. The interface with the best HADDOCK score belongs to this second largest cluster and has a low interface backbone RMSD of 2.02 Å compared to the model chosen in the HADDOCK run with full-length Ub. This interface also involves interactions between the N-terminal lobe and Ile847 on PRO and the apolar patch on Ub, supporting the selection of the model presented in the main text. Taken together, these results indicate a common most likely binding mode for the PRO/Ub interface, which was further confirmed by site-directed mutagenesis experiments on Ile847, E759/N760 and L732/L765.

Four successive asymmetric units along the 3₁ crystallographic screw axis are shown as Cα traces. S15 molecules are in cyan and successive PRO molecules are red, green, yellow and magenta, with the C-terminal serine 879 displayed as spheres. In the crystal, the parallel PRO helices are connected only through S15 dimers (not shown).

Figure S3: Two cis-prolines upstream of the catalytic histidine.
Final 2Fo-Fc electron density map for the two cis-prolines 865-Gly-Pro-Pro-867 directly upstream of strand β6. Also visible and labeled is the catalytic dyad Cys783-His869. The map is displayed at 3 sigma contour as a dark gray mesh. Superimposed is the final model (PDB 4a5u) as sticks with two PRO molecules displayed, the peptidase in magenta and the substrate in green. The catalytic dyad of the peptidase and the C-terminal serine of the substrate are labeled.

HADDOCK scores of the 500 refined structures docked using the HADDOCK software are plotted against their interface backbone root mean square deviation (RMSD in Å) compared to the complex with the best HADDOCK score. The model selected and represented in Fig. 4CD is the complex with the lowest HADDOCK score (red filled dot), corresponding to the largest cluster and in good agreement with other docking methods (see Protocol S2).

The model is displayed and colored as in Fig. 4C. The three residues not conserved between human and plant are drawn as spheres.