Rhabdovirus Matrix Protein Structures Reveal a Novel Mode of Self-Association

The matrix (M) proteins of rhabdoviruses are multifunctional proteins essential for virus maturation and budding that also regulate the expression of viral and host proteins. We have solved the structures of M from the vesicular stomatitis virus serotype New Jersey (genus: Vesiculovirus) and from Lagos bat virus (genus: Lyssavirus), revealing that both share a common fold despite sharing no identifiable sequence homology. Strikingly, in both structures a stretch of residues from the otherwise-disordered N terminus of a crystallographically adjacent molecule is observed binding to a hydrophobic cavity on the surface of the protein, thereby forming non-covalent linear polymers of M in the crystals. While the overall topology of the interaction is conserved between the two structures, the molecular details of the interactions are completely different. The observed interactions provide a compelling model for the flexible self-assembly of the matrix protein during virion morphogenesis and may also modulate interactions with host proteins.


Introduction
Rhabdoviruses are single-stranded RNA viruses that possess non-segmented negative-sense genomes encoding five open reading frames and form enveloped, bullet-shaped virions [1]. Dimarhabdoviruses, the supergroup of rhabdoviruses that infect mammals and mosquitoes [2], are of considerable economic and social importance. Members of the genus Lyssavirus such as rabies virus cause lethal meningoencephalitis in humans and animals [3] while vesicular stomatitis virus (VSV; genus Vesiculovirus) causes symptoms clinically identical to those of foot-and-mouth disease in cattle and occasional, limited infections in humans [1].
The rhabdovirus matrix (M) protein is small (~20-25 kDa) but plays a number of roles during the replication cycle of the virus. The M protein is an important structural component of rhabdovirus virions, forming a layer between the glycoprotein-(G-) containing outer membrane and the nucleocapsid core composed of the virus nucleoprotein (N), polymerase (L), phosphoprotein (P) and RNA genome [4][5][6]. M condenses the nucleocapsid core into the 'skeletons' seen in mature virions [7,8], and recent evidence suggests that it does so in association with preformed nucleocapsid-G plasma membrane microdomains [9]. M aggregates in vitro [10][11][12] to form long fibers [11]; the N terminus of the protein and residues 121-124 are important for this self-association [11,13]. In addition to being spread through the cytosol of infected cells, M is targeted to mitochondria [14,15], to nuclei [16], and to plasma membranes [9,17]. M has been shown to interact directly with negatively charged membranes [17][18][19] and can induce their deformation [20]. This interaction is mediated primarily by the N terminus of M, which contains several positively charged amino acid residues, although in VSV residues 121-124 may also be involved [19][20][21][22].
In addition to its structural roles, M has been implicated in controlling the balance between transcription and replication of the viral genome [21,23], in promoting budding [24,25], and in modulating host-cell transcription [26], translation [27] and apoptosis [28][29][30][31]. The specific protein:protein interactions that mediate a number of these functions have been identified. A 'late domain' (sequence PPXY) located toward the N terminus of M promotes budding by interacting with the WW-domain of NEDD4, a ubiquitin ligase that interacts with the vesicle formation and cargo sorting ESCRT complexes, although the precise mechanism by which this would facilitate budding remains unclear [24,25,32,33]. VSV M has been shown to inhibit the production of host proteins by binding directly to Rae1 and blocking the export of host mRNA from the nucleus, residues near the N terminus of M being essential for this interaction [27,34]. M also regulates the translation of host mRNA by binding to and/or modulating the phosphorylation state of translation initiation factors [26,35,36], as well as inducing the production of viral proteins through an unknown mechanism [21].
To date, the only structural information available on rhabdovirus M proteins is the structure of a thermolysin-stable M core (M th) of VSV Indiana (VSV Ind) [19]. The proteolytic treatment removed the N-terminal 47 residues and cleaved the surface-exposed hydrophobic loop between residues 121-124, and only residues 58-121 and 128-227 were visible in the structure. To further investigate the role of the N terminus and hydrophobic surface-exposed loop and to investigate the structural conservation of M across Rhabdoviridae, we solved the structures of full-length M from VSV serotype New Jersey (VSV NJ) and from the lyssavirus Lagos bat virus (LBV). These structures reveal that rhabdovirus M proteins share a similar overall fold and self-associate via a stretch of amino acids (in the otherwise-disordered N terminus) that binds to a similar region on the globular domain, although the molecular details of the interaction interface differ dramatically between VSV NJ and LBV M. This inter-molecular interaction provides a plausible mechanism for the self-assembly of M, leading to enhanced affinity for membranes. Further, the differences in this interaction provide a structural framework for understanding the distinct cytopathic effects of vesiculoviruses and lyssaviruses.

Structure of M from VSV serotype New Jersey (VSV NJ )
The structure of VSV NJ M was solved by SeMet single-wavelength anomalous dispersion phasing and refined to 1.83 Å resolution with residuals R/Rfree = 0.157/0.179 (Table 1). VSV NJ M forms a globular domain with a central α helix sandwiched on one side by an extensive 5-stranded β sheet and on the other by two α helices and a smaller two-stranded β sheet (Figure 1A) that is very similar to the thermolysin-resistant core of M (M th) from VSV serotype Indiana (VSV Ind), with 0.9 Å root-mean-squared displacement (rmsd) over 156 Cα atoms. Residues 121-128, disordered in the VSV Ind M th structure, are observed in the structure of VSV NJ M with residues 121-124 forming a short stretch of α helix (α2.5, Figure 1A). The N-terminal 57 residues of VSV NJ M do not form part of this globular domain. While residues 1-40 and 53-57 are not well ordered and could not be modeled in electron density, residues 41-52 are located in strong electron density bound in a deep hydrophobic pocket formed by the loops between sheet β1 and helix α1, the region between helices α2 and α2.5 (including sheet β2), and the stretch of 3-10 helix immediately preceding helix α3.
F46 is central to the interaction: the side chain of F46 sits deep in the hydrophobic pocket lined by the side chains of residues Y81, V84, L116, Y131, Y197 and the backbone between residues 114 and 116 (Figure 2). The backbone of F46 forms hydrogen bonds with the backbone of F78 and with a water molecule that bridges the backbone of F78 and side chain of Y131. Residues 45-51 of the bound ligand interact with residues flanking the deep pocket into which the side chain of F46 binds. F45 sits in a shallow hydrophobic pocket lined by the hydrophobic side of the backbone peptide planes between residues 77-79 and by the side chains of P77 and R79. The backbone of G47 forms hydrogen bonds with the backbone of A118 and with a solvent atom that bridges the backbones of residues G47, A118 and M51. M48 sits in a shallow groove formed by the side chains of P77, A118, V122 and Y131. E49 forms a hydrogen bond with the side chain of Q117 and with a water molecule that also hydrogen bonds with the side chain of Y197. The backbone of D50 forms a hydrogen bond with the side chain of Q117. M51 sits in a shallow hydrophobic pocket formed by the hydrophobic side chains of A118, P120, V122, L123 and the hydrophobic peptide plane between residues 118 and 120. Overall, these interactions bury 1050 Å² of surface area. The colocalization of strong anomalous scattering with the two SeMet residues (48 and 51) allowed unambiguous identification of the bound peptide (Figure S1A) and SDS-PAGE analysis confirmed that the crystallized VSV NJ M was intact (Figure S2A).
Residue 52 of the bound peptide is 46 Å from the first ordered residue of the globular domain to which it binds (S58), a distance too great to be spanned by the missing 5 residues. Distance constraints dictate that this bound peptide derives from an adjacent molecule in the crystal, related by the crystallographic symmetry operator [−x−½, −y−½, z−½], with residue 58 lying 11 Å from residue 52 of the bound peptide (Cα-Cα distance). While electron density linking residue 52 to residue 58 of the adjacent monomer is observed in maps calculated using data to 4 Å resolution, these residues could not be modeled because in higher resolution maps this density is significantly reduced, presumably due to disorder. The interaction of the globular domain with the N-terminal peptide of an adjacent molecule in the crystal gives rise to non-covalently linked linear polymers of VSV NJ M monomers (Figure 3A).
Aside from the bound N terminus, the most striking structural difference between VSV NJ M and VSV Ind M th in the globular domain is in the orientation of residues 191-202, which form part of helix α3 and of the loop that precedes it (Figure 4). In VSV NJ M this loop has shifted toward the interacting N-terminal residues: the Cα atom of K196 moves 9.5 Å from its position in VSV Ind M th and residues 195-200 form a stretch of 3-10 helix. Y197 on this helix forms part of the deep hydrophobic pocket in which F46 resides, in addition to forming a water-mediated hydrogen bond with the side chain of E49 (Figure 4).
Alignment of vesiculovirus M sequences (Figure 5) shows that F46 and the residues that form the deep binding pocket in which it sits are highly conserved amongst VSV serotypes, although there are some conservative substitutions of flanking residues. In viruses Isfahan, Piry, Alagoa, Cocal and Chandipura the core F46 residue of the N-terminal interacting motif is conserved, but the flanking residues are significantly changed. However, side chains that form the hydrophobic pocket in which F46 is buried are conserved (Y131, F78) or conservatively substituted (Y81F/C, V84A, L116M). In spring viremia of carp virus, a dimarhabdovirus [2] not assigned to the vesiculovirus genus that infects fish rather than mammals, residue F46 is not conserved and it is unclear whether M from this virus would be able to self-associate in a manner similar to that observed for VSV NJ M.

Author Summary
Rhabdoviruses are of considerable socioeconomic importance. For example, rabies virus causes lethal encephalitis resulting in approximately 50,000 human deaths per year. Rhabdoviruses infect cells and propagate despite having small genomes that encode only five multifunctional proteins. One of these, the matrix protein, plays a structural role in virus assembly in addition to modulating the production of host and virus proteins, promoting viral egress from the host cell and modulating cell death. We have solved the 3-dimensional crystal structures of matrix proteins from two distantly related rhabdoviruses: Lagos bat virus and vesicular stomatitis virus. The two proteins have very similar structures despite having dissimilar amino acid sequences. Surprisingly, for both we observe self-association between a pocket on the main globular domain and one extremity of an adjacent molecule in the crystal. Repetition of this interaction gives rise to noncovalent polymers of matrix proteins, adjacent proteins being tethered by a flexible linker. This provides a compelling molecular mechanism for the self-association of matrix molecules required for virus assembly. While the general mode of polymerization is conserved between the two structures, the precise molecular details of the interactions differ, consistent with these matrix proteins binding different cellular factors during infection.

Structure of M from Lagos bat virus (LBV)
The structure of LBV M was solved by SeMet multi-wavelength anomalous dispersion phasing and refined to 2.75 Å resolution with residuals R/Rfree = 0.207/0.255 (Table 1). Residues 48-202 of LBV M form a globular domain with an overall fold that closely resembles those of VSV Ind M th and VSV NJ M, with 2.8 and 3.1 Å root-mean-squared deviation between 138 and 139 equivalent Cα positions, respectively, despite the aligned residues sharing less than 10% sequence identity (Figure 1). While the central α helix and back 5-stranded β sheet overlay very well, the loop between sheets β2 and β3 is much shorter in LBV M than in VSV NJ M and no stretch of helix is present between these sheets. Helices α1 and α3, sheets β6 and β7 and the loop between sheets β4 and β5 are also shifted between the LBV M and VSV NJ M structures.
Strikingly, in the structure of LBV M a stretch of peptide is again observed bound in a shallow hydrophobic groove formed by the β1-α1 and β2-β3 loops of the globular domain (Figure 1B). Anomalous difference density co-located with the Se atom of SeMet33 in the SeMet-labelled protein unambiguously identifies the bound peptide as LBV M residues 30-37 (Figure S1B), no electron density being evident for residues 1-29 or 38-47. This interaction is centred on residues 32-36, which form a short stretch of left-handed polyproline-II helix (Figure 2B). The backbone of residues 33-35 packs against the backbone of the β1-α1 loop of the globular domain, M33 forming two hydrogen bonds with Y67. W112 in the β2-β3 loop sits under residues 33-35, its indole nitrogen forming a hydrogen bond with the carbonyl oxygen of P34 (Figure 6). The side chain of P36 sits in a hydrophobic pocket formed by the P107 and W112 side chains and the backbone of M110 and N111 (Figure 2B). Overall, these interactions bury 830 Å² of surface area. SDS-PAGE confirmed that the crystals contained full-length LBV M (Figure S2B) and gel filtration analyses of full-length and truncated (Δ1-45) LBV M were consistent with the N-terminal portion of M adopting an extended/disordered conformation in solution (Figure S3), as has been observed previously for VSV M [13]. As in VSV NJ, the distance between the last residue of the bound peptide and the first residue of the LBV M globular domain is too great to be spanned by the intervening residues (49 Å from E37 Cα to E48 Cα), and the bound peptide must come from a neighbouring monomer. The interaction with the most likely monomer, related by the crystallographic symmetry operator [1+x−y, 1−y, 1−z] (23 Å from E37 Cα to E48 Cα; Figure S4), generates linear polymers of non-covalently linked molecules in the crystal (Figure 3B).
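The distance argument used for both structures can be checked with simple arithmetic. The sketch below assumes an upper bound of about 3.8 Å between consecutive Cα atoms in a fully extended chain (a standard geometric value, not a figure from this work) and applies it to the residue numbers and distances quoted in the text; the function names are illustrative only.

```python
# Approximate C-alpha to C-alpha distance between consecutive residues joined
# by a trans peptide bond. ASSUMPTION: standard textbook value, not from the paper.
CA_CA_MAX = 3.8  # angstroms


def max_span(first_res: int, last_res: int) -> float:
    """Upper bound on the C-alpha distance between two residues of one chain."""
    return (last_res - first_res) * CA_CA_MAX


def can_span(first_res: int, last_res: int, distance: float) -> bool:
    """True if a fully extended chain could bridge the given distance."""
    return distance <= max_span(first_res, last_res)


# (label, last ordered peptide residue, first globular residue, distance in angstroms)
cases = [
    ("VSV NJ M, same molecule", 52, 58, 46.0),
    ("VSV NJ M, symmetry mate", 52, 58, 11.0),
    ("LBV M, same molecule", 37, 48, 49.0),
    ("LBV M, symmetry mate", 37, 48, 23.0),
]

for label, first, last, dist in cases:
    verdict = "possible" if can_span(first, last, dist) else "too far"
    print(f"{label}: {dist:.0f} A vs max {max_span(first, last):.1f} A -> {verdict}")
```

Under this assumption the linker can reach only the symmetry-related globular domain in each crystal, which is the same conclusion drawn above from the measured distances.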
The sequence of the interacting N-terminal region of LBV M is conserved in lyssaviruses, the only exception being the conservative substitution of M33 with leucine (Figure 7). The residues of the globular domain to which they bind are also conserved, with just two non-disruptive exceptions (M110L in strain SADB-19 and N111S in strain ZAMRAV51, Figure 7).

Discussion
The M proteins of VSV NJ and LBV self-associate in a similar but distinct manner
In the structures of both VSV NJ and LBV M, a peptide from the otherwise-disordered N terminus of the protein is observed bound to the globular domain near the β1-α1 and β2-β3 loops (Figures 1A & 1B). The overall location of this interacting region in the sequence of the proteins is similar, the interacting residues being less than 20 residues from the start of the globular domain (Figure 1C), and in both the interaction is between adjacent molecules, thereby forming non-covalently linked linear polymers of M in crystallo (Figure 3). It is therefore particularly striking that the interfaces formed by the N-terminal interacting residues and the globular domains differ so significantly (Figure 2). The similar overall nature of the self-association, despite large differences in the molecular details of the interaction interfaces, is compelling evidence that the self-interaction is biologically relevant rather than being an artefact of crystallization.
The self-association of VSV NJ M is centered on F46, which binds into a deep hydrophobic pocket on the surface of the globular domain. Residues 45-47 adopt an extended conformation, flanked on either side by single turns of α helix. To the best of our knowledge this self-association interface shares no homology with previously identified protein:protein interaction interfaces. In contrast, the peptide recognition cleft of LBV M is quite shallow and residues 33-36 (sequence MPPP), which bind into this hydrophobic pocket on LBV M, form a short stretch of polyproline-II helix. The recognition of polyproline-II helices formed by proline-rich motifs (PRMs) is an important theme in protein:protein interactions. Six classes of PRM-binding domains have previously been described: SH3, WW, EVH1, UEV and GYF domains and profilins [37]. These are generally characterized by the presence of a central tryptophan residue, around which the polyproline helix wraps, with shallow hydrophobic pockets accommodating the proline side chains [37]. LBV M exhibits this generic mode of binding (Figure 6), but details of the interaction differ from the known classes of PRM interactions. The structure of LBV M therefore reveals both a novel 7th family of PRM-

The role of M self-association in rhabdovirus assembly
The M proteins of rhabdoviruses play important roles in virus assembly. They condense the nucleocapsid cores into a tightly coiled nucleocapsid-M complex (termed 'skeletons') [7,8], form a layer between the nucleocapsid and the surrounding lipid bilayer [4][5][6], and promote virus budding [6,24,38-40]. An obvious functional implication of the self-association in the structures of VSV NJ and LBV M is in virus assembly, by facilitating long-range organization of M molecules and thereby enhancing the local concentration of M.
Experiments investigating the self-association of VSV M support this hypothesis. M assembly is a two-stage process involving the sequential addition of M monomers to small pre-formed M nuclei to form fibres [10,11]. The N-terminal portion of M plays a critical role in the second step, polymerisation [11,13]. Treatment of M with trypsin gives rise to a stable fragment (M t) spanning residues 44-229 [41] that retains the ability to form fibres but can only nucleate M aggregation in the presence of added Zn²⁺ [10,11]. However, M treated with thermolysin (M th), comprising residues 48-121 and (122, 123 or 124)-229, does not aggregate [13,19]. This is consistent with removal in M th of F46, the residue central to the interaction, and with the 'untethering' of the β2-β3 loop, which forms part of the interaction surface on the globular domain (Figures 2 & S5). We propose that the association between the N-terminal portion of M and the globular domain of an adjacent M molecule observed in our structures is the same as that which promotes addition of M monomers to pre-formed nuclei to yield large M fibers in vitro [11] and presumably promotes virus assembly in vivo. While it is possible that the movement of the loop that links sheet β5 to helix α3 in VSV NJ M relative to VSV Ind M (Figure 4) represents a conformational switch that facilitates nucleation or polymerization, the mechanism by which such a switch would be induced remains unclear.
Mutational analysis of the β2-β3 loop supports a role for the observed interaction between the N-terminal portion of M and the globular domain of an adjacent molecule in virus morphogenesis. Substitution of VSV Ind residues 121-124 (sequence AVLA) with DKQQ gives rise to a mutant M protein that shows reduced capacity to recruit free M or M DKQQ into pre-formed nucleocapsid-M DKQQ complexes, although the general ability to self-associate is maintained [21]. Mapping this mutation onto the structure of VSV NJ M (Figure S5) reveals that the mutated residues surround the site of interaction of the N-terminal peptide, but they do not interfere with the burial of F46 into the deep hydrophobic pocket and would thus presumably not abolish the interaction entirely. A double mutation of M48 and M51 to arginine in VSV NJ yields a virus phenotype competent for assembly but unable to inhibit host-cell gene expression [42][43][44]. The side chain of M48 is not required for self-association, since in VSV Ind it is replaced with valine (Figure 5). Further, while both M48 and M51 form part of the VSV NJ self-association interface, neither is completely buried (Figures 2 & S5) and the observed interaction could most likely be maintained in the mutated M protein with little energetic penalty.
Mutation of residues 35 and 36 in the N-terminal interacting region of lyssavirus M to serine and alanine, respectively, reduces viral fitness [45]. This is consistent with our proposed model, although further experiments are required to distinguish mutations that modulate self-association from those which interrupt the interaction of the PPXY 'late domain' with the host-cell budding machinery.
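The overlap between the self-association motif and the late domain can be made explicit with a short motif search. The sketch below assumes the six-residue fragment MPPPEY for LBV M residues 33-38, which is what the numbering given in the text implies (MPPP at 33-36, PPEY at 35-38); the regular expressions and function name are illustrative and are not part of the original analysis.

```python
import re

# ASSUMPTION: residues 33-38 of LBV M inferred from the numbering in the text
# (MPPP at 33-36 overlapping PPEY at 35-38), not read from a sequence file.
FRAGMENT = "MPPPEY"
FRAGMENT_START = 33  # residue number of the first letter of FRAGMENT

MOTIFS = {
    "proline-rich motif [M/L]PPP": r"[ML]PPP",
    "late domain PPXY": r"PP.Y",
}


def find_motif(pattern: str, seq: str, start: int) -> list:
    """Return (first_residue, last_residue) for every match, overlaps included."""
    hits = []
    # A zero-width lookahead lets matches that share residues all be reported.
    for m in re.finditer(f"(?=({pattern}))", seq):
        begin = start + m.start(1)
        hits.append((begin, begin + len(m.group(1)) - 1))
    return hits


prm_hits = find_motif(MOTIFS["proline-rich motif [M/L]PPP"], FRAGMENT, FRAGMENT_START)
late_hits = find_motif(MOTIFS["late domain PPXY"], FRAGMENT, FRAGMENT_START)

# Residues that belong to both motifs.
shared = sorted(
    set(range(prm_hits[0][0], prm_hits[0][1] + 1))
    & set(range(late_hits[0][0], late_hits[0][1] + 1))
)
print(prm_hits, late_hits, shared)
```

With this fragment the two motifs share residues 35 and 36, the positions targeted by the mutations discussed above, which is why those mutations alone cannot separate effects on self-association from effects on late-domain function.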
Assembly of rhabdoviruses may require the M protein to interact with a circular or helical scaffold formed by the nucleocapsid [46][47][48][49]. As the non-covalent polymers of M observed in the crystal lattices are straight, the relative orientations of adjacent globular domains observed in the crystals might not reflect the packing of globular domains in the final assembled virion. However, the flexible nature of the tether that links one globular domain to the next, via interaction with the flexible N-terminal segment, could accommodate major rearrangements of adjacent globular domains. This would allow the required curvature of the M polymers and facilitate reorientation of M to interact with other components of the virion. Based on the dimensions of 'shaved' VSV virus particles and number of M molecules in such particles [50], assuming a model where M lies immediately below the plasma membrane, the mean distance between the centres of adjacent M proteins would be ~45 Å. This is significantly larger than the M-to-M distances observed in the non-covalent linear polymers formed within the crystals (35 Å for VSV NJ and 28 Å for LBV, Figure 3) and is consistent with a loose tethering of M proteins in the assembled virions. Such flexibility would allow for higher concentrations of M at points of higher membrane curvature, as has been observed recently for VSV M [9]. A similar beads-on-a-string arrangement in virions has been postulated for the influenza virus matrix protein [51].

Self-association enhances the affinity of VSV M for membranes
The self-association of M identified in our structure informs previous experiments on the association of VSV M with membranes. M associates with membranes both in vitro [18][19][20] and in vivo [9,22]. M is thought to link the nucleocapsid and the envelope of the virus [4][5][6], although recent evidence suggests that M might actually be recruited to pre-formed nucleocapsid-G plasma membrane microdomains [9]. M t, in which the N-terminal 43 residues are removed by trypsin proteolytic cleavage, maintains its ability to interact with membranes [19,20]. The interaction is significantly weaker than for wild-type M, confirming that the positive charge of the lysine-rich N terminus is important for membrane association, but the fact that some association is maintained suggests the presence of other, potentially weaker membrane-interaction interfaces on M [22]. In contrast, M th is almost entirely unable to interact with membranes [19,20]. M th has only 4 residues fewer at the N terminus than M t [13]. Since none of these are positively charged, the decrease in affinity cannot be due to a loss of charge-mediated affinity for membranes. Furthermore, substitution of the hydrophobic side chains in the surface loop cleaved by thermolysin (residues 121-124) for charged residues does not abolish membrane association [21], indicating that an additional membrane-attachment interface has not been excised by the thermolysin treatment. We propose that the observed decrease in membrane affinity arises instead from the inability of M th to self-associate. Polymerisation at the membrane, mediated by the self-association observed in our structure, would provide avidity enhancement of binding, thus overcoming the lack of N-terminal charge in M t.

Interaction of M with cellular proteins
In addition to their role in virus assembly and budding, rhabdovirus M proteins are important for subverting the host immune response by suppressing the expression of host genes. It has previously been observed that VSV M blocks host gene translation by binding directly and specifically to Rae1, a protein involved in nuclear export of mRNA [34]. Substitution in M of residues 52-54 with alanine completely abolishes this interaction [27,34]. A second substitution in this area, M51R, has the same effect, although a direct loss of interaction with Rae1 has not been shown for that mutant [52]. Our structure reveals that the Rae1 binding site on VSV M partially overlaps with the N-terminal self-association motif (Figure 1). Steric considerations make it likely that self-association of VSV M and Rae1 binding would be mutually exclusive.
A direct interaction between a cellular protein and a sequence overlapping the N-terminal portion of M involved in self-association has also been observed in lyssaviruses. In this case the interaction is between the 'late domain' (sequence PPEY, residues 35-38) and the WW domain of NEDD4 [24], a ubiquitin ligase that interacts with proteins in the ESCRT pathway and promotes virus budding (Figure 1) [25]. The polyproline-II helix conformation adopted by residues 32-36 of LBV M is entirely compatible with binding of this 'late domain' to the NEDD4 WW domain. As above, steric clashes would prevent simultaneous interaction of these residues with WW domains and with the globular domain of LBV M. Since PRMs and their binding motifs are such a common theme in cellular protein:protein interactions it is likely that both the PRM motif and PRM-binding groove of LBV M also mediate specific interactions with other cellular proteins, although such binding partners are yet to be identified. Such an interaction between the self-association pocket on the globular domain of M and an unknown cellular protein has recently been identified for VSV. While mutation of the VSV Ind M β2-β3 loop residues 121-124 to DKQQ (M DKQQ) interferes only modestly with the self-association of M (see above), it produces a marked reduction in the amount of viral mRNA translated in infected cells [21]. As this phenotype can be rescued by coinfection with wild-type VSV it probably arises from a loss of function rather than a gain of inhibition. This phenotype was mapped specifically to the β2-β3 loop, reversion of residues 121-122 to the wild-type sequence (AV) restoring wild-type levels of viral mRNA translation [21]. It is likely that the yet-unidentified factor required to promote efficient viral mRNA translation binds in a manner similar to that observed for the N-terminal segment in our structure. As discussed above, V122 forms part of the binding pocket for M48 (V48 in VSV Ind).
While mutation of V122 to lysine does not significantly impair self-association of VSV Ind, it is possible that the effect would be greater on ligands with a larger hydrophobic residue in a position equivalent to residue 48. It is equally likely that V122 would be more buried in an interaction with this (unknown) cellular partner, reducing its ability to 'swing away' from the binding cleft.
To summarise, both VSV and lyssavirus M have known cellular interaction motifs that overlap with the self-interaction motifs revealed by the structures presented here. However, the molecular details of the self-interaction motifs differ significantly and it is likely that the cellular binding partners of VSV and LBV M proteins are also distinct, consistent with the different host-range and cytopathogenicity of vesiculoviruses and lyssaviruses. The region between the β1-α1 and β2-β3 loops and the fragments of the otherwise-disordered N-terminal tails to which they bind are clearly hot-spots for rhabdovirus M protein:protein interactions. This suggests a tempting evolutionary hypothesis to explain the similarity in overall interaction topology but difference in molecular interfaces between vesiculoviruses and lyssaviruses. The self-association grooves on the globular surfaces of the proteins and their cognate ligands might have evolved to mimic desirable protein:protein interactions within the host cells, the functional constraint imposed by needing to remain competent for self-assembly (and thus enable viral morphogenesis) having maintained the overall topology of the interaction. The maintenance of interaction with cellular partners and self-association at the same locus on the protein raises a second interesting possibility, that the observed self-associations also play a role in regulating interaction of M with cellular partners by (partially) sequestering the binding interfaces, as has been suggested for the HIV M protein [53]. This would provide a raison d'être for the shortened M gene product observed in VSV (residues 51-229) [43]; it possesses the deep hydrophobic peptide-binding groove but not the N-terminal peptide, thereby providing the virus with a pool of M protein (with unoccupied binding grooves on their globular surfaces) able to interact with cellular binding partners.

Cloning, expression, purification, crystallization and data collection
The matrix (M) protein from Lagos bat virus (LBV) was cloned, expressed and purified, and diffraction data were collected, as described previously [54]. M from VSV serotype New Jersey (VSV NJ) was cloned into pOPINS, encoding an N-terminal His6-SUMO fusion tag, and selenomethionine-labelled (SeMet) protein was expressed and purified as described for LBV M [54]. Purified VSV NJ M was concentrated to 1.2 mg/mL and crystallization trials were attempted at 20.5 °C in sitting drops containing 100 nL protein and 100 nL precipitant solution equilibrated against 95 µL reservoirs in 96-well plates. Crystals of SeMet VSV NJ M grew in 20% v/v isopropanol, 20% w/v PEG 4000 and 0.1 M sodium citrate (pH 5.6) and were cryoprotected by a quick pass through reservoir solution supplemented with 20% v/v glycerol before flash cryocooling in a cold (100 K) stream of nitrogen gas. Diffraction data were recorded from a single crystal of SeMet VSV NJ M at a wavelength of 0.9803 Å, to maximize the selenium anomalous signal, on ESRF beamline ID23EH1. Diffraction data were processed using XDS [55] and SCALA [56] as implemented by the xia2 automated data processing package (Winter et al., in preparation).
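As a quick sanity check on the data collection wavelength, the conversion from wavelength to photon energy can be sketched as below. The constant hc ≈ 12.39842 keV·Å and the tabulated selenium K absorption edge (about 12.658 keV) are standard reference values assumed here, not figures reported in this work, and the function name is illustrative.

```python
# ASSUMPTIONS: standard physical constants, not values taken from the paper.
HC_KEV_ANGSTROM = 12.39842  # Planck constant times speed of light, in keV * angstrom
SE_K_EDGE_KEV = 12.658      # tabulated selenium K absorption edge, in keV


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in angstroms to photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelength quoted in the text for the SeMet VSV NJ M data set.
energy = wavelength_to_energy_kev(0.9803)
offset_ev = (energy - SE_K_EDGE_KEV) * 1000.0
print(f"{energy:.4f} keV ({offset_ev:+.1f} eV relative to the tabulated Se K edge)")
```

The exact position of the absorption peak depends on the sample and on beamline calibration, so the offset from the tabulated edge is indicative only.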

Structure solution and refinement
The structures of VSV NJ and LBV M were solved at 1.83 Å and 3.0 Å resolution by single- and multiple-wavelength anomalous dispersion analysis using the diffraction data described above and elsewhere [54], respectively. For both, selenium atoms were located and their positions refined using SHELXD [57] and SHARP [58] as implemented by autoSHARP [59]. For VSV NJ M, electron density maps were solvent-flattened using SOLOMON [60] and DM [61]. The structure was traced by manually placing VSV Ind M th (PDB ID 1LG7; [19]) into electron density and subsequently rebuilding in COOT [62]. For LBV M, the experimental map was solvent-flattened and an initial partial model traced using cycles of automatic building in RESOLVE and restrained refinement in REFMAC5 [63][64][65] followed by manual rebuilding in COOT. The initial LBV M model was placed into the high-resolution (2.75 Å) native data by rigid-body refinement in REFMAC5. Final TLS+restrained refinement of both structures was performed in REFMAC5 [66], earlier refinement of LBV M having been performed using BUSTER/TNT [67]. The MolProbity server [68] and the validation tools present in COOT informed the refinement of both structures. Refinement statistics are shown in Table 1 and final refined coordinates and structure factors have been deposited with the PDB with accession IDs 2w2r (VSV NJ M) and 2w2s (LBV M).

Figure 7 legend (partial): GenBank accession IDs for all sequences are shown and the sequence used for structure determination is in bold typeface. Residues that are highly or moderately conserved (BLOSUM62 scoring) are colored marine and light blue, respectively. The [M/L]PPP proline-rich motif is colored purple and residues that interact with this motif are in purple typeface. Secondary structure is shown above the sequences (β-sheets, α-helices and polyproline-II helices are shown as arrows, cylinders and triangular prisms, respectively). doi:10.1371/journal.ppat.1000251.g007

Structure analysis
Superpositions and structure-based alignment of VSV and LBV M were performed using SSM [69] and MUSTANG [70]. Vesiculovirus and lyssavirus M protein sequences were aligned using MUSCLE [71] and sequence alignment figures were produced with the assistance of JalView [72] and Inkscape (http://www.inkscape.org). Molecular graphics were produced using PyMOL (DeLano Scientific LLC). Crystals were prepared for SDS-PAGE analysis by removing all mother liquor surrounding the crystals using a fine paper wick (Hampton Research), washing the crystals in situ with 0.3 µL reservoir solution, wicking away the reservoir solution, and then dissolving the washed crystals in 0.6 µL 8 M urea. Dissolved crystals were diluted to 5 µL in ultra-pure water, 2 µL of 4× SDS-PAGE loading buffer (Invitrogen) was added, and the sample was heated to 95 °C for 5 min before being loaded onto a 10% w/v polyacrylamide NuPAGE gel that was run in MES buffer according to the manufacturer's instructions (Invitrogen). Protein bands were visualized with SafeStain (Invitrogen).
[...] is consistent with its mass, full-length LBV M elutes considerably later than expected for a monomer (but earlier than expected for a dimer). This is consistent with the first 45 residues of LBV M adopting an extended/disordered conformation in solution. Δ1-45 LBV M was cloned from reverse-transcribed LBV genomic DNA [54] into pOPINF (thereby adding an N-terminal His6 affinity tag and a 3C cleavage site for removal of the tag) by InFusion ligation-independent cloning [74] using the PCR primers 5′-AAGTTCTGTTTCAGGGCCCGGGCAAAGAGAATGTTAGAAACTTTTGTATAAATGG-3′ (forward) and 5′-ATGGTCTAGAAAGCTTTATTCCAACAGAAGTGAAGTGTTCTCATCTTC-3′ (reverse). Δ1-45 LBV M was expressed in E. coli Rosetta(DE3)pLysS using auto-induction medium as described [75]. Lysis and initial Ni-NTA purification were as described for SUMO-tagged, full-length LBV M [54].
The eluate was diluted with gel filtration buffer (25 mM HEPES pH 8.0, 100 mM NaCl, 5 mM DTT, 0.1 mM ZnCl2) to reduce the imidazole concentration to 33 mM and then treated with 200 µg of 3C protease overnight at 4 °C [76]. Following cleavage, 2 mL of Ni-NTA Sepharose beads (GE Healthcare) were added and the mixture was incubated for a further hour on ice before being applied to a disposable chromatography column (Econopak, Bio-Rad). The flow-through was collected, concentrated and applied to a Superdex 75 column (HiLoad 16/60, GE Healthcare) equilibrated in gel filtration buffer. Peak fractions were pooled and concentrated, and the protein identity was verified by mass spectrometry.
Full-length and Δ1-45 LBV M were concentrated to ~0.3 mg/mL, as estimated from A280 and theoretical extinction coefficients, using 5 kDa molecular weight cut-off micro-concentrators (Vivascience). Samples (100 or 150 µL) were applied to a Superdex 75 10/300 GL gel filtration column (GE Healthcare) pre-equilibrated in 25 mM HEPES pH 8.0, 100 mM NaCl, 5 mM DTT, 0.1 mM ZnCl2 and eluted at 0.5 mL/min. The column was calibrated using the molecular size standards conalbumin (75 kDa), ovalbumin (43 kDa), carbonic anhydrase (29 kDa), ribonuclease A (13.7 kDa) and aprotinin (6.5 kDa) from the low molecular weight gel filtration calibration kit (GE Healthcare). $K_{av}$ was calculated using the equation $K_{av} = (V_E - V_0)/(V_C - V_0)$, where $V_E$ is the elution volume in mL, $V_0$ is the void volume in mL as measured by the elution of blue dextran, and $V_C$ is the geometric volume of the column (24 mL). The elution profiles of full-length and Δ1-45 LBV M were not significantly perturbed by increasing the concentration of NaCl to 500 mM, nor was the elution profile of full-length LBV M perturbed by the absence of 0.1 mM ZnCl2 or by the presence of 0.1 mM EDTA (not shown).
Found at: doi:10.1371/journal.ppat.1000251.s003 (0.08 MB PNG)
Figure S4 Stereogram of LBV M packing in the crystal.
Residues 31-37 (violet), which interact with the globular domain of LBV M (green), may come from one of three molecules, related by the following symmetry operators: [1+x−y, 1−y, 1−z] (red; 22.6 Å from Cα Glu37 to Cα Glu48), [1+y, 1−x+y, −1/6+z] (orange; 32.4 Å from Cα Glu37 to Cα Glu48), [1−x+y, 1−x, −1/3+y] (blue; 35.5 Å from Cα Glu37 to Cα Glu48). The red molecule is the only one that makes additional crystal contacts with the globular domain, this interaction burying 970 Å² of surface area. Distances between the Cα atom of residue 37 of the bound polyproline motif and the residue 48 Cα atoms of the symmetry-related molecules are shown as dotted lines. For clarity only selected symmetry-related molecules are shown (grey). (A) and (B) represent two orthogonal views.
Found at: doi:10.1371/journal.ppat.1000251.s004 (4.30 MB PNG)
Figure S5 The interaction between the β2-β3 loop of VSV NJ M and the bound peptide. Residues of the globular domain (carbon atoms green) and the N-terminal interacting residues (carbon atoms pink) are shown as sticks, and the molecular surface of the globular domain is shown in white, highlighting that replacement of AVLA (121-124) with DKQQ would not disrupt the hydrophobic pocket that binds F46. The side chains of A121 and A124 point away from the bound peptide. While the side chain of L123 forms part of the shallow hydrophobic cleft into which M51 binds, the L123Q mutation would not prevent peptide binding as there is ample room for the surface-exposed side chain to move away; indeed, it is possible that the hydrophobic face of the glutamine side chain amide might replace the hydrophobic leucine side chain and form part of the binding pocket. The substitution of V122, which interacts with M48 of the bound peptide, would equally not preclude binding. In VSV Ind, the strain for which the MDKQQ mutant was generated, M48 is replaced with the shorter hydrophobic amino acid valine.
The presence of this shorter side chain on the peptide would allow sufficient space for V122 to be replaced by lysine without severely disrupting binding.
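The K av calculation given in the gel filtration methods above can be illustrated with a short Python sketch. It applies the stated equation with the 24 mL geometric column volume. It then fits a straight line of K av against log10(molecular mass) for the five calibration standards. That linear fit is a common way to use such a calibration and is an assumption here, not a procedure stated in the text. The void volume and all elution volumes in the example are hypothetical placeholders, not measured values. The function names are likewise illustrative.

```python
import math

def k_av(v_e, v_0, v_c=24.0):
    """Partition coefficient K_av = (V_E - V_0) / (V_C - V_0), volumes in mL."""
    if v_c <= v_0:
        raise ValueError("column volume must exceed void volume")
    return (v_e - v_0) / (v_c - v_0)

def fit_calibration(standards, v_0, v_c=24.0):
    """Least-squares line K_av = slope * log10(mass) + intercept.

    standards: list of (mass_kDa, elution_volume_mL) pairs.
    The linear K_av vs log10(mass) model is an assumption of this sketch.
    """
    xs = [math.log10(mass) for mass, _ in standards]
    ys = [k_av(v_e, v_0, v_c) for _, v_e in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def apparent_mass(v_e, slope, intercept, v_0, v_c=24.0):
    """Apparent mass (kDa) of a sample from its elution volume."""
    return 10 ** ((k_av(v_e, v_0, v_c) - intercept) / slope)

# Masses (kDa) are those of the standards named in the text; the elution
# volumes and void volume below are HYPOTHETICAL placeholders.
VOID_VOLUME_ML = 8.0
STANDARDS = [(75.0, 9.5), (43.0, 10.5), (29.0, 11.6), (13.7, 13.6), (6.5, 15.5)]

slope, intercept = fit_calibration(STANDARDS, VOID_VOLUME_ML)
sample_elution_ml = 11.0  # hypothetical sample elution volume
print(f"K_av = {k_av(sample_elution_ml, VOID_VOLUME_ML):.3f}")
print(f"apparent mass = "
      f"{apparent_mass(sample_elution_ml, slope, intercept, VOID_VOLUME_ML):.1f} kDa")
```

An apparent mass obtained this way reflects hydrodynamic size rather than true mass, which is why an extended or disordered N terminus can shift the apparent mass of a monomer away from its true value.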