Solution NMR Structure and Histone Binding of the PHD Domain of Human MLL5

Mixed Lineage Leukemia 5 (MLL5) is a histone methyltransferase that plays a key role in hematopoiesis, spermatogenesis and cell cycle progression. In addition to its catalytic domain, MLL5 contains a PHD finger domain, a protein module that is often involved in binding to the N-terminus of histone H3. Here we report the NMR solution structure of the MLL5 PHD domain, which shows a variant of the canonical PHD fold that combines conserved H3 binding features from several classes of other PHD domains (including an aromatic cage) with a novel C-terminal α-helix not previously seen. We further demonstrate that the PHD domain binds with similar affinity to histone H3 tail peptides di- and tri-methylated at lysine 4 (H3K4me2 and H3K4me3), the former being the putative product of the MLL5 catalytic reaction. This work establishes the PHD domain of MLL5 as a bona fide ‘reader’ domain of H3K4 methyl marks, suggesting that it may guide the spreading or further methylation of this site on chromatin.


Introduction
Post-translational modifications of histones are a key epigenetic mechanism used to regulate gene transcription, chromatin condensation, DNA damage sensing and repair. Key among these modifications are protein lysine acetylation and methylation. These modifications are 'written' or 'erased' by chromatin-associated proteins that have the corresponding catalytic activities. The modifications are in turn recognized by 'reader' domain(s) of proteins that are recruited to the chromatin. Better known examples of reader domains include the chromodomain [1,2], bromodomain [3], MBT domain [4], TUDOR domain [5], WD40 domain [6], PWWP domain [7], and PHD finger [8,9,10].
PHD (Plant HomeoDomain) fingers are small modules with conserved cysteine and histidine residues coordinating two zinc ions in a canonical Cys4-His-Cys3 mode. Based on the Pfam protein family classification, the PHD finger is found in over 100 proteins in the human genome. Proteins with PHD fingers are mostly nuclear [10] and often involved in chromatin remodelling. PHD fingers studied so far recognize several different histone trimethyllysine marks [11,12] as well as the unmodified histone H3 N-terminus [13,14], and possibly acetyllysine [15].
Mixed Lineage Leukemia 5 (MLL5) is a SET domain methyltransferase and contains a single PHD finger followed by a catalytic SET domain. MLL5 protein localizes to distinct nuclear foci, but this localization was not affected by deletion of either the PHD domain or the SET domain [16]. Overexpression of MLL5 prevented cell cycle progression into S phase by associating with cell cycle regulatory elements impairing its activity [16]. Phosphorylation of the C-terminus of the SET domain of MLL5 is required for mitotic progression, suggesting a role for histone methylation [17]. Immunoprecipitation and in vitro pull-down experiments showed that MLL5 interacts with borealin, a subunit of the chromosome passenger complex, stabilizing the complex [18]. MLL5 is also reported to bind tetrameric p53 via p53's DNA binding domain [19]. MLL5 is a component of a complex associated with the retinoic acid receptor and requires GlcNAcylation of its SET domain in order to activate its histone lysine methyltransferase activity [20]. Knockout mouse studies showed that murine MLL5 is required for normal hematopoiesis [21,22,23] as well as maturation of spermatozoa [24]. However, except for nuclear foci formation, the role of the PHD domain in these activities has not been delineated.
We report the solution NMR structure of the PHD domain of MLL5 and confirm its binding to histone H3 peptides di- and trimethylated at lysine 4 (H3K4me2/3). Importantly, the former, but not the latter, is thought to be the product of the methyltransferase activity of MLL5 [25]. We propose a binding mechanism based on the newly determined structure and its comparison with other PHD domains, combined with biophysical interaction data with histone peptides. These data support the growing observation that many histone modifying enzymes have evolved specialized 'reader' domains that recognize the reaction product of their catalytic domains, which may help with spreading of the respective histone mark along chromatin, and/or the further methylation of this mark by a separate methyltransferase.

Results and Discussion
Solution Structure of MLL5 PHD finger
Using an 80-residue protein construct spanning the PHD domain of MLL5 (Ser109-Asp188) we determined its solution structure by NMR spectroscopy (Figure 1). The N-terminal region of the domain, residues 109-117, appears to be disordered in solution, while the structure of the rest of the domain (residues 118-183) is well defined with a backbone r.m.s.d. of 0.86 ± 0.17 Å (see Table 1). It comprises two small antiparallel β-strands, β1 and β2 (residues 132-134 and residues 141-143, respectively), one α-helix, α1 (residues 170-183), and three long loops stabilized by two zinc-binding clusters. Similar to the other structurally characterized PHD domains, the MLL5 PHD domain binds two Zn2+ ions in a cross-braced fashion. The Zn1 atom is coordinated by Cys121, Cys123, His143, and Cys146, while the Zn2 atom is coordinated by four cysteine residues, Cys135, Cys138, Cys160 and Cys163 (see Figures 1a and b). All Zn-coordinating residues and the key residues from the hydrophobic core are highly conserved among homologous MLL5 PHD domains (Figure 2).
An electrostatic surface representation of the MLL5 PHD domain is shown in Figure 3. It reveals an extended putative H3 peptide binding surface groove typical for PHD fingers [8]. Namely, there are two adjacent hydrophobic pockets (presumably for H3 Lys4 and H3 Ala1 binding, respectively) divided by a tryptophan (Trp141). Trp141 occupies the conserved 'position I' of the H3K4me2/3-binding aromatic cage commonly observed in PHD domains [8]. In the majority of the NMR ensemble models, His127 forms the opposite side of a minimal aromatic cage, and is complemented by Thr119 and Met132, which complete a hydrophobic pocket that is likely to bind di- or trimethyllysine (Figure 3). During preparation of this manuscript, a crystal structure of the MLL5 PHD in complex with H3K4me3 was published and suggests that His127 is replaced by Asp128 [30].
A novel feature of the MLL5 PHD domain is a long α-helix (helix α1) not present in any other published PHD structures and formed by a C-terminal sequence unique to MLL5 and its homologs (Figures 1, 2, 4). This helix folds onto the canonical PHD finger on the opposite side from the peptide binding groove via hydrophobic interactions involving two highly conserved residues, Ala173 and Gln177 (Figures 2, 4A). The solvent-exposed face of helix α1 consists of positively charged residues Lys170, Arg178, and Arg181 that are poorly conserved in the homologous MLL5 proteins (see Figure 3). This suggests that the role of helix α1 may be to act as a 'structural brace' for the PHD domain, as opposed to forming a new interaction surface on the solvent-exposed face of the helix.

Figure 2 (caption fragment). [...] [29]. Organism of origin is shown on the left-hand side of each sequence. Secondary structure elements of the PHD domain are shown above its sequence for clarity (α-helix as cylinder and β-strands as arrows). The residues coordinating Zn1 and Zn2 atoms are marked with blue and black dots at the top, respectively. Homologous domains were identified using protein BLAST against the non-redundant protein database (http://blast.ncbi.nlm.nih.gov/blast.cgi). Multiple sequence alignment was performed using ClustalW2 (www.ebi.ac.uk/tools/clustalw2). doi:10.1371/journal.pone.0077020.g002
To determine structural homologs of the PHD domain of MLL5 we used the DALI server [31]. Many PHD domains with significant similarity (Z-score > 4.0) were detected. For example, the PHD domain of human BPTF (PDB ID 3QZV, 2FSA) has 38% sequence identity with MLL5 PHD and a Z-score of 4.7. The best match to MLL5 was human PHD finger protein 13 (PHF13) (PDB ID 3O70; Z-score 5.7), which has only 28% sequence identity with MLL5 PHD. Nevertheless, these two PHD domains can be structurally aligned with a backbone r.m.s.d. of 1.9 Å over 47 residues (see Figure 4a). Comparison of the putative methyllysine binding pocket of the MLL5 PHD with that of the peptide-bound PHF13 PHD (PDB ID 3O7A) showed that MLL5 PHD is likely to bind H3K4me3 in the same manner as PHF13 PHD.
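The sequence identities quoted above depend on how identity is counted over an alignment. As a minimal sketch, assuming identity is defined as the fraction of identical residues over columns where neither sequence has a gap (other conventions divide by the alignment length or the shorter sequence length), the calculation can be written as follows. The function name and the two toy sequences are hypothetical and are not taken from the MLL5, BPTF or PHF13 alignments.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where both sequences have a residue.

    Assumption: gap columns ('-') are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, for illustration only.
toy_a = "CICG-KDE"
toy_b = "CVCGSKDD"
print(round(percent_identity(toy_a, toy_b), 1))
```

Because the choice of denominator changes the reported value, identities from different tools are only comparable when the same convention is used.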

Histone Recognition of MLL5
MLL5 is reported to bind directly to chromatin at the cell cycle regulated element [32]. However, MLL5 lacks an obvious DNA binding motif. Furthermore, MLL5 has been identified as a GlcNAcylation-dependent H3K4 methyltransferase component of the RARA complex [20]. PHD fingers are known to bind histone tails [11], several complexes between PHD fingers and histone tails with differing lysine modifications have been reported in the PDB, and our structure shows a potential histone peptide binding pocket that is conserved among these complexes. We therefore hypothesized that MLL5 PHD also binds methylated histone H3 tails.
We first performed an initial in vitro peptide binding assay on His-tagged MLL5 iso1, equivalent to isoform 1, which contains both PHD and SET domains (residues 1 to 609). A mixture of biotinylated H3 peptides with various degrees of methylation at different lysine sites was incubated with purified MLL5 iso1. MLL5 iso1/H3 peptide complexes were pulled down using streptavidin-agarose beads and the presence of the complex was detected using an anti-MLL5 antibody (Figure 5a). The streptavidin pull-down assay showed that MLL5 iso1 binds to methylated H3K4 and H3K27 peptides but not to H3K9 peptides. To further deconvolute the binding to H3K4, the same experiment was repeated with biotinylated H3K4 peptides with differing degrees of methylation (Figure 5b). This showed that MLL5 iso1 binds to both H3K4me2 and H3K4me3 peptides. We did not detect any binding of MLL5 iso1 to monomethylated H3K4 peptides.
To determine if the same binding mode applies to the PHD finger alone, a peptide array of different histone sequences with differing lengths and lysine/arginine modifications was synthesized. Purified His-tagged MLL5 PHD was incubated with the membrane and detected using anti-His antibody (Supplementary Figures S1 and S2). The peptide array confirmed that MLL5 PHD consistently binds to H3K4me3. This binding was not abrogated by methylation at the R2 or R8 positions, nor by phosphorylation at the S10 position. Phosphorylation at the T3 position appeared to diminish binding to the H3K4me3 spot, as did deletion of the first three residues, suggesting an important contribution from residues 1-3 to the interaction.
The peptide array results did not show reproducible binding of the MLL5 PHD to any other acetyl- or methyl-lysine marks within H3 peptides, including H3K9 and H3K27. This suggests that the potential H3K27me binding activity observed for MLL5 iso1 must reside in regions of the protein other than the PHD domain. Since immobilized peptide arrays are semiquantitative at best, and prone to false positive and negative results [33], we sought to confirm these results with more quantitative analyses using free components in solution. A fluorescence anisotropy assay using H3K4 peptides labelled with fluorescein at the C-terminus enabled measurement of the equilibrium dissociation constants (Figure 6). Consistent with our peptide array result, both H3K4me2 and H3K4me3 peptides bind to MLL5 PHD with a similar dissociation constant of ~16 μM.
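To illustrate how a dissociation constant of this magnitude can be extracted from an anisotropy titration, the following is a minimal sketch of a one-site binding fit. It is not the fitting procedure used in this work (the data were fitted in Sigma Plot, see Fluorescence Anisotropy Binding Studies). The sketch assumes a 1:1 binding model with ligand depletion, a fixed labelled peptide concentration of 40 nM as in the assay, a log-spaced grid search over Kd, and synthetic noise-free data. All function names, the titration points and the anisotropy values are hypothetical.

```python
import math


def fraction_bound(protein_um, kd_um, peptide_um=0.04):
    """Fraction of labelled peptide bound for 1:1 binding (quadratic solution)."""
    total = protein_um + peptide_um + kd_um
    root = math.sqrt(total * total - 4.0 * protein_um * peptide_um)
    return (total - root) / (2.0 * peptide_um)


def fit_kd(protein_um, anisotropy, peptide_um=0.04):
    """Grid-search Kd between 0.1 and 1000 uM (log-spaced, 401 points).

    For each trial Kd the free-peptide anisotropy and the amplitude are
    obtained by ordinary linear least squares against the fraction bound.
    Returns (kd_um, r_free, amplitude) for the lowest squared error.
    """
    n = len(protein_um)
    mean_y = sum(anisotropy) / n
    best = None
    for step in range(401):
        kd = 10.0 ** (-1.0 + 0.01 * step)
        fb = [fraction_bound(p, kd, peptide_um) for p in protein_um]
        mean_x = sum(fb) / n
        sxx = sum((x - mean_x) ** 2 for x in fb)
        if sxx == 0.0:
            continue  # degenerate titration, cannot fit a slope
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(fb, anisotropy))
        amplitude = sxy / sxx
        r_free = mean_y - amplitude * mean_x
        sse = sum((r_free + amplitude * x - y) ** 2 for x, y in zip(fb, anisotropy))
        if best is None or sse < best[0]:
            best = (sse, kd, r_free, amplitude)
    if best is None:
        raise ValueError("titration has no variation in protein concentration")
    return best[1], best[2], best[3]


# Synthetic, noise-free titration generated with Kd = 16 uM (illustration only).
concentrations_um = [0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256]
synthetic_r = [0.05 + 0.10 * fraction_bound(p, 16.0) for p in concentrations_um]
kd_fit, r_free_fit, amplitude_fit = fit_kd(concentrations_um, synthetic_r)
print(f"fitted Kd ~ {kd_fit:.1f} uM")
```

With the labelled peptide far below Kd, the quadratic expression reduces to the familiar hyperbola P/(Kd + P), so half-maximal anisotropy change is expected near a protein concentration equal to Kd.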
The binding of H3K4me3 peptide to MLL5 PHD was further confirmed by 15N-HSQC NMR titration, which revealed significant changes in the NMR spectrum of 15N-labelled MLL5 PHD upon addition of increasing amounts of H3K4me3 peptide (Figure 7a). The residues involved in peptide binding can be inferred from the chemical shift changes (Figure 7b) and map to the conserved histone peptide binding region described above (Figure 8c). The residues affected the most by this binding were Thr119, Asp128, Met132, His143, Asp145, Tyr139 and Trp141. This includes the key residues of the aromatic cage with the exception of His127, whose 15NH resonance is not visible in the HSQC reference spectrum.
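Chemical shift changes from such a titration are often summarized as a single combined 1H/15N value per residue. The sketch below shows one common convention, given here as an assumption because the exact formula and cutoff used for Figure 7b are not stated in this section: the 15N change is scaled by a weighting factor (0.14 is assumed here) before the two changes are combined, and residues are flagged when the combined change exceeds the mean plus one standard deviation. The function names, residue labels and shift values are hypothetical placeholders, not measured data.

```python
import math


def combined_shift_change(delta_h_ppm, delta_n_ppm, n_weight=0.14):
    """Combined 1H/15N chemical shift change in ppm (assumed weighting of 15N)."""
    return math.sqrt(delta_h_ppm ** 2 + (n_weight * delta_n_ppm) ** 2)


def perturbed_residues(shifts_free, shifts_bound, n_weight=0.14, n_sd=1.0):
    """Flag residues whose combined change exceeds mean + n_sd * SD.

    Both inputs map a residue label to a (1H ppm, 15N ppm) tuple.
    Residues missing from the bound spectrum are skipped.
    Returns (sorted list of flagged labels, dict of all combined changes).
    """
    changes = {}
    for residue, (h_free, n_free) in shifts_free.items():
        if residue not in shifts_bound:
            continue  # peak not observed in the bound spectrum
        h_bound, n_bound = shifts_bound[residue]
        changes[residue] = combined_shift_change(
            h_bound - h_free, n_bound - n_free, n_weight
        )
    if not changes:
        raise ValueError("no residues observed in both spectra")
    values = list(changes.values())
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    cutoff = mean + n_sd * sd
    flagged = sorted(r for r, v in changes.items() if v > cutoff)
    return flagged, changes


# Hypothetical placeholder shifts (1H ppm, 15N ppm); not measured values.
free = {"R1": (8.20, 120.0), "R2": (7.90, 118.5), "R3": (8.60, 125.0),
        "R4": (8.05, 121.3), "R5": (7.50, 115.0)}
bound = {"R1": (8.21, 120.1), "R2": (8.10, 119.5), "R3": (8.60, 125.05),
         "R4": (8.06, 121.3), "R5": (7.51, 115.0)}
flagged, all_changes = perturbed_residues(free, bound)
print(flagged)
```

A residue without a visible amide resonance in the reference spectrum, as reported here for His127, cannot be scored this way and simply drops out of the analysis.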
The strong chemical shift changes in the aromatic cage, combined with the structural similarity between the MLL5 PHD and PHF13 PHD domains, suggest a similar H3 peptide binding mode (Figures 4b, 8a). On the other hand, the surface of the putative H3R2 binding site is different between these two proteins. MLL5 PHD has a negatively charged pocket comprising a putative H3R2 binding site that is absent in the case of PHF13 PHD (see Figure 8a). The structure of the complex of the PHF13 PHD and H3K4me3 peptide shows that the side chain of H3R2 is not docked tightly to the surface of the PHD domain, which is in agreement with the absence of a corresponding H3R2 binding pocket. The putative H3R2 binding pocket on the surface of the MLL5 PHD domain is formed by the well conserved residues Cys134, Asp136, Tyr158, and Thr157. A very similar groove, formed by the same type of residues, can be found in the structure of the complex of the human AIRE PHD1 domain with the unmodified H3 peptide (Figure 8b) [13,34]. Structural alignment of MLL5 PHD and AIRE PHD1 indicates that key H3R2 binding residues of AIRE PHD1 Cys310, Asp312, Trp335, and Thr157 superimpose well with Cys134, Asp136, Tyr158, and Thr157 of the MLL5 PHD domain, respectively. These residues in MLL5 PHD show modest changes in chemical shift upon peptide titration, consistent with a modest contribution to binding affinity and a tolerance for methylation of Arg2 in binding to peptide arrays. Finally, the key residue of the H3A1 binding site (Tyr158) also shows large chemical shift changes upon H3 peptide binding (see Figure 8). Taken together, our structural and biochemical data support the role of MLL5 PHD as a specific 'reader' domain of H3K4me2/3 marks.

Conclusion
We have determined the solution structure of the PHD finger of MLL5 and observed very similar structural features compared to other PHD fingers. It was reported that in MLL5 knockdown cells, H3K4 methylation at the cell cycle regulated element is reduced [16], and H3K4 trimethylation levels are also reduced at E2F1 target promoters [25]. The preferential binding of the PHD domain to di- and trimethylated H3K4 adds to a growing number of 'reader' domains that reside within larger enzymes or enzyme complexes that 'write' the same mark. It has been proposed that such a reader domain may facilitate spreading of the mark along chromatin by binding to the product of the catalytic reaction, enabling the enzyme to then modify a neighboring histone/nucleosome [35,36]. Our data suggest that the PHD domain of MLL5 may serve such a role in its modification of genomic loci with the H3K4me2/3 mark.

Protein Expression and Purification
The PHD finger of MLL5 (residues 109-188) was inserted into a pET28a-MHL vector (GenBank ID: EF456735) via ligase-independent cloning. The recombinant protein was expressed in BL21 (DE3) Codon plus RIL (Stratagene). For proteins used for peptide array, cells were grown in rich Terrific Broth (Sigma); for proteins used for NMR structure determination, cells were grown in minimal media containing 13C-glucose and 15N-NH4Cl as the sole carbon and nitrogen source, respectively. The cells were grown at 37 °C, induced with IPTG when they reached the mid-log phase of growth, and grown for another 12 hours.
For 13C and 15N labeled protein used for NMR studies, cells were harvested by centrifugation and resuspended in lysis buffer (10 mM Tris, pH 8.5, 15 mM imidazole, 500 mM NaCl, 10 μM ZnSO4). The cells were lysed by sonication and cell debris was removed by centrifugation at 12000 rpm for 20 min at 4 °C. The supernatant was bound to Ni-NTA beads and washed extensively with washing buffer (10 mM Tris, pH 8.5, 30 mM imidazole, 500 mM NaCl, 10 μM ZnSO4). Target protein was eluted with elution buffer (10 mM Tris, pH 8.5, 500 mM imidazole, 500 mM NaCl, 10 μM ZnSO4). After elution, benzamidine and DTT were added to a final concentration of 1 mM each.

NMR Structure Determination
NMR spectra were recorded at 25 °C on Bruker Avance 600 MHz or 800 MHz spectrometers equipped with cryoprobes. All 3D spectra employed a non-uniform sampling scheme in the indirect dimensions and were reconstructed by the multi-dimensional decomposition software MDDNMR [37] interfaced with MDDGUI [38] and NMRPipe [39]. The assignments of 1H, 15N and 13C resonances were obtained by an ABACUS [40] approach using the following experiments: HNCO, CBCA(CO)NH, HBHA(CO)NH, HNCA, (H)CCH-TOCSY and H(C)CH-TOCSY. Distance restraints for structure calculations were derived from cross-peaks in 15N-edited NOESY-HSQC (τm = 100 ms) and 13C-edited aliphatic and aromatic NOESY-HSQC in H2O (τm = 100 ms). Peak picking was performed manually using Sparky [41]. The restraints for backbone φ and ψ torsion angles were derived from chemical shifts of backbone atoms using TALOS [42]. Automated NOE assignment and structure calculations were performed using CYANA (version 2.1) [43]. A total of 93% of NOESY peaks were assigned after seven iterative cycles of automated structure calculation and NOE assignment. The final 20 lowest-energy structures were refined with the CNS [44] package by performing a short constrained molecular dynamics simulation in explicit solvent [45]. Resulting structures were analyzed using MOLMOL [28], PROCHECK [46], MolProbity [47], and PSVS validation software [26]. The final refined ensemble of 20 structures and the resonance assignments for the MLL5 PHD domain were deposited into the Protein Data Bank (PDB ID 2LV9) and BioMagResBank (BMRB accession number 18559), respectively.

Streptavidin Pull Down Assay
Human MLL5 ORF v3.1 was shuttled from pDONR223 into pDEST17, which expressed an N-terminal 6×His-tagged MLL5 fusion protein from the T7 promoter. Purified His-tagged MLL5 iso1 was stored in buffer (30 mM imidazole, 116 mM NaCl, 25 mM Tris-HCl pH 7.5, 3 mM KCl) and then an aliquot was incubated with 0.5 mg biotinylated histone H3 peptides (residues 1-21 or 21-44, Upstate) in binding and washing buffer (25 mM Tris, pH 7.5, 120 mM NaCl, 3 mM KCl and 0.05% (v/v) Nonidet P-40) for 4 h at 4 °C. Streptavidin-Sepharose 4B beads (Upstate 16-126) were incubated with the 6×His-hMLL5:peptide binding reactions overnight at 4 °C. Bead:complex samples were then washed three separate times, each in 1 ml binding and washing buffer at 4 °C, and heated to 90 °C for 7 min with NuPage Sample buffer (Invitrogen) with reducing agent. The lysates were loaded on a 4-12% NuPage gradient gel (pre-cast from Invitrogen), run at 200 V for 40 min and blotted on nitrocellulose membranes using the iBlot semi-dry transfer system (Invitrogen; program P3 for 7 min). The membranes were probed with anti-MLL5 antibodies (pAb 31994, custom-made antibody in serum from rabbit #9762) and 1/3000 anti-rabbit-HRP and visualized using the Immobilon Western Chemiluminescent HRP System (Millipore).

Peptide Array
Peptide arrays were synthesized using Intavis. The array was blocked at 4 °C overnight with 5% skimmed milk in PBS-T (50 mM Na3PO4, pH 7.5, 110 mM NaCl, 0.05% Tween 20), and washed three times with PBS-T. For identifying the binding site of each peptide within MLL5 PHD, the MLL5 PHD protein was diluted in 1% milk in PBS-T to 1 mM. The protein was incubated with the membrane overnight at 4 °C. The array was washed three times with PBS-T. Protein was detected using HRP-conjugated anti-His antibody (Novagen).

Fluorescence Anisotropy Binding Studies
Fluorescence polarization assays were performed in 384-well plates, using the Synergy 2 microplate reader from BioTek. All the peptides were synthesized and purified by Tufts University Core Services (Boston, MA, U.S.A.), with the C-terminus labeled with fluorescein. Binding assays were performed in a 10 μL volume at a constant labeled peptide concentration (40 nM), by titrating the MLL5 PHD domain (at concentrations ranging from low to high micromolar) into 20 mM Tris-HCl buffer (pH 7.5), containing 50 mM NaCl and 0.01% Triton X-100. The data points were fitted to a ligand binding function using Sigma Plot software to determine the Kd values.

Figure S1 H3 histone tail peptide array bound with His-tagged MLL5 PHD. Protein was detected using anti-His antibody. The left panel shows the no-protein control, in which only the poly-His spot was detected by the anti-His antibody. The right panel shows the peptide spots where MLL5 PHD protein was bound. The letters on each grid highlight which residues on the H3 histone tail were modified. The actual peptide sequence on the array is shown in Supplementary Figure S2. (JPG)

Figure S2 Peptide sequence of the peptide array in Figure S1. Grid location refers to Figure S1a. (JPG)

Author Contributions