Recognition and Accommodation at the Androgen Receptor Coactivator Binding Interface

Prostate cancer is a leading killer of men in the industrialized world. Underlying this disease is the aberrant action of the androgen receptor (AR). AR is distinguished from other nuclear receptors in that after hormone binding, it preferentially responds to a specialized set of coactivators bearing aromatic-rich motifs, while responding poorly to coactivators bearing the leucine-rich “NR box” motifs favored by other nuclear receptors. Under normal conditions, interactions with these AR-specific coactivators through aromatic-rich motifs underlie targeted gene transcription. However, during prostate cancer, abnormal association with such coactivators, as well as with coactivators containing canonical leucine-rich motifs, promotes disease progression. To understand the paradox of this unusual selectivity, we have derived a complete set of peptide motifs that interact with AR using phage display. Binding affinities were measured for a selected set of these peptides and their interactions with AR determined by X-ray crystallography. Structures of AR in complex with FxxLF, LxxLL, FxxLW, WxxLF, WxxVW, FxxFF, and FxxYF motifs reveal a changing surface of the AR coactivator binding interface that permits accommodation of both AR-specific aromatic-rich motifs and canonical leucine-rich motifs. Induced fit provides perfect mating of the motifs representing the known family of AR coactivators and suggests a framework for the design of AR coactivator antagonists.


Introduction
The androgen receptor (AR) is the cellular mediator of the actions of the hormone 5α-dihydrotestosterone (DHT). Androgen binding to AR leads to activation of genes involved in the development and maintenance of the male reproductive system and other tissues such as bone and muscle. However, it is the pivotal role of AR in the development and progression of prostate cancer that has led to increasing interest in this nuclear receptor. Presently, hormone-dependent prostate cancer is treated with a combination of strategies that reduce circulating levels of androgens, such as the administration of antiandrogens that compete for the androgen-binding pocket in the core of the C-terminal ligand-binding domain (LBD). The benefits of these treatments are typically transient, with later tumor growth associated with increases in expression levels of AR or its cofactors, or mutations that render AR resistant to antiandrogens (Gregory et al. 2001; Culig et al. 2002; Lee and Chang 2003). Alternative approaches to inhibiting AR transcriptional activity may therefore lie in disrupting critical protein associations the receptor needs for full function.
The precise details of how AR binds the dozens of coregulator proteins reported to associate with different regions of AR in vivo remain poorly understood (Lee and Chang 2003). Many nuclear receptors activate transcription by binding short leucine-rich sequences conforming to the sequence LxxLL (where "x" is any amino acid), termed nuclear receptor (NR) boxes, which are found within a variety of NR coactivators including the p160 family. Hormone binding to the LBD stabilizes the C-terminal helix of the receptor, helix 12, in a conformation that completes a binding surface for these LxxLL motifs (Darimont et al. 1998; Nolte et al. 1998; Shiau et al. 1998; Bledsoe et al. 2002). The structural elements composing this binding interface, consisting of helices 3, 4, 5, and 12 of the receptor, are synonymous with a previously defined hormone-dependent activation function that lies within the LBD termed activation function (AF)-2. Association of p160 coactivators allows the recruitment and assembly of a number of other cofactors that together modulate the state of chromatin and interactions with components of the basal transcription machinery to initiate transcription (Glass and Rosenfeld 2000).
AR, however, utilizes multiple mechanisms to activate gene transcription. Generally, AR activity is dependent on contributions from multiple transactivation functions that lie within the N-terminal domain (NTD) collectively called AF-1. Although the AR AF-2 can bind to a restricted set of LxxLL motifs (Ding et al. 1998; He et al. 1999; Needham et al. 2000) and is relatively potent, it usually displays weak independent activity at typical androgen-regulated genes, with significant activity observed only in the presence of high levels of p160 coactivators, as detected in some prostate cancers (He et al. 1999; Gregory et al. 2001). Instead, the AR AF-2 exhibits a distinct preference among NRs for phenylalanine-rich motifs conforming to the sequence FxxLF (He et al. 2000; He and Wilson 2003). Such motifs have been identified in the AR NTD and in an AR cognate family of coactivators that includes AR-associated protein (ARA) 54, ARA55, and ARA70 (He et al. 2000, 2002b; Lee and Chang 2003). The NTD FxxLF motif (residues 23-27) mediates a direct, interdomain, ligand-dependent interaction between the NTD and LBD (N/C interaction) that is thought to facilitate dimerization, stabilize androgen binding, and possibly regulate AF-1 and AF-2 activity (Langley et al. 1998; He et al. 2000). In addition, the NTD also contains a related hydrophobic motif, WxxLF (residues 433-437), that nucleates formation of an alternative N/C interaction that may serve to inhibit AR activity (He et al. 2000, 2002a; Hsu et al. 2003).
Presently, how the AR AF-2 surface can accommodate residues with bulky aromatic side chains and distinguish FxxLF motifs from LxxLL motifs is not known. To understand the structural basis of this unusual coactivator recognition preference, we characterized the full repertoire of interacting sequences using phage display to define amino acids preferred at the AR coactivator binding interface. Crystal structures of the AR LBD in complex with several phage display-derived peptides reveal the structural basis of FxxLF motif specificity and an induced fit of the receptor that allows accommodation of other related hydrophobic motifs. Comparisons of the structures suggest strategies for the design of AR coactivator antagonists.

AR Preference for Aromatic Groups in Coregulator Recognition
Phage display has been used to study coactivator recognition specificity and to identify coactivator motif sequence variants preferred by the estrogen receptor (ER), thyroid hormone receptor (TR) β, and most recently AR (Norris et al. 1999; Paige et al. 1999; Northrop et al. 2000; Hsu et al. 2003). Using phage display, we screened more than 2 × 10^10 randomized peptides against DHT-bound AR LBD. Selections identified sequences containing hydrophobic motifs that were primarily aromatic in character, consistent with another recent study (Hsu et al. 2003) (Figure 1). Of these aromatic motifs, FxxLF and related motifs with substitutions of phenylalanine or tryptophan for leucine at positions +1, +5, or both, dominated the selections. (Peptide residues are numbered in reference to the first hydrophobic residue of the core motif, which is numbered +1. Residues preceding the first hydrophobic residue are numbered negatively in descending order starting with −1.) Substitutions of tyrosine at the +5 position were also observed, but to a much lesser extent (unpublished data). At the +4 position, valines, methionines, and even the aromatic residues phenylalanine and tyrosine were observed (Figure 1; unpublished data). In general, LxxLL motifs were not selected. The LxxLL motif shown in Figure 1 was derived from prior phage selections with ER and subsequently demonstrated to bind AR in FRET-based screens in vitro (unpublished data).
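The numbering convention described above can be made concrete with a short sketch. It is illustrative only: the regular expression assumes a core motif in which +1 is F, W, or L, +4 is L, V, M, F, or Y, and +5 is F, W, L, or Y (the residues reported at those positions in the selections), and the function names and example peptides are hypothetical rather than sequences taken from Figure 1.

```python
import re

# Assumed core-motif pattern (illustrative): +1 is F/W/L, +2 and +3 are any
# residue, +4 is L/V/M/F/Y, and +5 is F/W/L/Y.
CORE_MOTIF = re.compile(r"[FWL]..[LVMFY][FWLY]")


def find_core(peptide):
    """Return (start_index, core) for the leftmost core motif, or None."""
    match = CORE_MOTIF.search(peptide)
    return (match.start(), match.group()) if match else None


def motif_class(peptide):
    """Summarize the core motif as, e.g., 'FxxLF'; None if no motif is found."""
    hit = find_core(peptide)
    if hit is None:
        return None
    _, core = hit
    return f"{core[0]}xx{core[3]}{core[4]}"


def number_residues(peptide):
    """Number residues relative to the core motif.

    The first hydrophobic residue of the motif is +1; residues before it are
    -1, -2, ... in descending order (there is no position 0).
    """
    hit = find_core(peptide)
    if hit is None:
        return None
    start, _ = hit
    numbered = []
    for index, residue in enumerate(peptide):
        offset = index - start
        position = offset + 1 if offset >= 0 else offset
        numbered.append((position, residue))
    return numbered


if __name__ == "__main__":
    example = "SSRFESLFAGEK"  # hypothetical peptide, not a sequence from Figure 1
    print(motif_class(example))
    for position, residue in number_residues(example):
        print(f"{residue}{position:+d}")
```

Because the search returns the leftmost match, a peptide with more than one candidate motif would need manual inspection to decide which one is the intended core.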
Preliminary characterization of the subset of AR-interacting peptides shown in Figure 1 confirmed that each competed for binding of in vitro translated AR cofactors to bacterially expressed AR LBD in pulldown assays, and generally did so with modestly improved efficiency relative to the native FxxLF motif from the AR NTD and significantly greater efficiency than a native LxxLL motif from glucocorticoid receptor-interacting protein 1 (GRIP1) NR box 3 (P. Webb, personal communication). The equilibrium dissociation constants (Kd) were directly determined for the interaction between the AR LBD and FxxLF and LxxLL peptides and one variant tryptophan-containing peptide, FxxLW, using surface plasmon resonance (Table 1). The Kd for FxxLF was 1.1 μM, similar to the affinities of physiologically derived FxxLF motifs determined previously by isothermal titration calorimetry (He and Wilson 2003). The affinity of LxxLL was less than 2-fold weaker, with a Kd of 1.8 μM, but more than three times stronger than the tightest binding p160-derived LxxLL motif, NR box 3 of transcriptional intermediary factor 2 (TIF2) (He and Wilson 2003). Surprisingly, the affinity of FxxLW, with a Kd of 920 nM, was slightly better than FxxLF, in spite of the presence of the tryptophan residue at the +5 position. Together, our results are consistent with the notion that the phage display peptides interact with the same AR surface that binds FxxLF and LxxLL motifs in native cofactors, and that they do so with similar or improved affinities relative to their natural counterparts.
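To put the three measured dissociation constants on a common scale, the following sketch computes fold differences in affinity and the corresponding standard binding free energies from the textbook relation ΔG° = RT ln(Kd). The Kd values are those quoted above; the temperature of 298.15 K and the 1 M standard state are assumptions made for illustration, since the measurement temperature is not stated here, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

# Dissociation constants quoted in the text, converted to molar units.
KD_MOLAR = {"FxxLF": 1.1e-6, "LxxLL": 1.8e-6, "FxxLW": 920e-9}


def fold_weaker(kd, kd_reference):
    """Ratio of two Kd values; >1 means weaker binding than the reference."""
    return kd / kd_reference


def binding_free_energy(kd, temperature_k=298.15):
    """Standard binding free energy in kcal/mol, dG = RT ln(Kd).

    Assumes a 1 M standard state; the default temperature is an assumption.
    """
    return R_KCAL * temperature_k * math.log(kd)


if __name__ == "__main__":
    for name, kd in KD_MOLAR.items():
        ratio = fold_weaker(kd, KD_MOLAR["FxxLF"])
        dg = binding_free_energy(kd)
        print(f"{name}: {ratio:.2f}-fold vs FxxLF, dG = {dg:.2f} kcal/mol")
```

On this scale the three peptides differ by only a few tenths of a kcal/mol, which is consistent with the description of their affinities as similar.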

One Site Fits All
To understand the binding mode of different AR coactivators, we determined the crystal structures of DHT-bound AR LBD without peptide and in complex with each of the seven peptides listed in Figure 1. All complexes crystallized in the space group P2₁2₁2₁ with one molecule per asymmetric unit and unit cell dimensions similar to those observed in previous AR LBD crystal structures (Matias et al. 2000; Sack et al. 2001). Overall structural features of the complexes are shown in Figure 2. Peptides assumed short α-helical conformations centered on the core hydrophobic motif and bound in a solvent channel relatively free of crystal contacts on a groove formed by helices 3, 4, 5, and 12 of the receptor (Figure 2A). Detailed data collection and refinement statistics, as well as buried surface areas for each complex, are listed in Table 2. The structures confirm previous suggestions that AR utilizes a single binding interface for LxxLL and noncanonical aromatic-rich motifs (He et al. 2000, 2002a). Only side chains move to accommodate the array of peptides, sometimes considerably, with the unbranched side chains of Lys720, Met734, and Met894 making the largest conformational changes upon binding of peptide (Figure 2B).

FxxLF
The mechanisms that permit AR to accommodate motifs with bulky phenylalanine residues were assessed in a crystal structure of the AR LBD in complex with the FxxLF peptide. The FxxLF peptide recapitulates the binding mode of p160-derived LxxLL motifs to other nuclear receptors (Darimont et al. 1998; Nolte et al. 1998; Shiau et al. 1998; Bledsoe et al. 2002). The peptide forms a short α helix whose hydrophobic face, composed of Phe+1, Leu+4, and Phe+5, binds an L-shaped groove formed by helices 3, 4, 5, and 12 of the LBD that is composed of three subsites that accommodate each hydrophobic residue (Figures 2A and 3A). The conserved charged residues at either end of the cleft, Lys720 and Glu897, the so-called charge clamp residues, make electrostatic interactions with the main chain atoms at the ends of the peptide helix: Lys720 with the carbonyl group of Phe+5, and Glu897 with the amide nitrogens of Phe+1 and Arg−1 (Figure 3C). Glu897 also interacts with the side chain of Arg−1. The two interior residues of the motif, Glu+2 and Ser+3, are solvent exposed and do not interact with the receptor. Comparison of AR alone and AR in complex with FxxLF (and other aromatic-rich peptides described below) reveals that the AF-2 cleft reorganizes to accommodate the bulky peptide side chains (see Figures 2B and 4). The unbranched side chains of Lys720 and Met734 move from an extended conformation over the +5 pocket to one almost perpendicular to the surface of the protein. The pockets for Phe+1 and Phe+5 are arranged in a line, forming a deep, extended cleft on the LBD spanning the length of the two side chains on the face of the peptide helix (see Figures 3A and 4B). Phe+1, almost entirely solvent inaccessible, binds face down at the base of this groove, making hydrophobic contacts with Leu712, Val716, Met734, Gln738, Met894, and Ile898, which define the +1 pocket. The top of the groove, composed of Val716, Lys720, Phe725, Ile737, Val730, Gln733, and Met734, narrows to form the +5 pocket. Met734 and the aliphatic portion of Lys720 constrict this subsite, forming van der Waals interactions with opposite faces of the Phe+5 benzyl ring. Together, the +1 and +5 residues are almost entirely solvent inaccessible. In contrast, Leu+4 binds in a shallow hydrophobic patch consisting of Leu712 and Val716 lined at the ridges by Val713 and Met894 and is largely solvent exposed.
Table 1 note: Surface plasmon resonance data were best fit using the two-state conformational change model (Warnmark et al. 2001, 2002). Dissociation constants were calculated from rate constants as described previously (Warnmark et al. 2001).
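The note to Table 1 states that dissociation constants were calculated from rate constants under a two-state conformational change model. The exact expressions used in the cited work are not reproduced in this text, so the block below is only the general equilibrium relation for such a scheme, with symbol names chosen here as assumptions: A is the AR LBD in solution, B the immobilized peptide, AB the initial complex, and AB* the complex after the conformational change.

```latex
% Two-state (conformational change) binding scheme; symbols are assumptions.
\[
A + B \;\underset{k_{d1}}{\overset{k_{a1}}{\rightleftharpoons}}\; AB
      \;\underset{k_{d2}}{\overset{k_{a2}}{\rightleftharpoons}}\; AB^{*}
\]
% At equilibrium, [AB] = (k_{a1}/k_{d1})[A][B] and [AB^*] = (k_{a2}/k_{d2})[AB],
% so the apparent dissociation constant over both bound states is
\[
K_d^{\mathrm{app}}
  = \frac{[A][B]}{[AB] + [AB^{*}]}
  = \frac{k_{d1}}{k_{a1}} \cdot \frac{1}{1 + k_{a2}/k_{d2}}
  = \frac{k_{d1}\,k_{d2}}{k_{a1}\,(k_{a2} + k_{d2})}
\]
```

When the second step is negligible (k_{a2} much smaller than k_{d2}), the expression reduces to the simple one-step ratio k_{d1}/k_{a1}.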

LxxLL
The preference of AR for motifs with aromatic groups over leucine-rich motifs was assessed with a crystal structure of the AR LBD in complex with the LxxLL peptide. The structure reveals similarities between the binding modes of the LxxLL and FxxLF motifs to AR, and other LxxLL motifs to other nuclear receptors. The LxxLL motif adopts a helical conformation, and interactions of the motif with the AF-2 cleft are predominantly hydrophobic, with the three leucine residues of the motif contributing most of the interactions. However, significant differences can be seen between the binding mode of the LxxLL motif to AR and that of p160-derived LxxLL motifs to other nuclear receptors. First, flanking residues were largely disordered, with only two N-terminal flanking residues and one C-terminal residue visible in electron density maps (see Figures 1 and 3B). This contrasts with extended structures seen in the p160-derived LxxLL motifs in complex with their cognate receptors (Darimont et al. 1998; Nolte et al. 1998; Shiau et al. 1998; Bledsoe et al. 2002). Second, the LxxLL peptide backbone forms hydrogen bonds with only one of the two conserved charge clamp residues, Lys720. A shift in the position of the LxxLL peptide helix precludes direct interactions with Glu897 (see Figures 2A and 3D). This shift results from changes in the geometry of the +1 and +5 subsites mediated by Met734, which moves 2.5 Å toward the +1 pocket (see Figures 2B and 4C) and enables binding of a leucine at the +5 subsite by a simultaneous widening and shallowing of the pocket. This movement of Met734 causes displacement of the +1 residue, resulting in a rotation of the peptide helix away from helix 12, toward helix 3. A slight translation of the peptide helix also occurs away from helix 12 because of the shorter side chain length of leucine (see Figure 2A). Side chains of residues flanking the first leucine of the motif make additional hydrophobic interactions with the AR surface (see Figure 3B). Trp+2 reaches over Met734, clamping the methionine in between itself and Leu+1. Leu−1 extends over Met894, abutted against Glu893. These interactions likely explain the moderate affinity of AR for this particular LxxLL motif despite suboptimal complementarity with the residues of the core motif (as discussed below) and the loss of main chain interactions with Glu897.

WxxLF, FxxLW, and WxxVW
To understand how the AR AF-2 accommodates tryptophan residues, structures of AR in complex with peptides containing tryptophan substitutions at the +1 or +5 position, or both, were determined (Figure 5). Surprisingly, WxxLF, analogous to the only tryptophan-containing motif known in vivo, WHTLF in the AR NTD, was relatively disordered, with the peptide displaying the highest B-factor and least well defined density, suggesting that it binds with the lowest affinity (Table 2). Nonetheless, each of the tryptophan peptides adopted similar helical conformations. As described above for the LxxLL motif, substitutions of non-phenylalanine residues at the +1 and +5 positions result in shifts of the peptide helix (see Figure 2A). Consequently, backbone interactions with Lys720 are maintained, but interactions with the other charge clamp residue, Glu897, are lost. Once again, however, flanking residues within the peptide make additional contacts with the AR surface, and, unlike the LxxLL peptide, these contacts include Glu897. In FxxLW and WxxVW, the −2 serine (Figure 6) forms a bidentate hydrogen-bonding interaction, making hydrogen bonds to both Glu897 and the backbone amide group of the +2 residue. Ser−2 of WxxLF similarly interacts with Glu897, but is too distant for helical-capping interactions with the +2 amide group. Instead, Glu893, in a more typical interaction with the +1 amide nitrogen, caps the WxxLF helix (Figure 6B). Thus, tryptophan substitutions are tolerated, but they induce a shift in the peptide backbone that precludes interactions with one of the charge clamp residues. This suboptimal interaction is compensated partially by interactions of flanking residues with the AR surface.

FxxFF and FxxYF
Finally, effects of substitutions at the +4 position were assessed in structures of AR in complex with peptides containing FxxFF and FxxYF motifs (Figure 7). Surprisingly, the binding mode of FxxFF to AR resembled that of the tryptophan peptides more closely than the binding mode of FxxLF (see Figures 2A and 7B). Like the tryptophan peptides, interactions with Glu897 are mediated by Ser−2 instead of the peptide backbone (see Figure 6D). Deviations from ideal helical geometry allow Phe+4 to bind facedown in the +4 pocket with the benzyl ring stacked against Val713.
By contrast, the conformation of FxxYF was the closest to FxxLF (see Figure 2A). Other than FxxLF, only FxxYF makes direct backbone interactions with Glu897. Unlike the facedown orientation of Phe+4 observed in the FxxFF peptide, Tyr+4 is bound edgewise into the shallow +4 pocket, making interactions with Val713, Val716, and the aliphatic portion of Lys717. FxxYF was the most ordered of all the peptides, with 12 out of 15 residues observed in the electron density (see Figures 1 and 7A). Significant interactions were observed involving residues other than hydrophobic residues of the motif. Lys+2 and Met+6 are predominantly solvent exposed, extending out over the protein surface. Met+6 is bound on top of Phe+5, while Lys+2 makes a water-mediated hydrogen bond with Asp731. Thr−3 of the peptide defines a new subsite, with the hydroxyl group forming a hydrogen bond to Gln738 and the methyl group making hydrophobic contacts in a pocket formed by Glu897, Ile898, and Val901. Similar interactions were observed in the glucocorticoid receptor (GR)-TIF2 complex involving the −3 asparagine of the TIF2 NR box 3 motif (Bledsoe et al. 2002). However, a valine to asparagine substitution at the residue corresponding to 901 in AR creates a pocket with a more polar character in GR (Figure 8).

Restrictions of the Three Subsites
Together, the structures described above permit an assessment of the way that individual subsites of the AR AF-2 cleft accommodate hydrophobic groups. The indole rings of tryptophan and the phenyl rings of phenylalanine fit into their pockets analogously with the +1 and +5 residues bound facedown and edgewise, respectively, into the AF-2 cleft. On the other hand, the position of the +4 residue is variable, with binding in this shallow pocket largely dictated by the position of the peptide backbone caused by the bound conformations of the +1 and +5 residues (see Figure 2C). Small shifts in the position of the N-terminus of helix 12 can be seen, which reposition Met894 for more optimal contacts with +4 residues bound at that subsite (see Figure 2B).
The binding mode detected in the +1 pocket is the most conserved of the three hydrophobic subsites (see Figure 2C). The benzyl moiety of each indole side chain superimposes with the corresponding benzyl side chains of the phenylalanine-rich motifs, effectively mimicking interactions of a phenylalanine residue. However, the presence of a hydrogen-bonding partner on the indole side chain enables an additional polar interaction not seen in the phenylalanine-rich motifs between the indole nitrogen and Gln738 (see Figure 5B). Unexpectedly, this additional interaction in the +1 pocket does not occur with Trp+1 of WxxVW (see Figure 5C). While similarly distanced to make the same interaction, the plane of the indole ring is rotated about 20° relative to that of WxxLF, causing it to be at a poor angle for strong hydrogen bonding to Gln738.
Binding of tryptophans in the +5 pocket is slightly more variable (see Figure 2C). Trp+5 of WxxVW is bound similarly to phenylalanine residues at the same position. Only the six-membered ring of the indole group is fully buried in the pocket. The five-membered ring of the indole side chain sticks out, solvent exposed. In contrast, the +5 indole group of FxxLW is rotated almost 90°, resulting in burial of both rings of the indole group, as well as the formation of a strong hydrogen bond between the indole nitrogen and Gln733 (see Figure 5A). Binding in this orientation appears to be highly favorable, as the FxxLW peptide deviates from helical geometry at the +5 position to do so.

Discussion
The crystal structures reported here reveal how AR binds coactivator motifs with bulky aromatic hydrophobic groups and permit construction of a profile of the AR coregulator interface (see Figure 2). In some ways, this interface resembles those of other nuclear receptors: it is an L-shaped hydrophobic cleft composed of three distinct subsites that bind hydrophobic groups at the +1, +4, and +5 positions in cognate peptides. Moreover, the so-called charge clamp residues (Lys720 and Glu897) bracket the cleft. Nonetheless, the AR coregulator recognition site is unique in that it rearranges upon motif binding to form a long, deep, and narrow groove that accommodates aromatic residues at the +1 and +5 positions (Figure 9). Sequence alignments of AR with other NRs suggest that a unique combination of substitutions at Val730, Met734, and Ile737 permits the formation of a smoother, flatter interaction surface that displays a higher complementarity to aromatic substituents than to branched aliphatic ones (see Figure 8). Of these, Met734 sits at a key position between the +1 and +5 sites. Methionine, the only unbranched hydrophobic amino acid and the most accommodating, allows the AR AF-2 interface to vary the size and shape of its pockets to associate with a more diverse set of coregulators. GR also contains a methionine residue at this position, raising the possibility that it may also employ induced fit to broaden motif recognition. While naturally occurring mutations in AR have yet to be observed at Met734, it is interesting to note that mutations at Val730 and Ile737 have been reported in patients with prostate cancer and androgen insensitivity, respectively (Newmark et al. 1992; Quigley et al. 1995; Gottlieb et al. 1998).
The same characteristics that make the AR AF-2 ideal for binding of longer, aromatic side chains also make it less well suited for binding of shorter, branched side chains. Although changes in the position of Met734 widen the groove towards the +5 subsite to permit binding of leucine residues, the gross features of the groove remain largely the same (see Figure 9B). As a result, the +1 and +5 leucines bind in a smooth, elongated groove and interactions between the +1 and +5 residues on the face of the peptide helix, or with a hydrophobic "bump" present in other receptors caused by an isoleucine to leucine substitution between the +1 and +5 subsites, are absent. Thus, a smaller proportion of the surface area is available for van der Waals interactions.
Unlike the conserved interaction modes of aromatic residues with the +1 and +5 sites, binding interactions at the +4 site are variable and characterized by nonspecific interactions. This finding agrees with the relatively high conservation of residues at the +1 and +5 positions of AR-interacting motifs and suggests that these residues drive peptide interaction with the LBD, whereas the +4 site is less critical. Indeed, the +4 pocket is shallow, surface exposed, and relatively featureless, explaining the assortment of residues selected at the +4 position. It is likely that any hydrophobic residue that does not clash with surrounding residues would be suitable at this subsite.
While peptide motif recognition is governed by hydrophobic interactions, polar interactions from backbone atoms and residues outside the core motif also contribute. With the exception of FxxFF, motifs containing phenylalanines at the +1 and +5 positions present canonical main chain interactions with both charge clamp residues, Lys720 and Glu897. This finding stands in contrast to predictions of previous studies (Alen et al. 1999; He et al. 1999; Slagsvold et al. 2000; He and Wilson 2003), which concluded that Lys720 was dispensable for FxxLF binding and that Glu897 was required for binding to FxxLF and LxxLL motifs. Lys720 comprises a significant portion of the +5 subsite, making important van der Waals interactions with the Phe+5 benzyl group in addition to hydrogen bonds to the motif backbone. These results suggest that Lys720 is required for binding of FxxLF motifs. However, it may be that enough binding energy is provided by the other residues of the +5 subsite (i.e., Met734), as well as by the other subsites themselves, such that removal of Lys720 would have little effect on binding. Observations that Lys720 plays a greater role in LxxLL motif binding are likely due to the fact that there is less surface area contributing to van der Waals contacts in LxxLL motifs. Disrupting binding contributions from Lys720 would thus have a more detrimental effect on binding.
On the other hand, Glu897 interacts with the FxxLF peptide backbone, but is disengaged from the LxxLL peptide backbone. One possible explanation for the apparent requirement for Glu897 in LxxLL binding is that it might interact with residues outside of the core motif. The corresponding glutamate of GR, Glu755, forms hydrogen bonds with the −3 asparagine of TIF2 NR box 3 (Bledsoe et al. 2002), and Glu897 of AR participates in noncanonical interactions with the hydroxyl group of a Ser−2 residue that was selected in all of our tryptophan-containing peptides. This is especially intriguing given that the only WxxLF motif known in vivo, located in the AR NTD, also possesses a Ser−2 residue. WxxLF also makes backbone interactions with an alternate charge clamp residue, Glu893, pointing towards adaptability in AR AF-2 charge clamp formation.
Sequence alignment of NR coactivator sequences shows that positively charged residues are favored N-terminal to the core hydrophobic motif while negatively charged residues are favored C-terminal to the motif (He and Wilson 2003). Our phage-selected peptides are consistent with this trend. Arginines and lysines were observed at the N-terminal −1 position in all peptides, except for LxxLL, in which Arg was present at the −3 position. Moreover, four out of seven peptides contained negatively charged aspartate or glutamate residues C-terminal to the core motif. While previous studies have shown that complementary interactions between charged residues flanking coactivator signature motifs and charged residues surrounding the AF-2 cleft modulate binding to the receptor (He and Wilson 2003), we find that the flanking charged residues are typically disordered in the electron density, with only Arg−1 of FxxLF interacting with Glu897, and Lys+2 of FxxYF forming a water-mediated hydrogen bond to Asp731. Thus, if charge-charge interactions between flanking peptide residues and the AR surface occur, they are too weak to be detected crystallographically.
Figure 9. Surface Complementarity of Hydrophobic Motifs in the AR, ERα, and GR AF-2 Clefts. (A) AR-FxxLF, (B) AR-LxxLL, (C) ERα-GRIP1 (LxxLL) (Shiau et al. 1998), and (D) GR-TIF2 (LxxLL) (Bledsoe et al. 2002). The inside surfaces of the AF-2 cleft in AR, ERα, and GR are depicted. The LBD is additionally shown as a Cα trace with key side chains shown as white sticks. Phenylalanines and leucines of the FxxLF and LxxLL motifs are shown as spheres. DOI: 10.1371/journal.pbio.0020274.g009
Finally, the AR AF-2 surface is an attractive target for pharmaceutical design. Selective peptide inhibitors that bind the AF-2 surface of liganded ERα, ERβ, and TRβ have been developed (Geistlinger and Guy 2003), and similar α-helix-mediated protein-protein interfaces have successfully been targeted with tight binding small molecule inhibitors (Asada et al. 2003; Vassilev et al. 2004). Drugs that directly interfere with coactivator binding or formation of the AR N/C interaction would likely inhibit AR activity, perhaps even in androgen-resistant prostate cancers in which conventional therapies have failed. Strategies for designing AR coactivator antagonists are revealed in spite of the changes to the structure at the interface. Together the +1, +4, and +5 subsites contribute the majority of buried surface area of the peptide-LBD interaction (Table 2). Inhibitors may be designed by varying hydrophobic constituents at these hotspots. The +1 and +5 subsites of AR have a unique preference for aromatic side chains and provide the most viable starting points for designing AR-specific inhibitors. Aromatic groups, possibly with polar constituents to exploit hydrogen bonding interactions with Gln738 and Gln733 in the +1 and +5 subsites, respectively, may provide promising leads. Indeed, initial screens have yielded compounds that bind to the +1 subsite in such a manner (E. Estébanez-Perpiñá, personal communication). Poorly conserved binding and a lack of strong structural features at the +4 subsite suggest that this site may be incorporated for achieving other characteristics important for inhibitors besides fit. Synthetic strategies that link together groups that bind with moderate affinity to the +1, +5, and possibly +4 subsites may yield tight binding inhibitors of AR coactivator association.

Materials and Methods
Protein purification. Expression and purification of the AR LBD for crystallization were performed essentially as described (Matias et al. 2000). The cDNA encoding the chimp AR LBD (residues 663-919, human numbering), which displays 100% identity to the human form in protein sequence, was cloned into a modified pGEX-2T vector (Amersham Biosciences, Piscataway, New Jersey, United States) and expressed as a glutathione S-transferase (GST) fusion protein in the E. coli strain BL21 (DE3) STAR in the presence of 10 μM DHT. Induction was carried out with 30 μM IPTG at 17 °C for 16-18 h. E. coli cells were lysed in buffer (10 mM Tris [pH 8.0], 150 mM NaCl, 10% glycerol, 1 mM TCEP, 0.2 mM PMSF) supplemented with 0.5 μg/ml lysozyme, 5 U/ml benzonase, 0.5% CHAPS, and 10 μM DHT. All buffers for further purification steps contained 1 μM DHT. Soluble cell lysate was adsorbed to Glutathione Sepharose 4 Fast Flow resin (Amersham Biosciences), washed with buffer containing 0.1% n-octyl β-glucoside, and eluted with 15 mM glutathione. After cleavage of the GST moiety with thrombin, final purification of the AR LBD was carried out using a HiTrap SP cation exchange column (Amersham Biosciences). Eluted AR LBD was dialyzed overnight at 4 °C against buffer containing 50 mM HEPES (pH 7.2), 10% glycerol, 0.2 mM TCEP, 20 μM DHT, 150 mM Li2SO4, and 0.1% n-octyl β-glucoside, then concentrated to greater than 4 mg/ml for crystallization.
Purification of AR LBD for use in phage affinity selection was carried out as above without the final dialysis and concentration steps. The expression construct contained the AR LBD as an in-frame fusion with GST in a modified pGEX-2T vector containing both a flexible region and an AviTag sequence (Avidity, Denver, Colorado, United States) allowing in vivo biotinylation. The GST-AR LBD fusion expression plasmid was cotransformed with a plasmid encoding E. coli biotin ligase (Avidity) into BL21 (DE3) STAR cells. Protein expression was carried out as above but with induction supplemented with 50 μM biotin to ensure quantitative biotinylation of AR LBD.
Phage affinity selections and peptide identification. Phage affinity selections were performed essentially as described. Biotinylated AR LBD (10 pmol/well) was incubated in streptavidin-coated Immulon 4 96-well plates (Dynatech International, Edgewood, New Jersey, United States) in TBST (10 mM Tris-HCl [pH 8.0], 150 mM NaCl, 0.05% Tween 20) with 1 µM DHT for 1 h at 4 °C. Affinity selections were performed in TBST containing 1 µM DHT. M13 phage distributed among 24 libraries displaying a total of greater than 2 × 10^10 different random or biased amino acid sequences were added to the wells containing immobilized AR LBD and incubated for 3 h at 4 °C. After washing, bound phage were eluted using pH 2 glycine. Enrichment of phage displaying target-specific peptides was monitored after each round of affinity selection using an anti-M13 antibody conjugated to horseradish peroxidase in an ELISA-type assay.
Synthetic peptides corresponding to the deduced amino acid sequences from receptor-specific phage were tested for their ability to interact with purified AR LBD using a FRET-based assay format. Peptides were synthesized according to the deduced amino acid sequence displayed on phage, with an additional C-terminal sequence, SGSGK, to allow the attachment of a biotin tag (Anaspec, San Jose, California, United States). Fluorophore conjugates were prepared by incubating either biotinylated peptides with streptavidin-cryptate (Cis Bio International, Bagnols Sur Ceze Cedex, France), or biotinylated AR LBD with streptavidin-XL665 (Cis Bio). Interaction between peptide and AR LBD was monitored by the ratio of energy transfer, with excitation at 320 nm and emission at 625 nm and 665 nm.
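The ratiometric readout described above can be illustrated with a minimal sketch. It assumes the ratio is taken as acceptor emission (665 nm) over donor emission (625 nm); the direction of the ratio, the scale factor, the blank subtraction, and all counts shown are illustrative assumptions rather than details stated in the text.

```python
def fret_ratio(em_665, em_625, scale=10_000):
    """Acceptor (665 nm) over donor (625 nm) emission, times a scale factor.

    The 665/625 direction and the scale factor are assumptions made for
    illustration; the text only states that a ratio of the two emissions
    was monitored.
    """
    if em_625 <= 0:
        raise ValueError("donor-channel emission must be positive")
    return scale * em_665 / em_625


def specific_signal(sample_ratio, blank_ratio):
    """Ratio of a sample well minus the ratio of a blank well (assumed control)."""
    return sample_ratio - blank_ratio


# Hypothetical plate-reader counts, not measured data.
with_receptor = fret_ratio(em_665=4200, em_625=21000)
peptide_only = fret_ratio(em_665=1050, em_625=21000)
print(specific_signal(with_receptor, peptide_only))
```

Dividing by the donor channel normalizes each well for differences in donor concentration and optical path, which is the usual motivation for reporting a ratio rather than raw 665 nm counts.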
Surface plasmon resonance. Affinities of peptides to the AR LBD were determined with a Biacore (Piscataway, New Jersey, United States) 2000 instrument. A peptide derived from silencing mediator for RXR and TR 2 (SMRT2) served as a negative control. Peptide stock solutions (1 mM in DMSO) were diluted into HBS-P buffer (10 mM HEPES [pH 7.4], 150 mM NaCl, 0.005% Surfactant P20) to generate 10 µM working solutions. HBS-P buffer was flowed through the cells to achieve a stable baseline prior to immobilization of the biotinylated peptides. To achieve the binding of approximately 250 RU of peptides to individual cells, working solutions of peptides were diluted to 100 nM in HBS-P buffer. Unbound streptavidin sites were blocked by injection of a 1 mM biotin solution at a rate of 10 µl/min.
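The two peptide dilution steps above (1 mM stock to a 10 µM working solution, then to 100 nM for immobilization) follow from C1·V1 = C2·V2. The sketch below works through that arithmetic; the 1000 µl final volume is a hypothetical value chosen for illustration, since no volumes are given in the text.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock volume, diluent volume) from C1 * V1 = C2 * V2.

    Both concentrations must share one unit; the returned volumes are in
    the same unit as final_volume.
    """
    if not 0 < target_conc <= stock_conc:
        raise ValueError("target must be positive and no higher than the stock")
    stock_volume = final_volume * target_conc / stock_conc
    return stock_volume, final_volume - stock_volume


FINAL_UL = 1000.0  # hypothetical final volume in microliters

# Concentrations in nM: 1 mM = 1,000,000 nM and 10 uM = 10,000 nM.
steps = [
    ("1 mM stock -> 10 uM working solution", 1_000_000.0, 10_000.0),
    ("10 uM working solution -> 100 nM", 10_000.0, 100.0),
]
for label, stock_nm, target_nm in steps:
    v_stock, v_buffer = dilution_volumes(stock_nm, target_nm, FINAL_UL)
    print(f"{label}: {v_stock:.1f} ul + {v_buffer:.1f} ul HBS-P")
```

Each step is a 100-fold dilution, so if the stock is taken to be neat DMSO, the working solution carries 1% DMSO (v/v) and the 100 nM solution 0.01%.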
Purified AR LBD was diluted into HBS-P buffer to a concentration of 10 µM and injected into all four flow cells using the Kinject protocol at a flow rate of 10 µl/min (contact time 360 s, dissociation time 360 s). Following the dissociation phase, the surface of the chip was regenerated to remove residual AR LBD by QuickInject of buffer containing 10 mM HEPES and 50% ethylene glycol (pH 11). Following the establishment of a stable baseline, the same procedure was repeated using a series of AR LBD dilutions (5 µM, 1 µM, and 300 nM) in an iterative manner. Analysis of the data was performed using BIAevaluation 3.0 software (Biacore). The SMRT2 signals were subtracted as background from the three remaining peptide signals. Data were best fit using the two-state conformational change model (Warnmark et al. 2001, 2002).
Crystallization, data collection, and refinement. Purified, concentrated AR LBD was combined with a 3- to 6-fold molar excess of peptide and incubated for 1 h at room temperature before crystallization trials. Complexes were crystallized using the hanging drop vapor diffusion method. Protein-peptide solution was combined in a 1:1 ratio with a well solution consisting of 0.6-0.8 M sodium citrate and 100 mM Tris or HEPES buffer (pH 7-8). Crystals typically appeared after 1-2 d, with maximal size attained within 2 wk. For data collection, crystals were swiped into a cryo-protectant solution consisting of well solution plus 10% glycerol before flash freezing in liquid nitrogen. The addition of ethylene glycol to a well concentration of 10%-20% was later found to both improve crystal quality and enable the freezing of crystals directly out of the drop.
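For the surface plasmon resonance fits described above, the two-state conformational change model is commonly written as A + B ⇌ AB ⇌ AB*, with both bound states contributing to the measured response. The sketch below simulates one injection under that scheme with simple Euler integration. It is not the BIAevaluation fitting routine, and the rate constants and Rmax are hypothetical placeholders, not values from this study; only the 360 s contact and dissociation times and the analyte concentrations come from the text.

```python
def apparent_kd(ka1, kd1, ka2, kd2):
    """Overall equilibrium dissociation constant (M) for A + B <-> AB <-> AB*."""
    return (kd1 / ka1) * (kd2 / (ka2 + kd2))


def simulate_two_state(conc, ka1, kd1, ka2, kd2, rmax,
                       contact_s=360.0, dissoc_s=360.0, dt=0.05):
    """Euler-integrate the two-state model; return (times, total response).

    conc is the analyte concentration in M, ka1 is in 1/(M s), the other
    rate constants are in 1/s, and rmax and the response are in RU.
    """
    ab = 0.0   # initial complex AB
    abx = 0.0  # rearranged complex AB*
    times, response = [0.0], [0.0]
    n_contact = round(contact_s / dt)
    n_steps = round((contact_s + dissoc_s) / dt)
    for step in range(n_steps):
        c = conc if step < n_contact else 0.0  # buffer only after the injection
        free = rmax - ab - abx                 # unoccupied ligand capacity
        d_ab = ka1 * c * free - kd1 * ab - ka2 * ab + kd2 * abx
        d_abx = ka2 * ab - kd2 * abx
        ab += d_ab * dt
        abx += d_abx * dt
        times.append((step + 1) * dt)
        response.append(ab + abx)
    return times, response


# Hypothetical rate constants and Rmax, for illustration only.
RATES = dict(ka1=1e4, kd1=0.05, ka2=0.01, kd2=0.005)
for conc in (10e-6, 5e-6, 1e-6, 300e-9):
    _, resp = simulate_two_state(conc, rmax=100.0, **RATES)
    print(f"{conc * 1e6:5.2f} uM: peak response {max(resp):6.2f} RU")
print(f"apparent KD = {apparent_kd(**RATES):.2e} M")
```

Because the second step traps part of the complex in the AB* state, the apparent dissociation constant is lower than kd1/ka1 alone, which is why a single-site model can fit such sensorgrams poorly.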
Datasets were collected at 100 K at the Advanced Light Source (Lawrence Berkeley Laboratory, Berkeley, California, United States), beamline 8.3.1, with either an ADSC Quantum 315 or Quantum 210 CCD detector. Data were processed using Denzo and Scalepack (Otwinowski and Minor 1997). Molecular replacement searches were performed with rotation and translation functions from CNS (Brunger et al. 1998). Initial searches for AR-FxxLF were performed using the structure of AR-R1881 (PDB: 1E3G) with R1881 omitted from the search model. Subsequent searches for all other complexes were performed using the refined LBD structure from the AR-FxxLF complex. To minimize the possibility of model bias, FxxLF peptide and DHT were omitted from all molecular replacement searches. Protein models were built by iterative rounds of simulated annealing, conjugate gradient minimization, and individual B-factor refinement in CNS followed by manual rebuilding in Quanta 2000 (Accelrys, San Diego, California, United States) using σA-weighted 2Fo − Fc, Fo − Fc, and simulated annealing composite omit maps. Superposition of structures was performed with LSQMAN (Kleywegt 1996). Buried surface area calculations were performed with CNS. All figures were generated with PyMOL (DeLano 2002). Coordinates and structure factors for all complexes have been deposited in the Protein Data Bank. Accession numbers are listed in Table 2.