Evaluating Caveolin Interactions: Do Proteins Interact with the Caveolin Scaffolding Domain through a Widespread Aromatic Residue-Rich Motif?

Caveolins are coat proteins of caveolae, small flask-shaped pits of the plasma membranes of most cells. Aside from roles in caveolae formation, caveolins recruit, retain and regulate many caveolae-associated signalling molecules. Caveolin-protein interactions are commonly considered to occur between a ∼20 amino acid region within caveolin, the caveolin scaffolding domain (CSD), and an aromatic-rich caveolin binding motif (CBM) on the binding partner (фXфXXXXф, фXXXXфXXф or фXфXXXXфXXф, where ф is an aromatic and X an unspecified amino acid). The CBM resembles a typical linear motif - a short, simple sequence independently evolved many times in different proteins for a specific function. Here we exploit recent improvements in bioinformatics tools and in our understanding of linear motifs to critically examine the role of CBMs in caveolin interactions. We find that sequences conforming to the CBM occur in 30% of human proteins, but find no evidence for their statistical enrichment in the caveolin interactome. Furthermore, sequence- and structure-based considerations suggest that CBMs do not have characteristics commonly associated with true interaction motifs. Analysis of the relative solvent accessible area of putative CBMs shows that the majority of their aromatic residues are buried within the protein and are thus unlikely to interact directly with caveolin, but may instead be important for protein structural stability. Together, these findings suggest that the canonical CBM may not be a common characteristic of caveolin-target interactions and that interfaces between caveolin and targets may be more structurally diverse than presently appreciated.


Introduction
Caveolins are a family of cholesterol-binding membrane proteins (caveolin-1, -2 and -3) that coat the intracellular surface of caveolae, small flask-shaped pits (50-100 nm in diameter) that form at the plasma membrane of most cells [1][2][3][4]. Aside from roles in caveolae formation and stability, caveolins interact with many caveolae-localized signalling molecules including heterotrimeric G proteins, Src family tyrosine kinases, phosphoinositide 3-kinase, integrins, epidermal growth factor receptor (EGFR), H-Ras, endothelial nitric oxide synthase (eNOS) and a number of ion channels [3,5]. Interaction with caveolin, which appears to be important in protein recruitment to caveolar domains and thus the formation of microenvironments rich in interacting signalling molecules, is commonly believed to be mediated via a ∼20 amino acid N-terminal region on the caveolin molecule known as the caveolin scaffolding domain (CSD) and an aromatic-rich caveolin binding motif (CBM) on the associated protein [6,7]. Paradoxically, association with caveolin typically suppresses activity in the targeted protein [6,7], suggesting that recruitment to caveolae might hamper and not enhance signalling efficiency (the so-called 'caveolar paradox'). This paradox has been largely resolved for eNOS whereby interaction with caveolin under basal conditions maintains an inactive enzyme and compartmentalization of eNOS in caveolae ensures a rapid response upon stimulation [8].
Interactions between caveolin and other proteins, however, remain poorly understood in terms of physiology, modes of binding/suppression and the mechanisms that regulate interaction.
Since the original definitions of the CSD and CBM, an increasing number of studies have suggested that interactions between caveolin and target need not necessarily involve both regions. Association of caveolin with NOSTRIN [9], cyclooxygenase-2 [10], high affinity nerve growth factor receptor (Trk [11]), growth factor receptor-bound protein 7 (Grb7 [12]) and insulin receptor substrate 1 (IRS1 [13]) are all thought to occur independently of the CSD. Furthermore, in some cases interactions appear to occur via multiple distinct caveolin domains. For example, interaction with protein kinase A is dependent on either the CSD or C-terminal domain (amino acids 135-178) of Cav-1 [14]. Dynamin-2, endothelin-B, connexin-43 and Rab5 also interact with multiple distinct regions of Cav-1 [15][16][17][18].
Target association with the caveolin scaffolding domain is mainly proposed to occur via the caveolin binding motif (CBM) on the binding partner. The original definition of the CBM arises from the work of Couet et al., who obtained random peptides binding to the CSD by phage display [6]. The peptides obtained were statistically enriched in tryptophan (decapeptides and 15-mers) or other aromatic amino acids (15-mers). Noting that certain separations of aromatic residues were particularly common, the authors identified a 16-residue portion of the bovine Gi2α subunit (the GP peptide) which bound to CSDs from caveolin-1 and -3 and much less so to caveolin-2. When all four aromatic residues were simultaneously mutated to Ala or Gly the interaction was lost. Based on this finding three CBM variants were defined, each containing three or four aromatic residues separated by unspecified amino acids (CBMs: ΦXΦXXXXΦ, ΦXXXXΦXXΦ or ΦXΦXXXXΦXXΦ, where Φ is an aromatic amino acid), and shown to occur in known or possible caveolin binding proteins.
Although the notion of these aromatic-rich motifs has figured prominently in the literature, the fact that the four aromatic positions in the caveolin binding peptide were not independently mutated means that there is no reason to suppose that all four should invariably be present in CBM sequences. Equally, the quadruple mutation would be expected to have dramatic effects on any tertiary structure that the GP peptide might have, raising doubts as to whether the aromatic residues function in direct binding or have an indirect role in stabilising the active peptide conformation.
There are several cases where binding to caveolin occurs entirely independently of a typical CBM. For example, Sprouty protein 1, which lacks a CBM, binds Cav-1 via its conserved cysteine-rich C-terminal domain, an interaction which is completely eliminated by a single amino acid exchange mutation (R252D [19]). Hepatocyte cell adhesion molecule (hepaCAM) binds Cav-1 via the first immunoglobulin domain, which also lacks a traditional CBM [20]. Binding of Cav-1 to DNA-binding protein inhibitor ID-1 occurs via a helix-loop-helix domain, a region lacking a typical CBM [21]. The catalytic domain of protein kinase A (PKAcat), nerve growth factor receptor, and sterol carrier protein also bind Cav-1, despite lacking CBM sequences [21]. Furthermore, there are also cases of proteins containing CBMs that do not bind to caveolin: both RhoA and RhoB have identical CBM sequences, yet only the former localises with Cav-1 in caveolae [22]. Likewise, an 'incomplete CBM' is also found in low molecular weight protein tyrosine phosphatase ( 78 ITKEDFATF 86 ) but is not recognised as the binding site for Cav-1 [23]. Together, these findings suggest that the CBM, like the CSD (see above), is not necessarily required for all caveolin interactions. At this point it should be noted that, although many caveolin binding proteins have been described, in many cases it is unclear if these are direct interactions or whether they are facilitated indirectly via intermediary molecules of a larger caveolin-containing complex. Thus, it is possible that regions predicted to be crucial for caveolin interaction (including sequences resembling CBMs) may function by binding intermediary molecules which then recruit caveolin.
The CBM, as proposed, is a prime example of a short, linear motif (SLiM), a simple sequence that would have independently evolved many times in different proteins for a specific function, in this case binding to the CSD. Until recently the fundamental role of such motifs in mediating the protein-protein interactions underlying cellular regulation and signalling has been underappreciated. Such SLiMs have presented significant bioinformatics challenges. However, recent years have seen major advances in detection of interaction motifs through their over-representation in interactome sequences [24][25][26], benefiting especially from knowledge that SLiMs tend to be conserved and positioned preferentially in intrinsically disordered parts of proteins [26]. Other recently developed methods use these criteria and others, such as predicted solvent exposure and secondary structure [27] or energetic factors [28], to predict potential motifs in single sequences. Weatheritt et al. [29] have also described a method to identify SLiM interaction interfaces for both interacting proteins. Here we exploit these recent improvements in bioinformatics techniques available for the study of linear motifs to critically examine the role of aromatic-rich CBMs in caveolin interactions. We assess their frequency of occurrence in the human proteome, their statistical enrichment in the caveolin interactome and shared characteristics with other known interaction motifs. We examine the relative solvent accessible area (RSA) of the CBM aromatic residues for Cav-1 interaction partners in solved crystal structures and homology models to assess the likelihood that the conserved aromatics are available for direct binding of proteins. Finally, we calculate the predicted ΔΔG free energy stability change resulting from point mutations of the aromatic residues to examine their role in protein stability.
Our findings suggest that the CBM, despite its prevalence in the caveolin literature, is not required for all caveolin interactions and may in fact only be genuinely implicated in a small minority of cases. This conclusion is significant for future caveolin research.

Experimental Evidence Regarding CBMs as Mediators of Caveolin Interaction
Aromatic-rich putative CBMs have been identified in numerous caveolin associated molecules (Table 1). In some work large aliphatic residues such as Leu are considered as substitutes for the aromatic positions (Table 2). For a few proteins there is some supporting evidence demonstrating that the putative CBM mediates interaction with the CSD (i.e. targeted mutation of the CBM disrupts caveolin binding). For example, deletion of the entire CBM ( 1130 YNMLCFGIY 1138 ) of the large conductance, voltage- and Ca2+-activated potassium channel α-subunit (Slo1) causes ∼80-85% loss of Slo1-Cav-1 association [30]. Some authors have also reported active roles for the individual aromatic residues of CBMs. For example, simultaneous mutation of all three aromatics ( 376 WSFAVLLW 383 ) in the integrin-linked protein kinase abolishes Cav-1 binding [31]. Two serine/threonine-protein kinase receptor R3 CBM mutants (W406A; F401G and W406A) also exhibit substantial reduction in co-immunoprecipitation with Cav-1 [32]. Kong et al. [33] created several D(1A) dopamine receptor mutants with disrupted proximal, central and distal CBM aromatic residues which exhibited reduced binding affinity for caveolin. Point mutation of just one or all three CBM aromatics of ephrin type-B receptor 1 (EphB1) also severely reduced receptor co-immunoprecipitation with Cav-1 [34]. Glucagon-like peptide 1 receptor also fails to interact with caveolin following mutation of two tyrosine residues within the motif [35]. Site-directed mutagenesis of metabotropic glutamate receptor, 3-phosphoinositide-dependent protein kinase 1 (PDK1), phosphatidylinositol-3,4,5-trisphosphate 3-phosphatase and dual-specificity protein phosphatase (PTEN), and sialidase also suggests that interaction with Cav-1 is mediated by the CBM [36][37][38][39].
Similarly, haem-oxygenase-1 possesses an incomplete CBM motif ( 227 FLLNIQLF 234 ) and completely loses affinity for the Cav-1 CSD following mutation of the motif's two Phe residues (F227 and F234 [40]). However, in the main, there seems to be little unambiguous evidence that these motifs, and crucially the positioning of their aromatic amino acid residues, are generally required for caveolin interactions. Several examples were mentioned in the introduction of proteins in which caveolin interaction has proved to be independent of any CBM-like sequence. In other examples, mutagenesis of putative CBMs fails to show a substantial effect on caveolin interaction. For example, a W1227T mutant that disrupts the CBM of the insulin receptor ( 1220 WSFGVVLW 1227 ) still exhibits significant interaction with Cav-1 [41]. Moreover, simultaneous mutation of the CBM aromatic residues Y42A and W45A of the multidrug resistance protein-1 (MDR1) only diminishes interaction with Cav-1 by 27% [42]. It seems highly unlikely that the MDR1 CBM could still function as such a potent interface for Cav-1 binding while possessing just one remaining functional motif residue, which strongly implicates non-CBM residues as the mediators of Cav-1 binding. Furthermore, individual F589L and W592L mutations of the neuronal nitric oxide synthase (nNOS) CBM resulted in only slight reductions of the Cav-1 inhibitory effect (IC50 values of 3.5 and 3.0 μM respectively compared to 1.8 μM for the wild-type protein), suggesting that the motif is also not essential for Cav-1 binding to nNOS [43]. Similarly, despite deletion of the Slo1 CBM greatly reducing Cav-1 interaction, individual point mutation of the aromatics within the motif has a less obvious effect on binding [30]. Whereas F1135A or Y1138A mutations decrease Cav-1-Slo1 association by only ∼15% each, Y1130A increases the interaction by ∼40%.
Furthermore, a triple mutation, where all aromatics were mutated, had practically no impact on Cav-1-Slo1 association, suggesting that the mutations had an additive effect and also indicating that other residues within or around the motif stabilize the interaction [30]. The idea that neighbouring residues can also be important is supported by Syme et al. [35], who demonstrated that interaction between Cav-1 and the glucagon-like peptide 1 receptor was inhibited by mutation of two aromatics within the proposed CBM (Y250/252A) but also by mutation of a nearby glutamate residue (E247A) outside of the CBM. However, Brainard et al. [44] have published evidence contradicting that of Alioua et al. [30], demonstrating that mutation of all three of the Slo1 CBM aromatics is sufficient to completely abolish Cav-1 interaction.
Clearly, the existence of a putative CBM sequence in a protein which binds to caveolin offers no guarantee of its involvement in binding. Nevertheless, as previously discussed, there are several examples where mutagenesis of the putative CBM leads to altered behaviour, although these cases are in the minority (Table 1). Unfortunately, it is not common practice to verify the folded state of the mutant protein. We consider below whether mutation of aromatic residues in putative CBM sequences may affect function through destabilisation of the protein fold, rather than the binding role often inferred.

CBM Sequences are Abundant in the Human Proteome
The more specific a motif, the less frequently it will arise by chance during evolution. Conversely, very simple motifs will arise frequently by chance so that discovering that one is commonly found among a group of functionally related proteins (the caveolin interactome, for example) becomes less significant in itself. We therefore searched the human proteome for the CBM motifs. Strikingly, this analysis shows that ∼30% of all proteins contain at least one instance of a ΦXΦXXXXΦ or ΦXXXXΦXXΦ sequence. This number increases to 69% by allowing substitution of either I or L at one of the aromatic positions. It is highly likely that the majority of these proteins have no interaction with caveolin and that many proteins possess putative CBMs by chance. Consequently, identification of a CBM within a protein may not be strong evidence to suggest a direct interaction with caveolin.
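The kind of proteome scan described above can be sketched with regular expressions. This is a minimal illustration, not the search procedure actually used in the study: it assumes Φ means Phe, Trp or Tyr, and the names `CBM_VARIANTS`, `find_cbms` and `fraction_with_cbm` are hypothetical.

```python
import re

PHI = "[FWY]"  # assumed aromatic set: Phe, Trp, Tyr
X = "."        # any residue

# The three CBM variants quoted in the text, written as regular expressions.
CBM_VARIANTS = {
    "PhiXPhiXXXXPhi": f"{PHI}{X}{PHI}{X * 4}{PHI}",
    "PhiXXXXPhiXXPhi": f"{PHI}{X * 4}{PHI}{X * 2}{PHI}",
    "PhiXPhiXXXXPhiXXPhi": f"{PHI}{X}{PHI}{X * 4}{PHI}{X * 2}{PHI}",
}


def find_cbms(sequence):
    """Return (variant, 1-based start, matched text) for every CBM-like hit."""
    hits = []
    for name, pattern in CBM_VARIANTS.items():
        # Wrapping the pattern in a lookahead reports overlapping matches too.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits


def fraction_with_cbm(proteome):
    """Fraction of sequences in an {id: sequence} mapping with at least one hit."""
    if not proteome:
        return 0.0
    with_hit = sum(bool(find_cbms(seq)) for seq in proteome.values())
    return with_hit / len(proteome)


# The insulin receptor and Slo1 motifs quoted earlier match one variant each.
print(find_cbms("WSFGVVLW"))
print(find_cbms("YNMLCFGIY"))
```

Because each pattern fixes only three or four positions out of eight to eleven, chance matches are expected to be frequent in any large sequence set, which is the point the proteome-wide frequency makes.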

Aromatic-containing Motifs are not Significantly Enriched in the Caveolin Interactome
The high frequency of the CBM motifs in the human proteome does not, of course, mean that they may not serve in some proteins for interaction with caveolin. If that were the case, a statistically higher occurrence of the CBM motifs in the caveolin interactome, compared to proteins in general, would be expected. We therefore used the web-based short linear motif (SLiM) discovery service, SLiMFinder, to search for any over-represented motifs (CBM-like or novel) among the Cav-1 interactome. The complete Cav-1 interactome used in this study can be found as supporting data (Table S1). SLiMFinder is a probabilistic web server program for identification of SLiMs in proteins with a common attribute (such as a common interaction partner) and for estimating the probability of returned motifs arising by chance [25,45]. Caveolin 1 was chosen for this analysis since, compared to the other two isoforms, it has the most abundant interaction data. The available interactome data for Cav-2 and Cav-3 were considered too small to derive statistically meaningful information and were therefore not included in this study. The sequences of 135 proteins with multiple experimentally-demonstrated interactions with Cav-1 were collected by surveying databases such as IntAct v.3.1, BioGrid 3.1, and APID-beta and from the literature. The SLiMFinder web-server was run on this dataset, altering search parameters in order to ensure that motifs matching the original CBM definitions would be returned if statistically significantly enriched. SLiMFinder returned just one SLiM ([ST].[LV]$; where $ represents the C-terminus) below the default significance threshold of 0.05 [45]. This was present in only 11 proteins and is an already known motif (LIG_PDZ_Class_1 in the ELM database [46]) specifying interaction with PDZ domains. Even restricting the dataset to the 64 proteins identified in the literature as containing a CBM failed to return any motifs resembling the CBMs.
Furthermore, CBM-like or aromatic-rich motifs were not returned for either data set even at higher, non-significant e-values (up to a threshold cut-off of 0.99).
As SLiMs tend to occur in disordered regions of proteins [47], the SLiMFinder webserver, by default, masks out regions predicted to be ordered by IUPred [48] which thus excludes them from further analysis and improves performance. Consequently, CBMs which occur in domains with predicted higher order (e.g. the tyrosine kinase domain of insulin receptor [49] and catalytic domain of protein kinase A [14]) are likely removed from the motif discovery process. To see if their inclusion affected motif discovery, disorder masking was deactivated and a SLiMFinder run was repeated for the datasets. However, CBM-like motifs were once again absent from the list of statistically significant and insignificant motifs. This suggests that the aromatic-rich CBMs are not statistically over-represented in proteins known to interact with Cav-1.
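The logic of an enrichment test can be illustrated with a one-sided hypergeometric calculation. This is a deliberately simpler stand-in and not SLiMFinder's own statistical model; the function name and all counts in the usage example are hypothetical placeholders, not values from this study.

```python
from math import comb


def hypergeom_enrichment_p(population, population_hits, sample, sample_hits):
    """P(X >= sample_hits) when `sample` proteins are drawn without replacement
    from `population` proteins, of which `population_hits` carry the motif."""
    upper = min(sample, population_hits)
    total = comb(population, sample)
    # math.comb returns 0 when k > n, so impossible terms drop out on their own.
    tail = sum(
        comb(population_hits, k) * comb(population - population_hits, sample - k)
        for k in range(sample_hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: a 20,000-protein background with 30% motif carriers,
# and a 135-protein interactome in which 41 proteins carry the motif.
p_value = hypergeom_enrichment_p(20000, 6000, 135, 41)
print(f"one-sided p = {p_value:.3f}")
```

A motif carried by roughly the same fraction of interactors as of background proteins gives a large p-value under such a test, which is the pattern one would expect if CBM-like sequences are not over-represented.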

CBMs Identified in the Literature Lack the Characteristics of SLiMs
Most SLiMs share a set of characteristics including a tendency to be located in surface accessible intrinsically disordered regions, a high degree of conservation relative to the local background sequence, and a tendency to contain residues with greater likelihood to undergo order-disorder transitions [45,47]. It is therefore possible to computationally predict regions where motifs are likely to occur from a protein's primary sequence. We therefore applied SLiMPred, a recent de novo web-based programme designed to predict SLiMs from both ordered and disordered protein sequences independently of experimentally defined homologues and interactors [27], to see if putative CBMs coincide with regions predicted to have these SLiM-like characteristics. The analysis was limited to include only proteins with experimental evidence to suggest that the CBM is involved in binding to caveolin. The SLiMPred algorithm bases its predictions on annotated instances from the Eukaryotic Linear Motif database, as well as structural, biophysical, and biochemical features derived from the protein's primary sequence, and assigns each residue of a protein a probability value between 0 and 1, with residues scoring closer to 1 most likely belonging to a SLiM. A threshold for residues to be considered a SLiM residue was set at 0.1, at which there exists a balance between a reasonable true-positive and a low false-positive rate (44 and 22% respectively [27]). Values for CBM aromatic residues are listed in Table 3. Even with such a low cut-off point, only ∼36% of CBM aromatic residues were predicted to be part of a motif and thus able to facilitate protein-protein interactions (Table 3). In only four of 28 CBM cases were all three aromatic residues so predicted, while in 13 cases none of the three aromatic residues gave a positive prediction. Furthermore, of the CBM aromatic residues which were scored highly by SLiMPred, 68% coincide with known functional motifs unrelated to caveolin binding.
For example, SLiMPred matched Y786 and F789 of the short transient receptor potential channel 1 (TrpC1) CBM to the previously discovered TRG_ENDOCYTIC_2 motif, which is a tyrosine-based sorting signal responsible for interaction with the mu-subunit of the AP (adaptor protein) complex. It is not however known whether this is a functional motif for TrpC1. SLiMPred scores for the entire stretch of CBM residues, including non-defined and non-functional positions, are available as supplementary information (Table S2). Overall, these tests indicate that most published examples of CBMs in proteins binding caveolin lack the characteristics of known functional SLiMs.
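The thresholding step applied to the per-residue scores can be sketched as follows. This is an illustration only: the function names are hypothetical, the scores in the example are invented, and treating a score equal to the 0.1 cut-off as a positive prediction is an assumption.

```python
SLIM_THRESHOLD = 0.1  # cut-off quoted in the text for calling a SLiM residue


def predicted_aromatics(scores, aromatic_positions, threshold=SLIM_THRESHOLD):
    """Return the aromatic positions whose per-residue score reaches the threshold.

    scores: dict mapping residue number -> score between 0 and 1.
    """
    return [pos for pos in aromatic_positions if scores[pos] >= threshold]


def summarise(cbms, threshold=SLIM_THRESHOLD):
    """cbms: list of (scores, aromatic_positions) pairs, one per putative CBM."""
    total = predicted = all_predicted = none_predicted = 0
    for scores, positions in cbms:
        hits = predicted_aromatics(scores, positions, threshold)
        total += len(positions)
        predicted += len(hits)
        all_predicted += len(hits) == len(positions)
        none_predicted += len(hits) == 0
    return {
        "fraction_predicted": predicted / total if total else 0.0,
        "cbms_all_predicted": all_predicted,
        "cbms_none_predicted": none_predicted,
    }


# Invented scores for two putative CBMs with aromatics at positions 1, 3 and 8.
example = [
    ({1: 0.50, 3: 0.20, 8: 0.15}, [1, 3, 8]),
    ({1: 0.01, 3: 0.02, 8: 0.03}, [1, 3, 8]),
]
print(summarise(example))
```

The three summary values correspond to the quantities reported above: the share of aromatic residues predicted to lie in a motif, and the number of CBMs in which all or none of the aromatics are so predicted.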

CBM Aromatic Residues are Mostly Unavailable for Caveolin Interaction
The aromatic residues of the defined CBMs are largely hydrophobic, especially Phe, and so are most commonly found buried in the structural core of proteins. Surface exposure of such residues to allow interaction with other molecules is known, as in carbohydrate-binding proteins for example [50], but is uncommon. For the CBM sequence, and specifically the aromatic residues, to function in situ within the Gi2α protein for binding caveolin (as first described by Couet et al. [6]), it and they must be accessible for interaction. The nearest relatives of the Gi2α protein with known structures are rat and human Gi1α sequences, which are sequence-identical in the vicinity of the CBM. Figure 1 shows the position of the CBM in the highest resolution structure of a native Gi1α sequence (rat Gi1α PDB code 1CIP; [51]). The motif adopts a β-hairpin structure, extensively hydrogen bonded to a third strand. Of the four aromatic positions, only the second and fourth are significantly solvent-exposed, and their positions on opposite sides of the hairpin ensure that simultaneous interaction of both with caveolin is unlikely (Figure 1). Clearly, in the conformation captured by crystallography, two of the four aromatic residues are unavailable for inter-molecular interaction.
Although substantial conformational changes in the region are rendered unlikely by the embedding of the β-hairpin structure in a three-strand β-sheet, we sought evidence that such a transformation is possible in two ways: by assessing conformational variability among other structures and by conformationally simulating the main modes of dynamics using an elastic network model [52]. Figure S1 shows a comparison of all available rat and human Gi1α structures in the CBM region, showing that the position of the aromatic residues is essentially the same in each. Figure S2 shows the same region in a broader selection of G proteins in which at least three of the four aromatic positions are present. Again, the β-hairpin and three-stranded sheet are structurally conserved and where aromatic residues are found at positions corresponding to those in Gi1α they are similarly generally buried. Finally, we predicted the major conformational modes of Gi1α using the AD-ENM server. None of the largest 10 predicted motions impacts significantly on the CBM and the β-hairpin. For illustration, the motion leading to the largest structural variation in the motif region (eigenvector 8) is shown in Figure S3 where its maximum and minimum projections are superimposed on the crystal structure. Once again the hydrogen-bonding between the β-hairpin and third strand is stable, ensuring that all aromatic residues maintain similar, largely buried conformations. Side chains are not treated by the AD-ENM analysis. These considerations lead us to conclude that it is difficult to imagine interaction of CSD with the CBM in Gi2α, as visualised crystallographically, involving more than one or two of its aromatic residues. Furthermore, there is no apparent support for the idea that the region is particularly conformationally flexible and thus capable of adopting radically different structures in which multiple aromatics would be suitably exposed and arrayed for interaction with the CSD.
Moreover, the crystal structures of other known caveolin binding proteins with proposed functional CBMs (EGFR, insulin receptor, integrin-linked protein kinase, PDK1, PTEN and Slo1) also suggest that CBM residues are largely buried (Figure S4). To see how general an issue accessibility could be for the CBM hypothesis, we measured solvent accessibility of aromatic residues in CBMs of Cav-1 interacting proteins in situ. Experimental structures of the proteins were preferentially used for this analysis. For proteins where structures were unavailable, homology models were used if template availability allowed. The relative solvent accessible areas (RSAs) of the CBM aromatic residues were calculated as previously described ([53]; see Methods for details) and are listed in Table 4. It is worth noting that these values will in some cases be overestimates of solvent accessibility since some experimental structures will be of isolated domains, not complete proteins, and some homology models may also be incomplete. For instances where no structure was available, SABLE [54] was used to estimate the RSA (Table 5). The resulting data (Tables 4 and 5) strongly indicate that the majority of CBM aromatic residues are buried (RSA < 20% [53]) within the protein, and are thus unavailable for interaction directly with caveolin or with a third protein mediating an indirect interaction with caveolin. Notably, for the data set including experimental and model structures, only three out of 57 CBMs were predicted to contain three solvent-exposed aromatic residues, those of insulin-like growth factor-binding protein 3 (IBP-3), Kv1.3 and Kv1.5 (Table 4). Table 4 also shows the secondary structure at each of the aromatic positions within the putative CBMs. It is notable that the secondary structure context varies widely, contrary to what would be expected if each of these sequences bound to caveolin in a similar manner.
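The RSA classification used above can be sketched as a ratio of a residue's measured accessible surface area to a reference maximum for that residue type. This is an illustration under stated assumptions: the reference maxima in `MAX_ASA` are assumed values chosen for the example (the normalisation used in [53] may differ), and the function names are hypothetical. Only the 20% burial cut-off comes from the text.

```python
# Assumed reference maxima (in square angstroms) for the three aromatic
# residues; the normalisation used in the cited method may differ.
MAX_ASA = {"F": 240.0, "W": 285.0, "Y": 263.0}

BURIED_BELOW = 20.0  # RSA cut-off (%) quoted in the text


def relative_solvent_accessibility(residue, asa):
    """RSA (%) = 100 * measured accessible area / reference maximum."""
    return 100.0 * asa / MAX_ASA[residue]


def is_buried(residue, asa, cutoff=BURIED_BELOW):
    """True when the residue's RSA falls below the burial cut-off."""
    return relative_solvent_accessibility(residue, asa) < cutoff


def count_exposed(aromatics):
    """aromatics: list of (residue letter, measured ASA) pairs for one CBM."""
    return sum(not is_buried(res, asa) for res, asa in aromatics)


# Invented areas for a three-aromatic motif: one exposed Trp, two buried Phe.
motif = [("W", 142.5), ("F", 24.0), ("F", 12.0)]
print(count_exposed(motif))
```

Counting exposed aromatics per motif in this way is what allows a statement such as "only three out of 57 CBMs contain three solvent-exposed aromatic residues" to be made.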
The burial of CBM aromatic positions, rendering them unavailable for interaction, apparently conflicts with the findings of the numerous authors discussed earlier who demonstrate that CBM mutation severely disrupts protein interactions with caveolin. However, in these examples, data are very rarely presented to demonstrate that the protein folding is unaffected by the mutation. This offers an alternative explanation for situations in which aromatic residues are buried and unavailable for interaction yet their mutation affects interaction with caveolin: the aromatic residues are critical for protein stability [53] and their mutation leads to destabilisation of the protein fold and knock-on effects on the caveolin interface. We used PoPMuSiC, which accurately predicts values of ΔΔG free energy stability change resulting from point mutations [55,56], to anticipate the potentially deleterious effects of CBM aromatic substitution with alanine, the most common mutation experimentally chosen. In nearly all cases, mutation of a buried CBM aromatic had a predicted significant destabilizing effect on the protein (>2.0 kcal/mol; Table 6 [57]). Considering that the majority of CBM aromatics are buried, it is likely that experimental mutation of these residues would impair protein stability, which may explain observed abrogation of caveolin interaction in some or even most cases. Indeed, most CBMs are highly conserved in sequence (Table S3), consistent with the idea that their aromatic residues are important determinants of protein structure. Some experimental data support this idea. For example, mutations of insulin receptor CBM aromatics result in poorly expressed mature constructs at the cell surface, impaired autophosphorylation, and accelerated degradation of the proreceptor [41,58-60], which is consistent with the notion of buried aromatics being important structural factors.
F313A and W318A mutation of the putative D(1A) dopamine receptor CBM resulted in a protein with similar pharmacological properties and surface expression as the wild-type receptor, but which had lost its ability to bind to Cav-1 [33]. Whereas these two amino acids are relatively exposed (RSAs of 31 and 22% respectively; Table 4) and may contribute to a real binding site for caveolin, the final aromatic position of this CBM, W321, is deeply buried (RSA = 3%) and its mutation to alanine is consequently predicted to have the strongest destabilizing effect of the three aromatics (3.33 kcal/mol). Accordingly, Kong et al. [33] reported that the W321A mutant exhibited strongly attenuated surface expression and pharmacological activity, indicative of protein misfolding. Thus, it is unlikely that all three of the CBM aromatics participate in the interaction with caveolin. Furthermore, mutation of nNOS F589 and W592 residues to Leu only partially abrogates interaction with Cav-1 [43]. Suggestively, such mutations are predicted to have a less severe destabilising effect (1.04 and 1.36 kcal/mol for F589L and W592L respectively) than mutation to alanine, which may explain the retained Cav-1 binding.
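The reasoning in this section combines two cut-offs: burial (RSA below 20%) and predicted destabilisation (ΔΔG above 2.0 kcal/mol). A minimal sketch of that decision logic follows. The function name and the wording of the returned labels are hypothetical, and the labels are interpretive shorthand rather than categories defined in this study; the W321A and nNOS ΔΔG values are those quoted above.

```python
RSA_BURIED_BELOW = 20.0        # % RSA cut-off quoted in the text
DDG_DESTABILISING_ABOVE = 2.0  # kcal/mol cut-off quoted in the text


def interpret_mutation(rsa_percent, ddg_kcal_mol):
    """Suggest how loss of caveolin binding after a mutation might be read."""
    buried = rsa_percent < RSA_BURIED_BELOW
    destabilising = ddg_kcal_mol > DDG_DESTABILISING_ABOVE
    if buried and destabilising:
        return "buried and destabilising: fold disruption is a likely explanation"
    if buried:
        return "buried, mild predicted effect: direct contact still unlikely"
    if destabilising:
        return "exposed but destabilising: check folding before inferring contact"
    return "exposed, mild predicted effect: candidate contact residue"


# D(1A) dopamine receptor W321A, using the RSA and ddG quoted in the text.
print(interpret_mutation(rsa_percent=3.0, ddg_kcal_mol=3.33))

# The two nNOS leucine substitutions fall below the destabilisation cut-off.
for name, ddg in {"F589L": 1.04, "W592L": 1.36}.items():
    print(name, ddg > DDG_DESTABILISING_ABOVE)
```

Such a rule of thumb does not replace an experimental check of folding, but it flags the mutants for which a binding defect should not be attributed to a lost contact without one.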
Figure 1. [...] [6] seen in the rat Gi1α protein (PDB code 1CIP; [51]). The β-hairpin structure of the motif is shown as a cartoon, coloured from blue to red, and the aromatic residues drawn as sticks (Phe189 is blue, Phe191 is cyan, Phe196 is yellow and Phe199 is red). The third strand of the three-stranded sheet to which the motif belongs is also shown in pink. The remainder of the protein is shown as lines and surface, the latter coloured green where contributed by side chains of the aromatic residues. doi:10.1371/journal.pone.0044879.g001
Leukemia inhibitory factor receptor: F323 23.8 (E); F328 7.6 (E)
Although we assert that the general burial of putative CBMs in known and model structures argues against their having functionality, there is the possibility of CBM sequences exerting their function before the protein in which they are embedded achieves its final conformation. Thus, Wyse et al. [61] demonstrated that, despite not forming a complex with caveolin in the caveolae, expression of Cav-3 and an intact CBM of type 1 receptor for angiotensin II (AT1-R) are critical for the correct trafficking and localisation of the receptor to the cell surface, as AT1-R is found exclusively in the ER in caveolin-deficient cells and following mutation of each CBM aromatic. This was explained by Cav-3 binding to AT1-R during the initial stage of AT1-R maturation in the ER, and serving as a chaperone to shuttle the receptor to the plasma membrane. Although only one of the CBM aromatics is exposed in the mature receptor (F304; Table 4), the CBM as a whole may be in a more accessible conformation within the ER before the receptor reaches its final natively folded structure. Caveolin has also been identified as a transport chaperone for glycosylphosphatidylinositol-anchored proteins, which are only surface expressed in the presence of Cav-1 or Cav-3 [62]. Interestingly, the CBM is reminiscent of another possible motif recognised by the chaperone BiP, found in the endoplasmic reticulum.
The BiP recognition motif is Hy(W/X)HyXHyXHyXHyX, where Hy is a large hydrophobic amino acid, most frequently Trp, Leu, or Phe, and X is any amino acid. The comparison was already made by Couet et al. [6], but they argued for a role as a 'membrane chaperone', whereas the data published since open up the possibility of caveolin functioning in the ER en route to the plasma membrane. This potential chaperone aspect of caveolin function clearly merits further investigation.

Discussion
Since the original definition of the CBM was proposed by Couet et al. [6], the notion of these aromatic-rich motifs has become firmly embedded in the literature. However, since these early experiments, greater structural information has become available for potential caveolin-binding proteins. Taking advantage of this and recent advances in bioinformatics methodologies, we have [...]
Table 5. SABLE estimates of relative exposed surface area (RSA) of CBM aromatics.
Furthermore, a complete CBM is rarely exposed at the surface of a protein, as the bulk of CBM aromatics are buried, and as such it would be an unsuitable interface for protein binding. The often demonstrated requirement for an unperturbed CBM for caveolin binding may instead relate to its function, and in particular the role of the aromatic residues, in determining the structure of caveolin target proteins.
As the aromatic residues of Gi2α protein-derived peptides were not individually mutated in the original work of Couet et al. [6], there is little reason to suppose that a motif of this arrangement would invariably be required for caveolin binding. In this regard it is noteworthy that many authors have presented evidence to suggest that proteins lacking CBMs or with incomplete or CBM-like motifs interact with caveolin. It is interesting that the CSD and the predicted consensus for caveolin binding motifs are both aromatic-rich sequences. In the original experiments of Couet et al. [6], binding of the CSD with the aromatic-rich Gi2α protein-derived peptides would likely have been due to π-stacking of aromatic amino acid side chains. Therefore, although the concept of a traditional CBM, where all three aromatic residues are a necessity for caveolin interaction, may not be physiologically relevant due to residue inaccessibility, these early experiments indicate that the CSD may have high propensity for hydrophobic and π-stacking interactions. For example, Yue & Mazzone [63] observed that human apoE is enriched in aromatic amino acids in a non-CBM configuration between residues 44 and 63, and demonstrated that a biotin-labelled peptide of 20 residues containing this region binds Cav-1 from adipocyte lysates. Furthermore, in a CSD-PKAcat structural model, the CSD is predicted to extend across PKAcat and make contacts with several surface-located hydrophobic and aromatic residues (P244, I245, Y248) in addition to hydrogen bonding interactions [64].
In summary, we argue that the notion of aromatic-containing CBMs has taken an unwarranted hold of the literature. Dangers lie in mutating aromatic residues, often key for defining the protein fold, then ascribing a direct binding role to the mutated positions without checking the structural integrity of the mutant protein. Furthermore, our analysis underscores the urgent need for experimental structural information of a complex between caveolin (or a suitable peptide) and a protein partner.

Materials and Methods
Proteins with experimentally demonstrated interactions with Cav-1 were collected by surveying the protein-protein interaction databases IntAct v.3.1, BioGrid 3.1, and APID-beta [65][66][67] in conjunction with literature searches. The complete Cav-1 interactome compiled for this study (including proteins with multiple experimentally demonstrated interactions with Cav-1, CBM-containing proteins, and UniProt accession numbers) can be found as supporting data (Table S1). Shared motifs between caveolin-interacting proteins were sought with the SLiMFinder webserver (with or without disorder masking) using UniProt IDs as the input. Default SLiMFinder settings were altered to enable SLiMs containing up to six total wildcard positions and four consecutive wildcard positions to be included in the search criteria (disorder masking activated). In this way, CBMs corresponding to the definition of Couet et al. [6] would be returned if discovered with statistical significance. In this regard, it is noteworthy that the SLiMFinder webserver will identify motifs with up to five defined (i.e. non-wildcard) positions, meaning that identification of the CBM, containing just three defined positions (i.e. the functional aromatic residues), would have been possible were it a significantly enriched motif. Only returned motifs meeting a significance threshold of 0.05 were considered confident predictions [26]. The SLiMPred webserver was used to identify amino acids predicted to be part of functional SLiMs, using a threshold cutoff SLiMPred score of 0.1 [27]. Motif instances in the human proteome were identified using ps_scan [68] and sequence data obtained from UniProt [69].
Relative solvent accessibility of aromatic residues in putative CBMs was measured, and changes in folding free energy (ΔΔG) resulting from alanine point mutation were predicted, using experimental structures where available. For other proteins, where suitable template structures were available, homology models from the SWISS-MODEL repository were used [70]. In brief, relative solvent accessible areas (RSAs) were calculated by dividing the water-exposed surface area (in Å²) of a residue, measured using DSSP [71], by the total surface area of the residue. Any residue with an RSA <20% was considered buried [53]. In instances where no structure, experimental or modelled, was available, SABLE [54] was used to predict the RSA. Mutant protein stability changes were predicted by the web tool PoPMuSiC v2.1 [39]. MultiProt was used for protein structure superpositions [72], the AD-ENM server for elastic network model simulations [52] and PyMOL (http://www.pymol.org) for structure visualisation. For clarity, all information regarding CBM aromatic positioning presented and discussed throughout this manuscript refers to UniProt human protein sequences, and to the canonical isoform where several are known.

Supporting Information
Figure S1 Comparison of all available Gi1α crystal structures in the vicinity of the CBM β-hairpin. Each structure is drawn as a line and shown in a different colour. For comparison with Fig. 1, the aromatic residues of PDB code 1CIP are emphasised as green sticks. The PDB codes of other structures shown are 1SVS, 1AGR, 1AS0, 1AS2, 1AS3, 1BH2, 1BOF, 1CIP, 1GDD, 1GFI, 1GG2, 1GIA, 1GIL, 1GIT, 1GP2, 1KJY, 1SVK, 1Y3A, 2EBC, 2G83, 2GTP, 2HLB, 2IK8, 2OM2, 2PZ2, 2PZ3
Figure S3 Comparison of the rat Gi1α protein (PDB code 1CIP; [51]; motif coloured as in Fig. 1, otherwise pink) and the maximum (black) and minimum (white) projections of normal mode 8 (see text).
(TIF)
Figure S4 View of the context of the CBM of EGFR (A, PDB code 2J6M; [131]), insulin receptor (B, PDB code 3BU3; [132]), integrin-linked kinase (C, PDB code 3REP; Fukuda & Qin, to be published), PTEN (D, PDB code 1D5R; [133]), Slo1 (E, PDB code 3MT5; [134]), and the two CBMs of PDK1 (F and G, PDB code 1UU3; [135]). The structures of the motifs are shown as cartoons, coloured in green, and the aromatic residues are labelled sticks. The remainder of the protein is shown as lines and surface.