A Novel Motif Identified in Dependence Receptors

Programmed cell death signaling is a critical feature of development, cellular turnover, oncogenesis, and neurodegeneration, among other processes. Such signaling may be transduced via specific receptors, either following ligand binding—to death receptors—or following the withdrawal of trophic ligands—from dependence receptors. Although dependence receptors display functional similarities, no common structural domains have been identified. Therefore, we employed the Multiple Expectation Maximization for Motif Elicitation and the Motif Alignment and Search Tool software programs to identify a novel transmembrane motif, dubbed dependence-associated receptor transmembrane (DART) motif, that is common to all described dependence receptors. Of 3,465 human transmembrane proteins, 25 (0.7%) display the DART motif. The predicted secondary structure features an alpha helical structure, with an unusually high percentage of valine residues. At least four of the proteins undergo regulated intramembrane proteolysis. To date, we have not identified a function for this putative domain. We speculate that the DART motif may be involved in protein processing, interaction with other proteins or lipids, or homomultimerization.


INTRODUCTION
Protein evolution is rife with examples of structural and functional domains utilized by multiple proteins: for example, over 700 human proteins display SH3 domains, and over 450 proteins display PDZ domains. The identification of such common domains may hint at similarities in substructure, interactions, and potentially in function, for the various proteins displaying these domains. About 680 different domains have been documented to date in the proteomes of humans and other organisms (SMART database, http://smart.embl-heidelberg.de/, [1]), and many of these domains appear, with some variation, in numerous proteins. Alternatively, the recognition of novel motifs may link proteins previously thought to be unrelated (e.g., by suggesting a common function or interaction) and therefore may aid in the determination of both structure and function for proteins that display the novel motif.
We have previously described a set of receptors that induces apoptosis following ligand withdrawal, but inhibits apoptosis following the binding of trophic ligands [2][3][4][5]. These receptors have been referred to as dependence receptors. Such receptors play roles in neural development, tumorigenesis (including metastasis), neurodegeneration, and possibly in subapoptotic events such as neurite retraction and somal atrophy [2][3][4][5].
To date, ten such receptors have been described (Table 1 and [2]). These do not share any obvious structural similarity, nor do they display similar domains required for apoptosis induction. For example, Unc5H2 features a death domain in its intracytoplasmic region, but DCC does not; instead, apoptosis induction by DCC requires a short region in its intracytoplasmic domain (residues 1243-1264) that does not bear similarity to a death domain.
In the present study, we attempted to determine whether dependence receptors as a group may indeed display a common motif(s) that had gone undetected by the initial comparisons of alignment and predicted (known) domains. We utilized the Multiple EM (Expectation Maximization) for Motif Elicitation program (MEME) (http://meme.sdsc.edu/meme/meme.html and [6]) and identified a novel motif that is featured by receptors that have been described as dependence receptors. We then searched the Swiss-Prot protein database (http://www.expasy.uniprot.org/ and [7]) using the Motif Alignment and Search Tool (MAST) program (http://meme.sdsc.edu/meme/mast.html and [8]), to determine whether other receptors or other proteins also feature this motif, and identified an additional 16 human proteins that display this motif (see Results and Discussion, below).
The novel putative motif is in a transmembrane region, and therefore was dubbed dependence-associated receptor transmembrane motif (DART motif). Here we describe the consensus sequence and discuss the possible functions of this novel motif.

MATERIALS AND METHODS
Databases and software
The UniProt Knowledgebase (UniProtKB; http://www.expasy.uniprot.org/) database Release 3.3 (consisting of Swiss-Prot Release 45.3 and TrEMBL Release 28.3) from the Swiss Institute of Bioinformatics was used for this study. This database was chosen because it is well documented and allowed us to analyze the predictions on receptors.
The MEME (http://meme.sdsc.edu/meme/meme.html) software program (version 3.0; non-commercial version) was used for the identification of motifs in non-aligned sequences, where a motif is a sequence pattern that occurs repeatedly in a group of protein or DNA sequences. Ten dependence receptors plus their orthologues (32 sequences total) were used as a training set by the MEME program to search for high-scoring motifs common to all proteins. MEME saves these motifs as a set of profiles. MEME uses the method of Bailey and Elkan to identify likely motifs within the input set of sequences [6]. A range of motif widths (>15 amino acids in length) and various numbers of unique motifs to search for (zero or one motifs per sequence) were specified in our queries.
The software program MAST (http://meme.sdsc.edu/meme/mast.html, version 3.0; non-commercial version) was used to search the Swiss-Prot database for other proteins displaying the motifs identified by MEME to be present in more than one dependence receptor. The algorithm in MAST calculates position scores for each profile at each possible position within a sequence. These scores are translated into p-values, each of which represents the probability that the given profile would score at least that well against a randomly generated sequence. The best (i.e., lowest) position p-values for each profile are then adjusted to take into account the length of the sequence. MAST does not allow gaps in the profiles or in the search sequence.
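The ungapped position scoring described above can be illustrated with a minimal Python sketch. It is not MAST's implementation: the three-column profile, the default score, and all function names are hypothetical, and the length adjustment shown is one simple form that assumes independent positions.

```python
# Toy illustration only: this profile is hypothetical and is NOT the DART profile.
# Each motif column maps residues to a log-odds-like score; residues missing
# from a column receive DEFAULT_SCORE.
DEFAULT_SCORE = -1.0
TOY_PROFILE = [
    {"V": 2.0, "L": 1.0},
    {"V": 1.5, "I": 1.0},
    {"A": 1.0, "G": 0.5},
]


def position_scores(sequence, profile):
    """Score every ungapped window of len(profile) residues in the sequence."""
    width = len(profile)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        scores.append(
            sum(column.get(residue, DEFAULT_SCORE)
                for column, residue in zip(profile, window))
        )
    return scores


def best_hit(sequence, profile):
    """Return (start, score) of the best window, or None if the sequence is too short."""
    scores = position_scores(sequence, profile)
    if not scores:
        return None
    best_start = max(range(len(scores)), key=scores.__getitem__)
    return best_start, scores[best_start]


def length_adjusted_p(position_p, n_positions):
    """Assumed adjustment: probability that at least one of n independent
    positions scores this well by chance."""
    return 1.0 - (1.0 - position_p) ** n_positions


print(best_hit("MVVAL", TOY_PROFILE))
```

Because no gaps are allowed, a sequence of length n has only n - w + 1 candidate positions for a motif of width w, which is why a longer sequence needs a larger correction for the same position p-value.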

Training Set
In order to search for motifs in previously described dependence receptors, we used a set of ten human receptors and their corresponding orthologues (three orthologues were used for each, in order to avoid bias generated by using more orthologues for one dependence receptor than another) found in the UniProtKB database. Thus, our training set included a total of 32 protein sequences (32 rather than 40, since the netrin receptors Unc5A, Unc5B, Unc5C were represented by a total of 4 sequences rather than 12, to prevent overweighting). This list is shown in Table 1.
The option of having the training set sequences "shuffled" provided one of the controls used, ensuring that the motif(s) we detected were significant.
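A shuffled control of this kind can be sketched as follows. This is a minimal illustration assuming a simple per-sequence residue permutation; it is not the shuffling routine built into MEME, and the function names and example sequences are hypothetical.

```python
import random


def shuffle_sequence(sequence, rng):
    """Permute the residues of one sequence, preserving length and composition."""
    residues = list(sequence)
    rng.shuffle(residues)
    return "".join(residues)


def shuffled_training_set(sequences, seed=0):
    """Shuffle each sequence independently; a fixed seed keeps the control reproducible."""
    rng = random.Random(seed)
    return [shuffle_sequence(seq, rng) for seq in sequences]


# Hypothetical toy sequences, not taken from the training set.
originals = ["MVVALLGAVV", "MKTAYIAKQR"]
controls = shuffled_training_set(originals)
```

Because each shuffled sequence keeps the amino acid composition of its original, a motif that scores well in the real set but not in the shuffled set is unlikely to be explained by composition alone.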

Web Site
To maintain updated information on the dependence receptor field and to allow researchers to identify the DART motif in their protein of interest, we have developed a website (http://bis.ifc.unam.mx/DependenceReceptors/). The program at our website runs four independent predictions: three for identifying a transmembrane region and one for identifying the DART motif. The transmembrane region predictions are run through three different programs located at: 1) HMMTOP (http://www.enzim.hu/hmmtop/), 2) SOSUI (http://sosui.proteome.bio.tuat.ac.jp/sosui_submit.html) and 3) TMPRED (http://www.ch.embnet.org/software/TMPRED_form.html).
For identifying the DART motif, the website uses the MAST program.
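A reader combining the four predictions might check whether a motif hit falls inside a predicted transmembrane segment. The sketch below is one hypothetical way to do so; it does not describe the website's own logic, and the coordinates, the two-of-three voting rule, and the function names are all assumptions.

```python
def overlaps(span_a, span_b):
    """True if two inclusive (start, end) residue ranges share at least one position."""
    return span_a[0] <= span_b[1] and span_b[0] <= span_a[1]


def motif_in_tm(motif_span, tm_predictions, min_votes=2):
    """Count predictors with at least one segment overlapping the motif hit.

    tm_predictions maps a predictor name to a list of (start, end) segments.
    The min_votes threshold is an assumed convention, not one stated in the text.
    """
    votes = sum(
        any(overlaps(motif_span, segment) for segment in segments)
        for segments in tm_predictions.values()
    )
    return votes >= min_votes


# Hypothetical coordinates for illustration only.
predictions = {
    "HMMTOP": [(698, 720)],
    "SOSUI": [(701, 723)],
    "TMPRED": [(10, 30)],
}
print(motif_in_tm((700, 720), predictions))
```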

RESULTS
Searching for motifs among the known dependence receptors
The dependence receptors listed in Table 1 have all been shown to induce programmed cell death when expressed in the absence of their respective trophic ligands, but not when bound by these same trophic ligands. The receptors' non-orthologous sequences in the training set did not show any significant sequence similarity by simple alignment searches. Hence, in order to search for novel motifs in this set, we used the MEME program. MEME allows the identification of motifs in non-aligned sequences, where a motif is a sequence pattern that occurs repeatedly in a group of protein or DNA sequences (see Materials and Methods section). Although MEME has been most commonly used to identify motifs in homologous sequences, a pattern identified by MEME in nonhomologous sequences may be biologically relevant if: a) the proteins in the training set share a common function and b) the proteins identified to contain the motif could reasonably be suspected to share functional features with the training set. Note that these are the same conditions that are considered when evaluating motifs identified in homologous proteins.
Using a training set consisting of 32 sequences from the 10 experimentally-proven dependence receptors (Table 1), we identified a novel motif that occurs in all of the training set proteins. This motif, designated the "DART" (dependence-associated receptor transmembrane) motif, appeared in the transmembrane region of all proteins in the training set that include a transmembrane region, whereas it appeared in the ligand-binding region of the one protein that lacks a transmembrane region (the androgen receptor).
The consensus sequence of the proposed DART domain is shown in Figure 1, and the DART motifs from the training set proteins are aligned in Figure 2.
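The relationship between aligned motif occurrences and a consensus sequence can be sketched as below. This is a simplified majority-vote consensus, not the profile MEME reports; the aligned strings are hypothetical toy data rather than the DART motifs in Figure 2.

```python
from collections import Counter


def column_counts(aligned_motifs):
    """Count residues in each column of equal-length, ungapped motif occurrences."""
    width = len(aligned_motifs[0])
    if any(len(motif) != width for motif in aligned_motifs):
        raise ValueError("all motif occurrences must have the same width")
    return [Counter(motif[i] for motif in aligned_motifs) for i in range(width)]


def consensus(aligned_motifs):
    """Take the most frequent residue in each column."""
    return "".join(
        counts.most_common(1)[0][0] for counts in column_counts(aligned_motifs)
    )


# Hypothetical toy occurrences for illustration only.
toy_motifs = ["VLVA", "VLIA", "VAVA"]
print(consensus(toy_motifs))
```

A profile retains the full per-column counts rather than only the top residue, which is why a profile search can score a sequence that differs from the consensus at several positions.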
Using the software program MAST to search the Swiss-Prot database for other proteins that displayed the DART motif, we found an additional 54 sequences, 16 of which are human proteins (using a cut-off at E-value of 2.7, the value below which all training set members scored) (Table 2). Of 13,991 human proteins in the database, 3,465 are annotated as transmembrane proteins, and 25 display the DART motif: nine of the 10 training set members (the exception being, as noted above, the androgen receptor) and 16 additional human proteins (Table 3). Of these 16 additional proteins, all were transmembrane proteins, and all contained the DART motif within their transmembrane region. Thus the DART motif is relatively uncommon (at least as defined here), occurring in approximately 0.7% of human transmembrane proteins (25/3,465). If we include slightly less similar motifs, extending the acceptable E-value from 2.7 to 10, then an additional 4 human proteins are included (data not shown). The alignment of the putative DART domains of these 16 human non-training set proteins is shown in Figure 3. A dendrogram of the human DART motifs is shown in Figure 4.
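The frequency quoted above follows directly from the counts given in the text, as this short check shows. The variable names are illustrative; the numbers are those reported in this section.

```python
# Counts reported in the text.
human_proteins = 13991           # human proteins in the database
human_tm_proteins = 3465         # annotated as transmembrane
training_set_tm_with_dart = 9    # training set members with a transmembrane DART motif
additional_with_dart = 16        # additional human proteins found by MAST

dart_proteins = training_set_tm_with_dart + additional_with_dart
dart_percent_of_tm = 100.0 * dart_proteins / human_tm_proteins
tm_percent_of_human = 100.0 * human_tm_proteins / human_proteins

print(f"{dart_proteins} DART proteins = {dart_percent_of_tm:.2f}% of transmembrane proteins")
print(f"Transmembrane proteins = {tm_percent_of_human:.1f}% of human entries")
```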

Structure of the putative DART domain
The secondary structure of the consensus DART domain, as predicted by the SOPMA (Self Optimized Prediction Method from Alignments) method [9], is helical (Figure 5). This is not surprising, given that the motif lies within the transmembrane region of the proteins that display it. However, in comparison to other transmembrane regions (in randomly selected transmembrane proteins) it is valine rich (29±4% vs. 15±3%; p<0.001).
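The valine content of a transmembrane segment can be computed as sketched below. The segments are hypothetical toy strings, and summarizing a group by mean and sample standard deviation is an assumption, since the text does not state how the spread was calculated.

```python
from statistics import mean, stdev


def residue_percent(segment, residue="V"):
    """Percentage of positions in a segment occupied by the given residue."""
    segment = segment.upper()
    if not segment:
        raise ValueError("segment must not be empty")
    return 100.0 * segment.count(residue) / len(segment)


def summarize(segments, residue="V"):
    """Return (mean, sample standard deviation) of the residue percentage."""
    percents = [residue_percent(segment, residue) for segment in segments]
    return mean(percents), stdev(percents)


# Hypothetical toy segments for illustration only.
print(residue_percent("VLVVAGLVIA"))
print(summarize(["VVAA", "VAAA"]))
```

Comparing two groups of segments would additionally require the group sizes and a chosen statistical test, neither of which is specified in the text.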

DISCUSSION
The function of this novel motif is currently unknown. The finding that it exists in all dependence receptors described to date suggests that it may play a role in some biochemical process related to their function, such as the induction of apoptosis or the inhibition of apoptosis following ligand binding, or possibly an interaction with another membrane protein or membrane-associated non-proteinaceous molecule such as a lipid. For at least three of the proteins in the training set (APP, p75NTR, and DCC), this region undergoes regulated intramembrane proteolysis (RIP) [10], releasing an intracytoplasmic fragment that may migrate to the nucleus. Thus it is possible that the other proteins that display the DART motif may be substrates that also undergo regulated intramembrane proteolysis; however, a number of proteins that have been shown to undergo such cleavage do not display a DART motif, so it is clearly not required for such processing.
It is noteworthy that the transmembrane regions of DART-containing proteins are valine rich, with nearly twice the percentage of valine residues present in randomly-selected human Type I transmembrane protein domains (29±4% vs. 15±3%; p<0.001). It has been shown that Leu heptads within transmembrane domains may serve as homomultimerization domains and that the substitution of Ala (or other residues typical of transmembrane regions, including Val) for Leu may prevent homomultimerization [11]. Thus one possibility for the Val-rich nature of the DART domain may relate to the inhibition of receptor homomultimerization.

Proteins identified by MAST as displaying the DART motif
As noted above, MAST identified 16 human proteins, all of them transmembrane proteins, whose DART motifs were as similar to the consensus as those of the training set (Table 3; Figure 3). Most of these have been implicated in cell death, either directly or indirectly; furthermore, several bind trophic ligands as well.
Table 2. The top-scoring non-training-set proteins displaying the DART motif, representing 38 proteins (plus 16 orthologues). Sixteen of the 54 are the human proteins listed in Table 3. doi:10.1371/journal.pone.0000463.t002
Neogenin has recently been shown to bind RGM (repulsive guidance molecule), and to serve as a possible dependence receptor for RGM, inducing programmed cell death that is inhibited by RGM [12]. Therefore, the identification of a DART motif within neogenin provides further support for neogenin as a candidate dependence receptor.
APLP2 (APP-like protein 2) has been shown previously to be similar to APP in displaying a potential caspase-cleavage site in its intracytoplasmic domain [13]. Cleavage at this site liberates a proapoptotic peptide, C31, similar to what has been demonstrated for APP. Thus, although it is not yet clear whether APLP2 functions as a dependence receptor, and in particular whether APLP2 binds a trophic ligand, by analogy to APP it may bind laminin, collagen IV, glypican, or another ligand [14][15][16], and thus serve as a dependence receptor for one or more of those ligands.
Notch is an extensively-studied transmembrane receptor involved in cell fate determination. It binds to ligands Delta1, Jagged1, and Jagged2, regulating differentiation, proliferation, and apoptosis. Notch, like APP, DCC, and p75NTR, undergoes regulated intramembrane proteolysis, liberating an intracytoplasmic domain, the NICD, that forms a transcriptional activator complex with RBP-J kappa, activating genes of the enhancer of split locus.
Ephrin type B receptor 3 binds both ephrin-B1 and ephrin-B2. It is not yet known whether this receptor induces programmed cell death in the absence of ephrin-B1 or -B2 binding.
Tumor-associated calcium signal transducer 2 (TACD2) may function as a trophic factor receptor, but its ligand is currently unknown.
Three of the proteins identified by MAST as displaying a DART motif are involved in neurotransmitter synthesis or release. Catechol O-methyltransferase exists in both cytosolic and membrane-spanning (type II membrane protein) forms, and this latter displays a DART motif. Syntaxin-3 is a Type IV membrane protein potentially involved in docking of synaptic vesicles at presynaptic active zones. Vesicle-associated membrane protein 5

Conclusion
Ten of ten previously described dependence receptors display a region of similarity dubbed the DART (dependence-associated receptor transmembrane) motif. MAST identified this motif in an additional 16 human proteins in the Swiss-Prot database, in all cases in the transmembrane regions. The function of this novel putative domain is unknown, but the motif is noted to be valine rich, and in at least four cases, the DART motif is a site of regulated intramembrane proteolysis (RIP). Whether or not this motif plays a functional role in cell death induction or ligand-induced inhibition mediated by dependence receptors remains to be determined, but the identification of this motif in 16 non-training-set proteins such as Notch and APLP2 raises the question of whether these proteins may also function as dependence receptors. Since the field of dependence receptors is an emerging