Similarities of Drosophila rab GTPases Based on Expression Profiling: Completion and Analysis of the rab-Gal4 Kit

We recently generated rab-Gal4 lines for 25 of 29 predicted Drosophila rab GTPases. These lines provide tools for the expression of reporters, mutant rab variants or other genes, under control of the regulatory elements of individual rab loci. Here, we report the generation and characterization of the remaining four rab-Gal4 lines. Based on the completed ‘rab-Gal4 kit’ we performed a comparative analysis of the cellular and subcellular expression of all rab GTPases. This analysis includes the cellular expression patterns in characterized neuronal and non-neuronal cells and tissues, the subcellular localization of wild type, constitutively active and dominant negative rab GTPases and colocalization with known intracellular compartment markers. Our comparative analysis identifies all Rab GTPases that are expressed in the same cells and localize to the same intracellular compartments. Remarkably, similarities based on these criteria are typically not predicted by primary sequence homology. Hence, our findings provide an alternative basis to assess potential roles and redundancies based on expression in developing and adult cell types, compartment identity and subcellular localization.

The human genome contains at least 60 and maybe more than 70 rab genes [14,15,16]. The Drosophila genome contains 33 potential rab GTPase loci based on primary sequence homology, 23 of which have direct orthologs in humans with at least 50% protein similarity [15,16,17,18]. Four of the 33 loci are 99% identical to recent evolutionary duplications in a cluster of six potential rab loci in a small interval on the X chromosome at cytological location 9C-F [19], leading us to predict a total of 29 potential rab genes in Drosophila [17]. We have recently performed a systematic profiling effort for 25 of these loci [17]. The two other conserved loci in this X chromosomal cluster (RabX2 and RabX3) were the only predicted rab genes for which we found no expression [17]. Hence, the total number of functional rab loci in Drosophila may only be 27. We have previously characterized 23 of these 27 through the analysis of rab-Gal4 driver lines [17]. The Gal4/UAS system is the most widely used binary expression system in Drosophila [20,21]. We used recombineering to precisely insert the Gal4 open reading frame into the start codon site of each rab GTPase within a large (20-50 kb) genomic fragment [17,22]. The large genomic fragments are predicted to preserve all regulatory elements, thus yielding Gal4 lines that can be used to drive fluorescent reporters or fluorescently tagged variants of the Rabs themselves as wild type, constitutively active or dominant negative proteins. Several of the original 23 rab-Gal4 lines were verified using antibodies, protein traps or rescue experiments [17].
Here, we report cellular and subcellular expression patterns of the four remaining rab-Gal4 lines, namely rab30, rab40, rabX5 and rabX6. All four are novel rab GTPases of largely unknown function. In agreement with our recent findings that up to half of all rabs are either neuron-specific or highly enriched in neurons, we found that rabX5-Gal4 and rabX6-Gal4 are novel neuron/glia-specific Gal4 lines, whereas rab30-Gal4 expresses ubiquitously and rab40 only very weakly. In addition to obtaining cellular expression data, we analyzed subcellular localization by expressing YFP-tagged wild type, constitutively active (GTP-bound) and dominant negative (GDP-bound) Rab proteins under their own regulatory elements. Finally, we performed a preliminary characterization of the subcellular compartments marked by these four novel Rab proteins.
The completion of the 'rab-Gal4 kit' makes it possible to perform a comprehensive comparison of cellular and subcellular localization features of all Drosophila Rabs. Homology is an important indicator for potential redundancies, especially in a gene family with a common ancestor. However, in order to have the potential of a redundant function in vivo, the proteins should be expressed in the same cell at the same time. In the case of Rab GTPases, localization to the same intracellular compartment is a further likely prerequisite for redundancy. With this idea in mind, we present here a comparative analysis of the 27 predicted fly rab GTPases for 33 criteria that include expression in specific tissues or cells and subcellular localization of the wild type, dominant negative and constitutively active proteins. Our findings indicate that protein sequence similarity in many cases poorly predicts which Rabs share common expression and localization patterns. These analyses will serve as a guide to assess which rabs carry out specific functions based on their cellular and subcellular localization.
Figure 1. Targeting vector design for rab30-Gal4, rab40-Gal4, rabX5-Gal4 and rabX6-Gal4. 20-22 kb genomic regions (black bars) were recombineered from bacterial artificial chromosomes (BACs) into attB-P[acman]-KO [17,22]. Regions of a few kb are shown at higher resolution to reveal the structures of rab loci within these genomic regions. Sequences between red arrows were replaced with a Gal4 knock-in cassette [17]. For expression analyses, transgenic flies with the targeting vectors inserted in the same landing site were used. doi:10.1371/journal.pone.0040912.g001
Table 1. Similarities of rab-Gal4 expression patterns based on expression patterns. Pair-wise analyses for all 27 rabs were performed, and their similarity was determined as described in Materials and Methods. Bold rabs indicate the closest homologous Rabs in protein sequence alignments based on Figure 1 in [18] and Figure S3 in [17]. In cases of two or three close homologs. doi:10.1371/journal.pone.0040912.t002

Molecular Biology, Recombineering, and Drosophila Genetics
We previously generated 50-55 kb targeting vectors for rab30, rab40, rabX5 and rabX6, but failed to obtain transformants after injection of more than 1,500 embryos each [17]. For the generation of new targeting vectors we chose smaller genomic regions which include sequences 15 kb upstream and 5 kb downstream of the rab loci (Fig. 1). In addition, we applied small improvements to the recombineering protocol and verification of the final targeting cassette. These modifications include PCR and sequencing verifications for the precision of the Gal4 knock-ins as described recently [22]. Finally, transformation efficiency is greatly enhanced if the DNA of the large vectors is 'maxi'-prepped at the place of injection, i.e. without excessive handling or shipping, and injected with minimal delay time.
Complete open reading frames (ORFs) were replaced as before for rab30, rabX5 and rabX6. In contrast to these three rab loci, rab40 contains long introns. We therefore replaced only the short coding regions starting with the ATG to the end of the ATG-containing exon (Fig. 1). All vectors were verified by sequencing. Transgenic fly strains were established using standard procedures at Rainbow Transgenics, Inc. All vectors were inserted in the same landing site attP-3B (Bloomington Stock #24871) to generate the rab-Gal4 transgenic flies. The new rab-Gal4 lines were crossed to UAS-CD8-GFP as well as the respective UAS-YFP-Rabs (wild type, constitutively active and dominant negative) precisely as in the original study [17,18]. All flies were kept at 25 °C.

Immunohistochemistry, Microscopy, and Image Processing
Larval brains and tissues, pupal brains and adult brains were dissected and prepared for confocal microscopy as previously reported [23]. The tissues were fixed in phosphate buffered saline (PBS) with 3.5% formaldehyde for 15 min and washed in PBS with 0.4% Triton X-100. High-resolution light microscopy was performed using a Confocal Microscope (Leica SP5). Imaging data was processed and quantified using Amira 5.2 (Indeed, Berlin, Germany) and Adobe Photoshop CS4 as described in [24]. The following antibodies were used at 1:500: rabbit anti-rab5, rabbit anti-rab7 [25], mouse anti-rab11. A mouse monoclonal antibody against CSP was used at 1:50.

Pair-wise Similarity Analyses
The presence or absence of expression or colocalization was determined manually in high-resolution 3D confocal datasets. For the pair-wise comparisons the data was binarized, i.e. any level of expression or colocalization was counted as 1, each absence as 0. Each pair of the 27 rabs was separately compared for expression in 25 cell types or brain structures (Table 1) as well as for eight subcellular localization criteria (Table 2).
Similarities between two rabs were calculated separately for the 25 cellular and 8 subcellular criteria. Only criteria in which at least one rab was positive were considered. Hence, a '1' for both rabs was counted as a similarity, a '1' and a '0' as a discrepancy and a '0' for both was disregarded. This latter rule prevents a scenario where two rabs that have no expression or colocalization in common might otherwise appear similar solely based on common absence of expression or colocalization. Similarity Sim_rab for two rabs, rabA and rabB, was therefore calculated as follows:
$$\mathrm{Sim}_{rab} = \frac{\sum_{k=1}^{n} \left[ rabA(k) \times rabB(k) \right]}{n - r}, \quad \forall\, n \neq r$$
with n = total number of criteria (25 for cellular expression in Table 1 and eight for subcellular criteria in Table 2); rabA(k) and rabB(k) = binary value of presence (1) or absence (0) of criterion
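The scoring rule can be illustrated with a short Python sketch. This is not the analysis code used in the study; the function name `rab_similarity` and the example profiles are hypothetical, and r is assumed to be the number of criteria that are absent in both rabs, as implied by the rule that a '0' for both is disregarded.

```python
from typing import Optional, Sequence


def rab_similarity(rab_a: Sequence[int], rab_b: Sequence[int]) -> Optional[float]:
    """Pair-wise similarity of two binarized rab profiles.

    Shared presence (1, 1) counts as a similarity, a mismatch (1, 0) or (0, 1)
    as a discrepancy, and shared absence (0, 0) is disregarded.
    Returns None when n == r, i.e. no criterion is positive in either rab.
    """
    if len(rab_a) != len(rab_b):
        raise ValueError("profiles must cover the same criteria")
    n = len(rab_a)
    # Numerator: number of criteria present in both rabs.
    shared = sum(a * b for a, b in zip(rab_a, rab_b))
    # Assumption: r = number of criteria absent in both rabs.
    r = sum(1 for a, b in zip(rab_a, rab_b) if a == 0 and b == 0)
    if n == r:
        return None
    return shared / (n - r)


# Hypothetical profiles over eight subcellular criteria (not data from the study).
example_a = [1, 1, 0, 0, 1, 0, 1, 0]
example_b = [1, 0, 0, 0, 1, 0, 1, 1]
print(rab_similarity(example_a, example_b))  # 3 shared / (8 - 3 disregarded) = 0.6
```

In this toy case three criteria are shared, three are absent in both and therefore dropped, and two are discrepancies, giving 3/5.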

Completion of the 'rab-Gal4 kit'
We recently presented a first systematic effort towards a functional characterization of all rab GTPases in Drosophila [17]. We developed a streamlined cloning strategy for the generation of rab-Gal4 lines as versatile tools that can be used to express any gene under control of the endogenous regulatory elements of a particular rab locus [17,22,26]. In particular, the availability of a complementary kit of UAS-YFP-Rab lines in combination with the rab-Gal4 lines offers the opportunity to express wild type (WT), constitutively active (CA, GTP-bound) and dominant negative (DN, GDP-bound) Rabs under their own regulatory elements in wild type or mutant backgrounds [17,18]. The cloning strategy underlying the generation of the rab-Gal4 lines is based on P[acman] technology, an implementation of bacterial artificial chromosome (BAC) recombineering in Drosophila [27,28]. We inserted Gal4 cassettes into large genomic fragments (20-55 kb) that are predicted to contain all regulatory elements of individual rab loci in order to ensure faithful replication of the endogenous expression patterns. However, the transformation of these large vectors proved difficult in individual cases. Out of 29 rab loci, we originally failed to obtain transformants for four: rab30, rab40, rabX5 and rabX6. These problems were likely related to these particular genomic sequences; however, we cannot exclude other issues with the original 50-55 kb transformation vectors, since we did not sequence them in their entirety. Since the publication of the first 'rab-Gal4 kit', we have improved all steps of the technology including the verification of the correct recombineering products, the transformation and the possibility to mobilize the targeting cassette from the original landing site to generate a knock-in in the endogenous locus [22]. Some modifications are very simple, but drastically improve specific steps, e.g. the avoidance of excessive handling and time delays between DNA preparation and injection for transformation.
The objective of the present study was to complete the 'rab-Gal4 kit' and thereby be in a position to perform a comprehensive comparison of all Rabs with respect to expression pattern, subcellular localization, intracellular compartment identity and localization behavior as WT, CA and DN proteins. We generated new Gal4 vectors for rab30, rab40, rabX5 and rabX6 using smaller genomic fragments as shown in Fig. 1. Specifically, we reduced the 5′ genomic region to 15 kb and the 3′ genomic sequence to 5 kb and generated transgenic flies as described [17,22].

Cellular Expression Profiling of the New rab-Gal4 Lines
To determine the cellular expression pattern of these rab-Gal4 lines, we crossed them to UAS-CD8-GFP and obtained high-resolution 3D confocal datasets for the L3 larval brain, eye disc, wing disc, leg disc and salivary gland as well as P+30% pupal brains and adult brains (Fig. 2). rab30-Gal4 expresses ubiquitously in all or at least in most cell types. However, as observed for several other rab-Gal4 lines, expression levels of rab30-Gal4 vary strongly in different cell types, more so than other evenly expressing ubiquitous lines such as rab5-Gal4 and rab11-Gal4 [17]. In contrast to rab30, rab40-Gal4 expresses at very low levels and mostly below the detection limit in the imaginal discs, salivary glands as well as pupal and adult brains (Fig. 2A, B). rabX5-Gal4 also shows weak but specific expression in some neurons of the ventral ganglion in the larval brain (Fig. 2A). Finally, rabX6-Gal4 exhibits strong expression in the larval brain and neurons innervating the leg disc. Expression in non-neuronal tissues like the wing disc or salivary gland was not observed. In the eye disc and pupal brain, glial expression is most pronounced. Expression of rabX6-Gal4 in the adult brain is weaker and sparser compared to rab30-Gal4 and again strongest in glial cells (Fig. 2B). In summary, the new rab-Gal4 lines corroborate our previous observation of highly variable rab expression levels, especially in the nervous system [17].

Subcellular Localization of Rab30, Rab40, RabX5 and RabX6 in Neurons
Next, we investigated the subcellular localization of YFP-Rab30, YFP-Rab40, YFP-RabX5 and YFP-RabX6 expressed by their respective rab-Gal4 lines in neurons of the larval ventral ganglion. With respect to cell body or synaptic localization, Rab30 is localized in both, but more strongly at synapses; Rab40 is present at low levels in both; RabX5 is specific to the synaptic region of the ventral ganglion; RabX6 is mostly in the cell bodies and to a lesser extent at synapses (Fig. 3A). Taken together with all other Rabs, RabX6 is the only neuronal/glial Rab that predominantly localizes to cell bodies and not to synapses.
To reveal the identities of subcellular compartments marked by YFP-Rab30, YFP-Rab40, YFP-RabX5, and YFP-RabX6, we colabeled the larval brain preparations with antibodies that mark early endosomes (Rab5), late endosomes (Rab7), recycling endosomes (Rab11) and synaptic vesicles (Cysteine-String Protein, CSP). In contrast to our previous analyses of 23 YFP-Rab proteins, none of these novel Rabs strongly colocalizes with any of the markers. RabX5 and RabX6 in particular label clear subcellular structures that are not positively labeled by any of the four antibodies. YFP-Rab40 levels may have been too low for a decisive analysis. Only Rab30 showed weak and partial colocalization with both Rab11 and CSP (Fig. 3A).
Rab GTPases cycle between GTP-bound and GDP-bound forms. A complete set of constitutively active (CA) and dominant negative (DN) UAS-YFP-Rab lines has previously been generated [18]. We performed functional studies with these by again expressing each YFP-Rab protein under control of its own regulatory elements with the respective rab-Gal4 line. As shown in Fig. 3B, Rab30 exhibits the typical, most commonly observed behavior of a more diffuse, cell body biased localization of the DN, whereas the WT and CA variants mark distinct structures especially at synapses. RabX6 has a similar behavior, except that WT and CA variants mark more distinct compartments in the cell bodies that are lost with the DN variant. The Rab40 CA and DN variants were too weak to be scored with confidence. RabX5 exhibited an unusual behavior, where the DN variant exhibits increased synaptic compartments (Fig. 3B). In summary, none of the four new Rabs exhibits a cellular or subcellular localization profile that is identical to any of the previously characterized 23 Rabs.

Similarities of rab-Gal4 expression patterns based on expression patterns
With the complete profiling dataset for all Drosophila rab GTPases in hand, we are in a position to compare the cellular and subcellular expression data for all Rabs. In order to characterize similarities in cellular expression patterns, we identified 25 clearly discernible cell types and tissues for an assessment of the presence or absence of expression (Fig. 4). These cell types and tissues include non-neuronal developing imaginal discs, as well as neuronal and glial cell types and prominent brain structures like the mushroom bodies in the larval, pupal and adult brain. In all cases we analyzed the original high-resolution 3D confocal microscopy datasets of the rab genes, 23 of which were previously only qualitatively assessed [17].
To evaluate overall expression similarities, we performed pairwise comparisons for all possible pairs of the 27 rabs. Absence or presence of expression was scored in a binary manner irrespective of the qualitatively different strengths of expression denoted in Fig. 4. Common presence of expression in a cell or tissue was counted towards similarity, whereas common absence was not counted. For details see Materials and Methods. The results of this binary analysis are summarized in Table 1.
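The binarization and all-pairs comparison described here can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce Table 1: the helper names (`binarize`, `similarity`, `pairwise_similarities`), the 0-3 grading scale and the toy profiles are all hypothetical.

```python
from itertools import combinations
from math import comb


def binarize(levels):
    """Map graded expression levels to presence (1) or absence (0)."""
    return [1 if level > 0 else 0 for level in levels]


def similarity(a, b):
    """Shared presence divided by the criteria positive in at least one profile."""
    considered = sum(1 for x, y in zip(a, b) if x or y)
    if considered == 0:
        return None  # no criterion is positive in either profile
    shared = sum(1 for x, y in zip(a, b) if x and y)
    return shared / considered


def pairwise_similarities(profiles):
    """Score every unordered pair of named profiles."""
    binary = {name: binarize(levels) for name, levels in profiles.items()}
    return {
        (a, b): similarity(binary[a], binary[b])
        for a, b in combinations(sorted(binary), 2)
    }


# Toy graded scores (0 = none ... 3 = strong) for four made-up cell types.
toy_profiles = {
    "rabA": [3, 2, 1, 3],
    "rabB": [1, 1, 1, 1],
    "rabC": [0, 0, 2, 0],
}
scores = pairwise_similarities(toy_profiles)
for pair, score in sorted(scores.items()):
    print(pair, score)

print(comb(27, 2))  # number of distinct pairs among 27 rabs: 351
```

Because expression strength is discarded, the strongly expressed rabA and the weakly expressed rabB score as fully similar in this toy example, which mirrors the binary treatment described above.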
The most obvious class of similarity comprises ubiquitously expressed rabs, including rab1, rab2, rab5, rab6, rab8, rab11, rab35 and rab39. A closely related second group of rabs comprises some potentially ubiquitous lines with larger expression variability, including rab4, rab7, rab10, rab14 and rab18. Similarities between the neuron-specific or neuronally enriched lines are less obvious. This is consistent with our previous observation that the more selectively neuronally and glia-expressing lines exhibit considerable differences of their expression patterns in the brain. Indeed, only two rabs identified in the combined studies are expressed pan-neuronally, namely rab3 and rabX4. In contrast, eight of the original 23 rabs are neuron- and glia-specific or strongly enriched, but they are expressed in strikingly different patterns in the brain, namely rab9, rab19, rab21, rab23, rab26, rab27, rab32 and rabX1. Two of the four novel rabs added in the present study, rabX5 and rabX6, fall into this category. In summary, of 27 rab GTPases that exhibit clear expression in the tissues analyzed here, 12 are neuron-specific or neuron-enriched; two of these are expressed pan-neuronally, and ten express in varying and surprisingly specific patterns in neurons and glial cells in the brain. The comparisons of the precise expression patterns reveal similarities that allow us to test for potential redundancies of these neuronal rabs not only within that group, but also with more widely expressed rabs that overlap in the same cell types.

Similarities of Rab GTPases Based on Subcellular Localization Features
The comparison of expression patterns is not useful to identify potentially similar rab GTPases that are ubiquitously expressed. Similarly, the analysis is limited in identifying similarities amongst the differently expressed neuronal rabs. We therefore chose an independent set of more specific Rab protein and subcellular localization features for the second part of our similarity analysis. These criteria include synaptic and cell body localization, colocalization with compartments positive for Rab5, Rab7, Rab11 or CSP, and finally compartment discernibility as DN or CA variant. In all cases YFP-Rab proteins were expressed under control of their respective rab-Gal4 lines and analyzed in the larval ventral ganglion. A complete assessment of all 27 YFP-Rab proteins is shown in Fig. 5. Next, we performed pair-wise comparisons for binary datasets using the same rules as applied for the cellular expression data. The resulting similarities, shown in Table 2, are in many ways revealing. Several pairs exhibit 100% overlapping subcellular localization features, despite divergent expression patterns. For example, the two synaptic vesicle-associated Rabs, Rab3 and Rab27, represent such a case. Indeed, both were previously shown to exhibit partial functional redundancy in secretion [29]. Moreover, several synaptic Rab11-associated Rabs, including Rab19, Rab21, and RabX4, are 100% identical for the subcellular features analyzed here. It is tempting to speculate that these Rabs may exert partially redundant functions at synapses. Several other groups await experimental verification. For example, Rab1 exhibits identical subcellular localization features to Rab6 and RabX6, even though rabX6 is mostly expressed in glial cells, as shown in this study. Similarly, RabX1 exhibits similarity to Rab11, with RabX1 restricted to neurons and Rab11 being ubiquitous.
Lastly, we compared similarities based on cellular expression patterns and subcellular localization features with primary sequence homology. In other words, we asked whether the closest rab homologs would also exhibit the most similar cellular and subcellular localization patterns. We highlighted the closest rab homologs in Tables 1 and 2. Interestingly, we found only a few correlations between protein similarity and expression patterns or subcellular localization. While there are several cases where two of three criteria correlate, there is no case where all three correlate. For example, Rab3 and Rab27 are the only pair of closest homologs that also exhibits identical subcellular localization features, but they have strikingly different expression patterns. Rab1 and Rab35 are close homologs that are both ubiquitously expressed, but these exhibit strikingly different subcellular localization features. Rab1 and Rab6 exhibit identical subcellular localization features and are both ubiquitous, but they are far apart on the phylogenetic tree of Drosophila rab GTPases [18]. These findings suggest that an assessment of similar functions and potential redundancies in a gene family like the rab GTPases may be incomplete if solely based on protein sequence homology. Our data further make numerous predictions about the potential functional properties of Rabs in multicellular eukaryotes that now await experimental verification.

Discussion
In this paper, we present the completion and expression analysis of the rab-Gal4 kit. We identified two novel neuronal rab GTPases (rabX5 and rabX6) and one ubiquitous rab (rab30), in line with our previous report that more than one third of Drosophila Rab GTPases are enriched or even specific to neurons and glia.
With the complete cellular and subcellular profiling data in hand, we could for the first time perform a systematic comparison of all Drosophila Rab GTPases. A key finding of this analysis is that protein homology, expression pattern and subcellular localization in many cases exhibit revealing correlations for two of these criteria, but never for all three. In other words, we found no two Rabs that are closely related, expressed in the same pattern and mark the same subcellular compartment. This analysis may therefore provide a meaningful measure of Rab GTPase functional diversity.
Expression patterns are unlikely to correlate with protein sequence similarities, because expression is determined by regulatory regions outside of the coding region. In contrast, the subcellular localization and association with compartments as a function of GTP/GDP-binding are directly related to protein functions [15,30], yet we observed few correlations. A possible explanation for this could be that protein domains that determine the association with, for example, a distinct endosomal compartment are only short and not visible in the homology comparison over the complete protein lengths. Importantly, the cellular and subcellular localization data analyzed here provide direct experimental evidence for which rab GTPases potentially reside on similar compartments in the same cells at the same time, all likely requirements for potential redundancy. In contrast, the primary protein sequence is in many cases only a partial or no reliable predictor for protein structure. In this sense, the analyses presented here represent an opportunity where comprehensive subcellular localization data is available to assess the reliability of redundancies predicted by sequence homology. The 25 cell types and tissues used for our expression analysis are not representative or comprehensive, but chosen only for discernibility in the binary analysis. Hence, a similarity score of 80% based on a score of '20' cannot be compared as an absolute number, but only relative to the same criteria for other rabs. Neither cellular expression nor the subcellular localization criteria are sufficient to assess potential redundancy. For example, both rab3 and rabX4 are identically pan-neuronally expressed, but Rab3 marks synaptic vesicles whereas RabX4 marks Rab11-positive compartments. Conversely, rab21 and rabX4 have substantially different expression patterns, yet when they overlap in the nervous system they exhibit the identical subcellular localization profile.
Hence, these two rab GTPases are potential candidates for similar or redundant functions in these cells only. More generally, in the pair-wise comparison a rab GTPase with restricted expression (e.g. a neuron-specific rab) receives a low score when compared to a rab GTPase with broader expression (e.g. a ubiquitous rab), and hence will be categorized as less similar. However, this lower score does not correlate with the probability of redundancy in the cell types where the two rab GTPases are actually co-expressed. We therefore regard the combination of cellular and subcellular profile similarities as a means to restrict the number of potentially redundant rab GTPases. Importantly, all our rab-Gal4 lines represent targeting vectors for the generation of molecularly defined mutants through ends-out homologous recombination, as demonstrated in our original studies [17,22]. Hence, the completed rab-Gal4 kit provides all necessary tools to experimentally test functional predictions from our analyses, as well as experiments using double and triple mutants to verify such functional relationships.
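The effect of restricted versus broad expression on the pair-wise score can be made concrete with a small calculation. The sketch below uses invented profiles: a hypothetical neuron-specific rab positive in 5 of 25 criteria and a hypothetical ubiquitous rab positive in all 25. The second quantity, the fraction of the narrower rab's positive criteria that are shared, is not part of the published analysis and is shown only for contrast.

```python
def shared_and_union(a, b):
    """Count criteria positive in both profiles and in at least one profile."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return shared, union


# Invented binary profiles over 25 cellular criteria (not data from the study).
neuron_specific = [1] * 5 + [0] * 20   # expressed in 5 of 25 cell types
ubiquitous = [1] * 25                  # expressed in all 25 cell types

shared, union = shared_and_union(neuron_specific, ubiquitous)

# Pair-wise similarity as used for the comparison tables: shared / considered.
pairwise_score = shared / union

# Illustration only: share of the narrower rab's cell types that overlap.
coexpression_fraction = shared / sum(neuron_specific)

print(pairwise_score)         # 5 / 25 = 0.2
print(coexpression_fraction)  # 5 / 5 = 1.0
```

The low pair-wise score of 0.2 arises although the narrower rab is co-expressed with the ubiquitous rab in every cell type in which it is found, which is why the score alone does not rule out redundancy in those cells.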