Proteomics of Secretory and Endocytic Organelles in Giardia lamblia

Giardia lamblia is a flagellated protozoan enteroparasite transmitted as an environmentally resistant cyst. Trophozoites attach to the small intestine of vertebrate hosts and proliferate by binary fission. They access nutrients directly via uptake of bulk fluid phase material into specialized endocytic organelles termed peripheral vesicles (PVs), mainly on the exposed dorsal side. When trophozoites reach the G2/M restriction point in the cell cycle they can begin another round of cell division or encyst if they encounter specific environmental cues. They induce neogenesis of Golgi-like organelles, encystation-specific vesicles (ESVs), for regulated secretion of cyst wall material. PVs and ESVs are highly simplified and thus evolutionarily diverged endocytic and exocytic organelle systems with key roles in proliferation and transmission to a new host, respectively. Both organelle systems physically and functionally intersect at the endoplasmic reticulum (ER), which has catabolic as well as anabolic functions. However, the unusually high degree of sequence divergence in Giardia rapidly exhausts phylogenomic strategies to identify and characterize the molecular underpinnings of these streamlined organelles. To define the first proteome of ESVs and PVs we used a novel strategy combining flow cytometry-based organelle sorting with in silico filtration of mass spectrometry data. From the limited-size datasets we retrieved many hypothetical but also known organelle-specific factors. In contrast to PVs, ESVs appear to maintain a strong physical and functional link to the ER including recruitment of ribosomes to organelle membranes. Overall the data provide further evidence for the formation of a cyst extracellular matrix with minimal complexity. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the dataset identifier PXD000694.


Introduction
As the leading cause of protozoal diarrhea worldwide, the small intestinal parasite Giardia lamblia (syn. G. duodenalis, G. intestinalis) is an important pathogen of humans and animals causing significant morbidity and economic loss [1]. The Giardia life cycle is simple and consists of trophozoites, which multiply by binary fission in the gut of animal and human hosts, and an infectious cyst stage. Trophozoites attach actively to the epithelium of the small intestine and exhibit antigenic variation of variant surface proteins (VSPs) in their protein surface coat [2,3]. Triggered by environmental cues (e.g. bile concentration, bioavailability of lipids, pH) trophozoites undergo a complex stage-differentiation process and transform to environmentally resistant cyst forms. The complete life cycle, including cyst formation and excystation, can be reproduced in vitro.
Giardia belongs to the phylum Diplomonadida, unicellular eukaryotes that have undergone considerable reductive evolution resulting in minimization or even loss of most cellular systems such as mitochondria, peroxisomes, a Golgi apparatus, and a classical endo-lysosomal system. Despite this unusual organization, three giardial organelle systems are clearly discernible: the endoplasmic reticulum (ER), which extends bilaterally through the cell body [4]; relic mitochondria (mitosomes), localized at the cell center but also dispersed in the cytoplasm [5,6]; and peripheral vesicles (PVs). PVs are ~150 nm compartments with a fixed localization underlying the plasma membrane on the dorsal side and are also present in a specialized region of the ventral disk [7,8]. PVs have been dubbed endosomal-lysosomal compartments based on localization of hydrolase activity [7,9-11], their ability to acidify [10,12,13], and to take up exogenous ferritin as well as fluid phase markers [12,14-16]. While in classical eukaryotic systems endosomes undergo organelle maturation, a similar process is not observed for PVs in Giardia. A selective pathway sorting proteins from the plasma membrane to PVs has been demonstrated [13,17-19], and there is experimental evidence for a direct but selective connectivity between PVs and the ER [8]. PVs are thought to be the major route of nutrient uptake by the parasite, but the range of their functions, their morphogenesis and propagation remain unclear. Additional open questions concern the exact mechanism of bulk fluid uptake into the organelles and how endocytic cargo is sorted and trafficked to the ER.
Constitutive secretion of giardial proteins does not require a Golgi apparatus. As a consequence, secreted proteins are exported directly from the ER to target organelles such as PVs or the plasma membrane [20]. Organelles with Golgi properties (encystation-specific vesicles, ESVs) are generated de novo exclusively for regulated export of the cyst wall biopolymer consisting of three paralogous cyst wall proteins (CWP1-3) [21-23], and a unique β(1-3)-GalNAc homopolymer glycan [24,25]. ESV formation is induced by COPII-dependent export of cyst wall proteins from ER exit sites [26,27]. CWPs partition into two biophysically distinct phases before being sorted and secreted sequentially to build the two layers of the composite cyst wall polymer as an extracellular matrix 20-24 h post induction (p.i.) in vitro [28]. Although current data strongly support the hypothesis that ESVs are Golgi-derived organelles, the molecular underpinnings of ESV neogenesis and their identity as post-ER organelles remain controversial.
As with most systems and molecular machineries in diplomonads, PV and ESV organelles can be classified in terms of function but are highly divergent, i.e. there is some experimental evidence for their respective roles in the cell, but the paucity of molecular and morphological landmarks for organelle structure and function has prevented a systematic and detailed characterization. Nevertheless both organelle systems are essential and as such represent potentially vulnerable structures of this highly adapted parasite. The significant reductive evolution and sequence divergence in Giardia also means that strategies using homology-based identification and functional analysis of PV and ESV proteins [29-31] are biased towards few identifiable factors. The aim of this study was to identify novel factors that are specifically associated with these organelles and that may help characterize their nature, range of function, and evolutionary history in the context of the giardial ecological niche. To address these questions we developed a conceptually new approach to generate enriched organelle proteome datasets in two steps: (i) simultaneous flow cytometry-based sorting of a mixed microsome fraction containing green fluorescent ESV organelles with CWP3-GFP (green fluorescent protein) in condensed cores and red labeled peripheral vesicles (PVs); (ii) mass spectrometry analysis and subtraction of overlapping hits to increase identification of organelle-specific candidates. Detailed analysis of the datasets and localization of selected candidate proteins suggests a close association of ESVs with the ER and no evidence for additional cargo or organelle-specific factors involved in their genesis and maturation. Conversely, although direct connections with the ER have been demonstrated, PVs appear to have a discrete compartment identity with a specific set of organelle proteins.

Results
Giardia has an extensive ER [4] which stretches almost throughout the entire cytoplasm and to the cell periphery. Giardial organelle preparations for proteomic analysis are uniformly contaminated with membrane and membrane-associated ER proteins [32,33]. To overcome this we developed and implemented a new strategy to investigate the proteome of two organelle sets which are known to be in close proximity to and even in direct contact with the ER: ESV organelles [26,27] and PV organelles [8]. After cell disruption we simultaneously enriched differentially labeled ESV and PV organelles by fluorescence-assisted organelle sorting (FAOS) from a mixed microsome fraction. We reasoned that the simultaneously sorted vesicle fractions would contain comparable amounts of mostly cytoplasmic and ER-derived unspecific proteins, and that this overlap could be subtracted from the respective mass spectrometry (MS) datasets in silico to reveal organelle-specific proteins.

Enrichment and Separation of Fluorescently Labeled Organelles by Flow Cytometry
We used flow cytometry to sort differentially labeled ESVs and PVs simultaneously from a mixed microsome fraction. The organelle fraction for sorting was prepared by mixing two microsome fractions derived from trophozoites with labeled PVs (Dextran-AlexaFluor647, AF647) and transgenic encysting cells with labeled ESVs (CWP3-GFP). Cell labeling, harvesting, and disruption were performed in three completely independent experiments (biological replicates). The sorts were performed on a BD FACSAriaIII flow cytometer using a sort precision mode of 0/32/0 to obtain maximal purity. Three gates were set: a very broad parent gate P3 in the SSC (side scatter) vs. FSC (forward scatter) plot that excluded the readily apparent measurement noise, and gates P1 and P2 in a bivariate dot plot to define the GFP-positive and AF647-positive events, respectively (Figure 1A). The target events in the mixed microsome fraction were 4.3% GFP-positives and 4.7% AF647-positives, corresponding to ESV and PV organelles, respectively.
Post-sort quality control by flow cytometry showed a more than eight-fold increase in both GFP-positive events (39.6%, Figure 1B, top) and AF647-positive events (42%, Figure 1B, bottom). In the post-sort analysis, 10^4 events of the AF647-enriched fraction contained no GFP-positive events, i.e. ESV organelles (Figure 1B, bottom, gate P1). Likewise, 10^4 events of the GFP-enriched fraction contained no AF647-positive events, i.e. PV organelles, with a scatter profile corresponding to that of the pre-sort sample (Figure 1B, top, gate P2); the cluster of events extending diagonally and between the GFP and AF647 fraction was apparent in all samples analyzed, both pre- and post-sort. Comparisons of normal buffer preparations (used across all experiments) with more meticulously prepared sample buffers and sheath fluid (manually filtered through liquid filters with a pore size of 220 nm) led us to conclude that these events do not correspond to ESV or PV target organelles, but instead predominantly represent particulates of very small size. Taken together, GFP- and AF647-positive events were completely separable, and we achieved a 100% relative enrichment using this approach.
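As a worked check of the enrichment figures above, fold enrichment can be taken as the ratio of the post-sort to the pre-sort fraction of target events. The following Python sketch uses only the percentages reported in the text; the function name is illustrative and is not part of the original analysis.

```python
def fold_enrichment(pre_sort_pct: float, post_sort_pct: float) -> float:
    """Ratio of the post-sort to the pre-sort fraction of target events."""
    if pre_sort_pct <= 0:
        raise ValueError("pre-sort percentage must be positive")
    return post_sort_pct / pre_sort_pct


# Percentages of target events as reported in the text.
esv_fold = fold_enrichment(4.3, 39.6)  # GFP-positive events (ESVs)
pv_fold = fold_enrichment(4.7, 42.0)   # AF647-positive events (PVs)

print(f"ESV: {esv_fold:.1f}-fold, PV: {pv_fold:.1f}-fold")
# prints "ESV: 9.2-fold, PV: 8.9-fold"
```

Both ratios exceed eight, in line with the statement above.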
To confirm the separation of labeled organelles, we detected GFP in post-sort precipitates (Figure 1C) using Western blot analysis. Using an anti-GFP antibody, a band between 40 and 55 kDa corresponding to the predicted CWP3-GFP fusion protein (53 kDa) was detected in the ESV-enriched fraction but not in the PV-enriched fraction. An additional GFP signal was detected in the high molecular weight area in the ESV-enriched fraction, presumably corresponding to insoluble CWP3-GFP aggregates from condensed cores of ESVs or covalently linked homomultimers. Taken together with the data obtained by post-sort flow cytometry, we concluded that the two organelles could be quantitatively separated, which was a prerequisite for the subsequent subtractive analysis.

Mass Spectrometry Analysis of Organelle-enriched Fractions
Sorted organelle-enriched fractions were analyzed by mass spectrometry using a shotgun approach combining 1D-SDS-PAGE and LC ESI-MS/MS (liquid chromatography electrospray ionization-tandem mass spectrometry). With a Mascot ion cut-off score of 20 for peptide-spectrum matches, a minimum of 2 unique peptides, and a protein probability of 80%, a total of 1281 proteins were identified in the combined triplicate ESV and PV fractions. After subtraction of environmental contaminations, e.g. keratins, 1213 G. lamblia proteins remained. A false discovery rate (FDR) of 0.0% for peptides and 0.5% for proteins was calculated by Scaffold. In the unified ESV fractions (E1+E2+E3) a total of 1129 proteins were identified; 750 (66%) thereof were detected in all three samples and 933 (83%) in at least 2 of 3 samples (Figure 2A, top). Comparable numbers were found for PV organelles: of 1140 proteins identified in total (P1+P2+P3), 708 (62%) were detected in all three samples and 923 (81%) in at least 2 of 3 samples (Figure 2A, bottom). The large overlaps within the P and E datasets, respectively, demonstrated high reproducibility between replicate experiments. For a detailed compilation of identified proteins see Table S1. All mass spectrometry proteomic datasets have been deposited to the ProteomeXchange consortium (http://proteomecentral.proteomexchange.org) and are accessible with the dataset identifier PXD000694 and DOI 10.6019/PXD000694.
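The replicate overlap reported above can be illustrated with a short Python sketch. It counts, for a set of replicate protein lists, how many proteins occur in any, in at least two, and in all replicates, and then recomputes the percentages from the counts given in the text. The accession labels in the toy example are hypothetical, and the function names are illustrative only; the published numbers were derived from the actual datasets (Table S1).

```python
from collections import Counter


def replicate_overlap(replicates):
    """Return (total, in at least two, in all) protein counts across replicates."""
    counts = Counter(acc for rep in replicates for acc in set(rep))
    n = len(replicates)
    total = len(counts)
    in_two_plus = sum(1 for c in counts.values() if c >= 2)
    in_all = sum(1 for c in counts.values() if c == n)
    return total, in_two_plus, in_all


def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Toy replicate lists with hypothetical accession labels.
e1 = {"A", "B", "C", "D"}
e2 = {"A", "B", "C"}
e3 = {"A", "B", "E"}
total, two_plus, in_all = replicate_overlap([e1, e2, e3])
print(total, two_plus, in_all)  # prints "5 3 2"

# Percentages recomputed from the counts reported in the text.
print(pct(750, 1129), pct(933, 1129))  # ESV fractions: prints "66 83"
print(pct(708, 1140), pct(923, 1140))  # PV fractions: prints "62 81"
```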

Subtractive Analysis Eliminates Many Predominantly Translation-associated and ER-derived Contaminants
Simultaneous sorting of two differentially labeled organelles (ESVs and PVs) from a mixed microsome fraction allows subtracting the unspecific background of unlabeled soluble proteins as well as small cell debris contained within the positively sorted droplets which are generated by the vibrating nozzle of the cell sorter. The premise was that by eliminating all proteins common to the ESV and PV datasets (intersection) the organelle specificity of each dataset would increase significantly. In particular, the occurrence of ER-derived contaminants, which have been a severe problem in all previous attempts to enrich Giardia organelles, should be strongly reduced.
The total of 1213 hits in all replicate mass spectrometry datasets contained 1059 putative contaminants, defined by the E1-E3 and P1-P3 data intersection, whilst 72 proteins were considered ESV-specific and 82 PV-specific (Figure 2B, top). A detailed description of the workflow and the in silico identification of the data intersect can be found in Figures S1 and S2. This relatively high ratio of contaminants to organelle-specific proteins was not surprising considering the results of previous cell fractionation experiments and the analysis of organelle fractions by SDS-PAGE in this study (not shown).
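The subtraction principle can be sketched in Python as a plain set operation: pool the replicates of each organelle fraction, treat everything found in both pools as putative contaminants, and keep the remainder as organelle-specific candidates. This is a simplified illustration under the assumption that the intersection is taken between the pooled replicate lists; the workflow actually used is described in Figures S1 and S2 and may include further steps. The accession labels are hypothetical.

```python
def subtract_intersection(esv_replicates, pv_replicates):
    """Split pooled hits into ESV-specific, PV-specific and shared sets."""
    esv_all = set().union(*esv_replicates)
    pv_all = set().union(*pv_replicates)
    shared = esv_all & pv_all  # putative contaminants
    return esv_all - shared, pv_all - shared, shared


# Toy data with hypothetical accession labels.
esv_reps = [{"ribo1", "bip", "esvX"}, {"ribo1", "pdi", "esvX"}]
pv_reps = [{"ribo1", "bip", "pvY"}, {"pdi", "pvY", "pvZ"}]

esv_specific, pv_specific, shared = subtract_intersection(esv_reps, pv_reps)
print(sorted(esv_specific))  # prints "['esvX']"
print(sorted(pv_specific))   # prints "['pvY', 'pvZ']"
print(sorted(shared))        # prints "['bip', 'pdi', 'ribo1']"

# Consistency check on the counts reported in the text.
n_shared, n_esv, n_pv = 1059, 72, 82
assert n_shared + n_esv + n_pv == 1213
print(round(100 * n_shared / 1213))  # share of hits removed: prints "87"
```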

Analysis of the Eliminated ESV and PV Dataset Intersection
A more detailed analysis of the two organelle-specific and the large intersecting datasets was performed using the DAVID bioinformatics tool [34]. From 1059 proteins in the dataset intersection we removed an additional 28 with obsolete gene models. Of the 1031 remaining proteins 903 could be assigned to 14 different DAVID clusters (enrichment score >1), while 128 proteins could not be clustered. Four of the 14 DAVID clusters showed an enrichment score of >3 (Figure 2B, bottom). The top ranking clusters were "ribosomal proteins" (enrichment score 16.12, 701 proteins), "protein biosynthesis" (enrichment score 5.33, 305 proteins), "chaperones/protein folding" (enrichment score 3.38, 101 proteins) and "carbohydrate metabolism" (enrichment score 3.02, 176 proteins). A corresponding genome-wide analysis as a reference using a total of 5150 validated genes yielded on average 79 clusters per 39000 gene models, with the vast majority of enrichment scores <1 and none >2. See also supplementary data (Text S2 and S3) for a detailed description of all DAVID analysis results. Taken together, this demonstrates the relative enrichment for translation-associated and ER-derived factors in the data intersection, supporting the idea that subtraction of these hits will increase the specificity of organelle datasets. Conversely, no obvious clusters were detected in the control datasets representing a genome-wide sampling.

Parsing and Manual Annotation of ESV and PV Organelle Specific Proteins
The large majority of gene models and protein annotations in the Giardia Genome Database (GiardiaDB) are based on automated predictions. For a manual annotation of organelle-specific datasets we used function and homology prediction programs (PSORTII, TMHMM, SMART, pBLAST and HHPred) to identify putative signal peptides, transmembrane domains, protein functional domains and homologies (Table 1 and Table S2). In total we suggest re-annotation of 29 genes (listed in Table S2). For parsing, 17 categories were used (Figure 3) based on predicted function or localization according to the criteria described in this study or in previous reports [30].

Detection of Known Organelle Proteins in ESV and PV Datasets
Fewer than 20 factors are known to associate to ESVs at any given point during ESV neogenesis and maturation [10,16,26,29,32,35-39]. Only 3 localize exclusively to ESVs at 13 h post induction (p.i.): CWP1 and the large fragment of the proteolytically processed CWP2 in the fluid phase fraction, and CWP3 together with the small fragment of CWP2 in the condensed core [28]. CWP-derived tryptic peptides are detected very inefficiently in MS, most likely due to the well-documented extensive intra- and intermolecular cross-linking by disulfide and isopeptide bonds [40,41]. The proportion of DTT (dithiothreitol)-resistant high molecular weight complexes of CWP3-GFP can be estimated in the Western blot in Figure 1C (ESV). Nevertheless, peptides derived from the organelle marker CWP3-GFP or endogenous CWP3 were detected exclusively in ESV-enriched samples by MS (average quantitative value of 2.7) albeit only at a stringency of 1 unique peptide and 50% protein probability. GFP-derived peptides were detected in all 3 ESV-enriched samples with an average quantitative value of 1.7.
Of the 72 proteins in the ESV dataset, 4 are either known or predicted to associate to ESV organelles. In particular, we identified two COPII-coat components Sec13 (Gl137698) and Erp3 (Gl15204) (trafficking proteins, Figure 3, Table 1). The association of other COPII-components with ESVs, such as GlSar1 and GlSec31, during early development was observed previously [26,29]. Additional ESV-associated factors are represented by 2 proteasome proteins (Figure 3, Table 1). Proteasome complexes are recruited to ESV membranes during early encystation, perhaps in connection with post-ER quality control and associated degradation processes [32]. It is important to note that with the exception of CWP1-3, all other previously described ESV-associated factors are also expressed in trophozoites. Since their expression is not strictly stage-specific and they are recruited to ESV organelles from other subcellular localizations during encystation, their absence in the ESV-specific dataset but detection in the data intersection instead is not surprising.
The 15 PV proteins which have been identified in trophozoites and encysting cells thus far include soluble N-ethylmaleimide-sensitive-factor attachment receptors (SNAREs), components of the clathrin/adaptor protein (AP)-mediated trafficking machinery, Rab11, an acidic phosphatase and an encystation-specific protease termed ESCP [13,16-18,29,42,43]. Since most of these PV-associated proteins are involved in vesicle trafficking, they have secondary localizations, i.e. the cytoplasm, the plasma membrane, the ER or ESVs. Not surprisingly, we recovered the majority of these trafficking proteins in the data intersect. However, the Qa3-SNARE homolog syntaxin1 (Gl96994) [29,43] was specifically included in the PV dataset. In addition to this membrane trafficking factor we recovered 3 previously described PV-associated proteins including two VSPs (Gl33279, Gl13194) [44,45] and, when lowering the stringency to 1 unique peptide, the encystation-specific cysteine protease ESCP (Gl14566) [13].
Taken together, we detected 5 known ESV-associated proteins, including our organelle marker CWP3-GFP, in the ESV dataset, whereas 4 known PV-associated factors were contained in the PV dataset. Because of their extensive cross-linking and post-translational modifications, CWP1 and CWP2, comprising the fluid component of the cyst wall material (CWM) in ESVs at 13 hours p.i., were detected only with relaxed stringency.
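The two stringency levels referred to in this section (a minimum of 2 unique peptides with 80% protein probability for the main datasets, and 1 unique peptide with 50% protein probability for the relaxed search) can be expressed as a simple threshold filter. The Python sketch below is only an illustration: the filtering in this study was performed with Scaffold, the record type and field names are hypothetical, the example hits are invented, and treating the thresholds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinHit:
    accession: str               # hypothetical label
    unique_peptides: int
    protein_probability: float   # in percent


def passes(hit, min_unique_peptides=2, min_probability=80.0):
    """Return True if a hit meets both thresholds (assumed inclusive)."""
    return (hit.unique_peptides >= min_unique_peptides
            and hit.protein_probability >= min_probability)


# Invented example hits, for illustration only.
hits = [
    ProteinHit("hypothetical_A", 5, 99.0),
    ProteinHit("hypothetical_B", 1, 62.0),
    ProteinHit("hypothetical_C", 1, 40.0),
]

standard = [h.accession for h in hits if passes(h)]
relaxed = [h.accession for h in hits if passes(h, 1, 50.0)]

print(standard)  # prints "['hypothetical_A']"
print(relaxed)   # prints "['hypothetical_A', 'hypothetical_B']"
```

A hit such as the second one is recovered only under the relaxed criteria, which is the situation described above for the CWP-derived peptides.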

Subcellular Localization of Candidate PV Proteins
As a first partial validation of the PV-derived dataset and to determine whether novel PV proteins are contained within the dataset, we expressed 8 proteins as C-terminally hemagglutinin (HA)-tagged variants in G. lamblia (Table 1, Figure 4B). In addition to proteins involved in membrane traffic, we focused on additional predicted transmembrane or membrane interacting proteins. We localized 3 tagged candidates involved in SNARE-mediated membrane fusion: (i) a Qa3-SNARE (Gl96994) homologous to syntaxin 1a [29,43], (ii) a putative Sec1 protein (Gl15104) which forms heterodimers with syntaxin 1a [46], and (iii) a putative alpha-soluble N-ethylmaleimide-sensitive factor attachment protein (alpha-SNAP) (Gl16521). All 3 reporters localized predominantly to the cortical area of trophozoites, consistent with an association to PVs (Figure 5A-C and Figure S5). Our localization data of the Giardia Qa3-SNARE are in agreement with previous studies localizing this protein to PVs [43].
We localized two proteins with a predicted function in endocytic/endosomal transport: the adaptor protein (AP) large chain subunit BetaA (Gl15339) and a VPS (vacuolar protein sorting) 46a homolog (Gl15472), a putative component of a giardial endosomal sorting complex required for transport (ESCRT) III complex [47]. Localization of the tagged AP large chain subunit revealed a distribution consistent with the Giardia ER (Figure 5D). This result is partially consistent with earlier studies detecting an AP1 subunit in PVs and the ER, and the AP2 subunit μ2 in PVs and the plasma membrane [17,18]. The tagged giardial VPS46a homolog was detected in the cytoplasm, and probably localizes also to the plasma membrane and/or to PVs (Figure 5E). Subcellular localization studies on the vacuolar ATP synthase subunit (Gl8559, Figure 5F) and the multidrug resistance (MDR)-like protein (Gl40224, Figure 5G) showed perinuclear staining and distribution typical of the Giardia ER. One hypothetical protein (Gl4270, Figure 5H) was detected in vesicle-like structures at the plasma membrane and in the cytoplasm. Whether these structures indeed correspond to PVs will require further investigation.

Subcellular Localization of Candidate ESV Proteins
To validate the ESV-organelle dataset, 16 proteins were chosen for ectopic expression as C-terminally HA-tagged variants in G. lamblia (Table 1, Figure 4a) based on the following criteria: (a) Giardia-specific hypothetical proteins since they are unique to Giardia and might associate with ESVs; (b) proteins whose mRNAs were significantly upregulated during encystation [48], representing putative key factors in differentiation; and (c) proteins with predicted transmembrane domains and/or signal peptides, as they may be trafficked to ESVs. In addition, we wanted to localize proteins involved in (d) sugar metabolism or (e) sugar transport as these may represent key factors required for cyst wall glycan synthesis. Another category (f) was composed of predicted transporters which may be involved in direct import of substrates into ESV organelles from the cytoplasm. We also determined the subcellular localization of (g) a newly identified calcium-binding protein. Finally, we selected (h) a signal recognition particle component to test for recruitment of complexes for co-translational insertion of proteins across the ER and ESV membranes, as suggested by the electron microscopy data (see below in ''Factors for co-translational protein insertion'').
Subcellular localization studies of 16 ESV candidate proteins revealed 11 candidates which had a distribution consistent with ER localization; 6 were detected exclusively in the ER while 5 also localized to ESVs. Four tagged candidates showed cytoplasmic localization and one candidate localized to the ventral disc (Table 1, Figure 4A).
A) Novel ESV proteins. Only two ER-resident proteins are known to reach ESVs without being secreted to the surface of the cell: heat shock protein (Hsp) 70/Binding protein (BiP), which cycles between the ER and ESVs [32], and a subtilisin-like proprotein convertase termed gSPC [39]. Here we identify 3 additional hypothetical proteins that localize to the ER and ESVs during encystation: (i) At 13 h p.i., the 202 amino acid-long HA-tagged Gl14458 product was detected in the perinuclear ER, in ER-associated punctate structures reminiscent of ER exit sites (ERES), and overlapping with emerging ESVs in the cytoplasm (Figure 6B). Transgenic cells expressing Gl14458-HA showed a significant delay or, in most cells, a block of CWP1 export from the ER and a reduction in the number of ESVs compared with wild type cells (Figure 6A, B) or transgenic cells expressing unrelated HA-tagged products (data not shown). (ii) ORF Gl32419 is stage-specifically upregulated [49] and codes for a 564 amino acid protein of unknown function (Table S2) whose HA-tagged variant localizes in the ER, including structures reminiscent of ERES [27], as well as with maturing ESVs (Figure 6C, D). In cells with mature ESVs containing condensed cores the protein localized to ER membranes, in particular to those adjacent to ESVs (Figure 6C-E). (iii) The predicted Gl25205 product is a hypothetical Giardia-specific multipass membrane protein of 1246 amino acids with 14 hydrophobic domains. An epitope-tagged variant localized to the ER and in many cases also to morphologically normal ESVs at 13 hours p.i. (Figure 6F).
B) Factors involved in cyst wall glycan synthesis. ORF Gl15483 codes for a UDP-N-acetylglucosamine (UDP-GlcNAc) sugar transporter [50] with a reported localization at the perinuclear ER and at peripheral vesicles distinct from PVs [51]. This observation is only partially consistent with our localization of an HA-tagged variant in the ER but not in other organelles in encysting trophozoites at 13 h p.i. (Figure 7A, 7B). In dual labeling experiments, signal overlap of Gl15483-HA and CWP1 in the perinuclear ER and in areas corresponding to punctate peripheral ER regions or emerging ESVs was observed (Figure 7A, arrows). When mature ESVs were present, the overlap of the two proteins was restricted mostly to the perinuclear ER (Figure 7B). Notably, Gl15483-HA-expressing cells showed delay of CWP1 export from the ER and accumulation in the perinuclear ER at 13 h p.i. (Figure 7A, 7B).
ORF Gl8382 codes for a putative UDP-GlcNAc-4'-epimerase (GALE) homolog in Giardia with an N-terminal signal sequence. Manual protein sequence analysis revealed that the giardial protein harbors all conserved motifs required for the enzyme's function [52] (Figure S3). An HA-tagged variant localized to the ER (Figure 7C). These transgenic cells retained CWP1 in the ER at 13 h p.i. and no mature ESVs were formed. The correlation, if any, between this phenotype and the predicted function of the protein, i.e. conversion of UDP-GlcNAc into the UDP-N-acetylgalactosamine (UDP-GalNAc) monomer of the cyst wall glycan [53] in the ER lumen, remains to be determined.
C) Factors for co-translational protein insertion. The panel of candidate ESV proteins contains a substantial number (8; 11.1%) of ribosomal or ribosome-associated proteins (Figure 3, Table 1) whereas the PV dataset only contained 2 of these potential contaminants. Interestingly, in transmission electron microscopy (TEM) images we found that, in addition to decorating rough ER membranes, ribosomes distinctly associated with ESV membranes (Figure 8A, inset). Taken together with the identification of a component of the signal recognition particle, SRP54 (Gl15156), in the ESV dataset, this suggests that ESV cargo proteins could be inserted co-translationally directly into the ESV lumen. Since transgenic cells expressing epitope-tagged ribosomal subunits were not viable, we tested this hypothesis indirectly by expression of an epitope-tagged SRP54 variant (Table 1). The tagged product localized in a punctate, distributed pattern and signal overlap with CWP1 was detected predominantly in smaller ESVs (Figure 8B, arrows). The possibility that proteins, presumably CWPs, are co-translationally inserted directly into ESVs is intriguing but requires additional experimental testing.
In summary (see Table 1), our localization studies uncovered 5 ESV-associated proteins including 3 Giardia-specific proteins (Gl32419, Gl14458, Gl25205), the signal recognition particle subunit SRP54 (Gl15156), and a UDP-GlcNAc transporter (Gl15483). Six tagged candidates localized exclusively to the ER. Among these were a UDP-GlcNAc-4'-epimerase (GALE) possibly involved in synthesis of the CW glycan, two hypothetical proteins (Gl10221, Gl22136), a predicted synaptic protein SC2 (Gl88581) and a WD40 repeat protein (Gl15956) both of unknown function, and an amino acid transporter (Gl11299). Four tagged candidates showed cytoplasmic localization: two Giardia-specific hypothetical proteins (Gl7350, Gl9157), a calcium-binding protein (Gl7207), and a glycosyl transferase family 8 protein (Gl11595). Finally, a Giardia-specific protein of unknown function (Gl87926) was detected at the ventral disc. Taken together, this preliminary analysis suggests that i) abundant novel ESV cargo proteins are not present in mature ESVs, and ii) while large amounts of ER-derived contaminants can be eliminated by subtraction, many ER proteins remain in the dataset. The most likely explanation for this is the intimate physical contact and direct connections between the ER and ESVs as illustrated in Figure 8A, which survive cell disruption and organelle preparation.

Discussion
ESV organelles are inducible Golgi-like membrane compartments for accumulation, processing, sorting, and export of the Giardia cyst wall material during differentiation of trophozoites into cysts [26,28,29]. Their de novo genesis and maturation to secretion-competent organelles is only partially understood: fewer than 20 ESV-associated factors (among them the 3 CWPs) have been identified or characterized [10,16,26,29,32,35-39]. However, no defining ESV-specific, peripherally associated or membrane-bound factor has been identified. Previous attempts to generate an ESV proteome using cell fractionation and density gradient centrifugation yielded datasets which revealed additional ESV factors but also contained very high levels of contaminating proteins [32] (Hehl A.B., unpublished). Here, we tested a conceptually new approach to generate highly enriched organelle proteome datasets for ESVs and PVs resulting in identification of a candidate set of 72 ESV- and 82 PV-associated proteins.

The ER and ESVs Maintain a Broad Range of Interactions
Partial validation of the ''ESV-specific'' protein dataset (72 candidates) by subcellular localization studies of 16 selected ESV candidate proteins revealed 5 proteins which localized to ESVs, but also many ER proteins: 11 candidates had a distribution consistent with ER localization; 6 thereof were detected exclusively in the ER. This suggests that elements of ER organelles remain physically linked to and are co-sorted with labeled ESVs, but not PVs.
ESVs are closely associated with the ER during their neogenesis and nucleated from ERES in a COPII-dependent process [26,27]. Although light microscopy data suggests that ESVs become physically distinct from the ER after neogenesis [27,28], previously published electron microscopy (EM) data [20,42,54] and data in the present study (Figure 8A) clearly show membrane continuities between the organelle systems. The imaging data is supported by evidence for cycling of ER-resident proteins such as Hsp70/BiP between the two organelles [32]. An extensive network of tubular membrane connections mediating exchange of CWP1 between maturing ESVs makes a definition of the boundaries for ESVs even more challenging [26]: there is a possibility that this dynamic network is not restricted to ESV organelles but establishes direct connections with the ER [28]. Further, the recruitment of proteasomes [32] and ribosomes to ER and ESV membranes, as shown also in this study, is an additional common feature of the two organelles. This suggests that basic trafficking-related processes such as co-translational import of secreted proteins, folding, retro-translocation, and associated degradation processes start at the ER level but may extend beyond ESV genesis.
[Figure 4 legend (partial): Table 1 and Table S2. ER: endoplasmic reticulum; ESVs: encystation-specific vesicles; CYT: cytoplasm; VD: ventral disc; PVs: peripheral vesicles; PM: plasma membrane. doi:10.1371/journal.pone.0094089.g004]
Consistent with the premise of subtractive elimination of ER proteins from organelle-specific factors, the MS data intersect was highly enriched in abundant ER-resident proteins, e.g. protein disulfide isomerase (PDI) 1-5, Hsp70/BiP and Hsp90/Grp94. On the other hand, many proteins in the ESV dataset showed a typical ER distribution, although 5 proteins also localized to ESVs. A likely explanation is that many tested candidates are present in ER subdomains closely associated or directly connected to ESVs. When encysting Giardia cells are subjected to cell disruption by sonication or other tested methods such as nitrogen cavitation (not shown), these ER domains remain physically linked to ESVs and are enriched accordingly in the sorting process.
Despite elimination of ~90% of all hits in the dataset intersection, which was enriched for known ER proteins, the ESV dataset still contained a large number of proteins localizing to the ER. There are several possible explanations for this surprising finding, the most likely being the intimate association between the ER and ESVs as discussed above. The physical connections and membrane continuities may be resistant to our cell disruption protocols and may thus limit the extraction of ESVs from this subcellular context. In contrast, ER membranes and PVs do not occupy the same cellular space: the ER network extends throughout the cytoplasm but not into the cortical layer just below the plasma membrane into which PVs are embedded [55]. Although the interface between the two organelle systems includes postulated direct contact points [8], the subcellular segregation of the two compartments appears to make separation by cell disruption easier.
We performed a preliminary validation of the "ESV-specific" dataset by localizing 16 tagged factors. The results suggested that the dataset contained many ER proteins rather than novel ESV factors. Based on this we propose two non-mutually exclusive interpretations for the lack of novel ESV-specific proteins in our datasets: i) The organelle enrichment strategy and post-sorting elimination of contaminants worked well for filtering out abundant generic ER proteins, but is less efficient in eliminating minor contaminants and/or proteins that remain attached to the fluorescently labeled organelles. ii) ESVs and the ER are highly distinct with respect to abundant luminal proteins such as CWPs, modifying factors, or chaperones. However, both organelles share most membrane or peripherally associated proteins. All available data on stage-specific gene expression during encystation suggest that only CWP genes are strictly stage-specifically regulated (i.e. completely "off" in trophozoites) whilst the remaining (~20) significantly modulated "encystation" genes are upregulated from a basal level in trophozoites during the first 7 hours p.i. [48]. Since CWPs as cargo proteins are certainly more abundant than ESV membrane or organelle-associated proteins, any novel representatives of the latter are difficult to identify within a still relatively high background of ER proteins. In fact, the apparent lack of abundant novel proteins in ESVs exacerbates the appearance of false-positive ER factors in the organelle-specific dataset.

A Proposal for Cargo-driven ESV Neogenesis
The emerging picture in Giardia encystation is that, aside from synthesis of the bulk cyst wall proteins, only relatively small adjustments of expression in some genes (e.g. enzymes for the synthesis of the cyst wall glycan) are required for encysting cells to produce mature ESVs [48,56]. This also suggests that morphogenesis of ESV organelles could be driven by accumulation of cargo rather than by specific organelle-associated factors, analogous to the formation of dense core secretory granules (DCSG) in endocrine/neuroendocrine cells [57], or in ciliates [58]. In early electron microscopy studies, ESV formation was described as aggregation of electron-dense material in ER membrane-bounded compartments, followed by growth via direct addition of newly synthesized CWPs until large organelles are formed [21,54,59]. A more recent model posits that ESV formation is the result of self-organizing properties, mainly of CWP3, leading to formation of a dense core [28]. The ability of CWP1 and 2 to form highly cross-linked complexes [21,41], and of CWP1 to bind directly to the GalNAc homopolymers [60], which constitutes 60% of the cyst wall material, provides the prerequisites for distributing the extracellular matrix material evenly on the surface of the parasite before initiating polymerization. The presence of basic and simple machinery for ESV formation is supported by the observation that the expression of CWP1 and CWP2 in human embryonic kidney cells is sufficient to induce accumulation in membrane compartments and secretion of the proteins [61]. This observation is in line with granule formation in non-granule forming cells upon expression of different dense core granule cargo proteins, including pro-vasopressin, chromogranin A or von Willebrand factor [62-64], suggesting that the cargo proteins themselves induce the formation of their own carriers through accumulation.

Figure 6. Subcellular localization of three ESV candidates: Gl14458HA, Gl32419HA, and Gl25205HA. Representative localization of C-terminally HA-tagged variants after inducible (Gl14458, Gl32419) or constitutive expression (Gl25205) by confocal microscopy at 13h p.i. Anti-CWP1 was used to detect ESV organelles. A) Wild type (WB) encysting trophozoite at 13h p.i.: CWP1 localized to "doughnut-shaped" ESVs which is typical for this time point [28]. No CWP1-signal in the perinuclear ER was visible. B) Gl14458HA at 13h p.i.: Gl14458HA was detected primarily in the perinuclear ER and overlaps with CWP1 in ESVs. Note the retention of CWP1 in the perinuclear ER and reduction of ESV numbers. In fact, the majority of induced cells in the population did not produce ESVs at all. C, D, E) Representative subcellular localizations of Gl32419-HA at 13h p.i. C) Partial signal overlap of Gl32419HA (green) with CWP1 (red) in knob-like structures, reminiscent of ESV neogenesis at ER exit sites [27]. Alternatively, Gl32419HA localized to the ER D) co-localizing with CWP1 primarily in the perinuclear ER and in ESVs suggesting delayed export of CWM to ESVs. E) In cells with canonical mature ESVs no signal overlap of Gl32419HA and CWP1 was observed. F) Gl25205HA at 13h p.i. was detected in the ER and in ESVs with occasional signal overlap with CWP1. Antibodies: anti-HA high affinity from rat, Alexa488-conjugated goat anti-rat (green), and Texas red-conjugated anti-CWP1 (red). Nuclear DNA was labeled with DAPI. pCWP1: inducible CWP1 promoter; pendo: endogenous promoter; int: stable integration into the genome; epi: episomal maintenance of the plasmid; WB: wildtype. Scale bar: 1.5 µm. doi:10.1371/journal.pone.0094089.g006
Thus, ESV formation might be driven by progressive accumulation of CWPs by a "sorting by retention" mechanism, while ER-resident proteins such as Hsp70/BiP are removed via low-density vesicles or tubular connections between ER and ESVs as the organelles mature [32]. Taken together, the proteome data support a scenario for ESV formation and maturation which relies strongly on inherent properties of cargo proteins and likely only a few, as yet unidentified, additional components, rather than on a dedicated ESV-specific, organelle-associated machinery driving morphogenesis.
Analysis of the remaining candidates in the data set might bring to light further proteins localizing to ESV organelles. However, none of these genes appear to be significantly upregulated during encystation [48]. Thus, unless significant translational control comes into play [36] we do not expect any additional highly abundant proteins to be discovered exclusively in ESVs.
In conclusion, identifying one or more low-abundance proteins that could serve as defining factors for ESVs as post-ER organelles within the only regulated secretory transport pathway in Giardia remains a significant challenge.

Ribosomal Proteins Are Enriched in the ESV Fraction
A comparison of the 72 ESV-specific and 82 PV-specific candidates revealed a significant enrichment of translation/ribosome proteins in the former. Using transmission electron and immunofluorescence microscopy we found support for recruitment of ribosomes not only to the ER but also to ESV membranes during the differentiation process. However, additional functional verification is required, in particular to test whether co-translational insertion of proteins directed to the regulated secretory pathway may occur directly into ESVs.
In eukaryotic cells, ribosomes localize to the cytoplasm, the nuclear envelope, and the rough ER, giving the latter its typical appearance in transmission electron microscopy. While an association of ribosomes with Golgi membranes in eukaryotic cells containing a steady state Golgi organelle was not observed, giardial ribosomes can be visualized on ESV membranes. Co-translational insertion of secreted proteins directly across ESV membranes might be a consequence of the requirement for producing large quantities of CWP in the relatively short time when ESVs grow maximally. The process of CWP synthesis, translocation and folding clearly begins at the ER level from which CWPs are exported in a COPII-dependent manner [27]. However, the amount of CWPs detected in the ER drops significantly after establishment of small immature ESVs (Figure 5 in [27]). Direct co-translational import of CWPs via ESV membranes is one possible explanation for this observation.

Figure 7. Representative subcellular localization of C-terminally HA-tagged variants of UDP-GlcNAc transporter (Gl15483) and the putative UDP-GlcNAc-4'-epimerase (Gl8382) in encysting transgenic cells. A, B) At 13h p.i.: the HA-tagged transporter was detected in the ER. Co-localization with CWP1 was observed in the perinuclear ER and in areas corresponding to distinct ER regions or early ESVs (A, arrows). CWP1 is delayed in the perinuclear ER of Gl15483HA-expressing cells where it overlaps with Gl15483HA. In mature ESVs, no co-localization of the two proteins was observed. C) Localization of the putative UDP-GlcNAc-4'-epimerase (Gl8382HA) at 13h p.i. in the ER together with CWP1 whose export is delayed. Antibodies: anti-HA high affinity from rat, Alexa488-conjugated goat anti-rat (green), and Texas red-conjugated anti-CWP1 (red). Nuclear DNA was labeled with DAPI. pCWP1: inducible CWP1 promoter; int: stable integration into the genome; scale bar: 1.5 µm. doi:10.1371/journal.pone.0094089.g007
Directed translocation of proteins across ESV membranes towards the cytosol as part of a quality control system was inferred from the observation that proteasome complexes were recruited to the vicinity of developing ESV organelles [26]. Pore complexes such as Sec61, for which an alpha and a gamma subunit are annotated in the Giardia genome database, are required for co-translational insertion and are also strongly implicated in retro-translocation to the cytoplasm [65]. In support of co-translational insertion of proteins across ESV membranes, the 54 kDa signal recognition particle component SRP54 (Gl15156) has been localized partially to ESVs (Figure 8B). However, none of these factors are distributed in an organelle-specific manner, and it is currently not possible to design experiments that dissect the directionality of protein translocation across ESV or ER membranes.

Is the Cyst Wall Sugar Monomer UDP-GalNAc Synthesized in the ER?
The Giardia cyst wall consists of 3 proteins (CWP1-3) and a β(1-3)-GalNAc homopolymer which makes up about 60% of the cyst wall [24,25]. While the protein components are trafficked via ESV organelles to the surface of the cell, the place of synthesis, transport to the surface, as well as timing and manner of incorporation of the sugar components into the cyst wall remain largely unknown. The sparse literature on the subject suggests synthesis of the cyst wall monomer UDP-GalNAc from endogenous glucose by a series of stage-specifically regulated, enzymatic reactions [53]. A late step, i.e. conversion of UDP-GlcNAc into UDP-GalNAc, was proposed to be performed by a cytosolic UDP-GlcNAc-4'-epimerase or so-called GALE (Gl7982). While some experimental data showed that the enzyme converted UDP-GlcNAc into UDP-GalNAc during encystation, investigation of enzyme kinetics showed that the reverse reaction towards production of UDP-GlcNAc was clearly favored, raising significant doubts about a productive synthesis of UDP-GalNAc in the cytoplasm [66].
In this study, two important proteins potentially involved in this process were identified in the ESV dataset: i) the only nucleotide sugar transporter (Gl15483) identified in the Giardia genome project [50] which specifically transports UDP-GlcNAc from the cytoplasm to the ER lumen [51], and ii) a putative UDP-GlcNAc-4'-epimerase (GALE) (Gl8382). We detected epitope-tagged variants of the epimerase in the ER, and the transporter mainly showed distribution in the perinuclear ER and early ESVs, where its signal overlapped with that of CWP1. N-glycosylation of Giardia proteins is restricted to addition of GlcNAc1-2 to asparagine [67]. Consistent with this, the parasite lacks genes required for synthesis of the typical eukaryotic core-oligosaccharide GlcNAc2Man9Glc3 and for further N-glycan processing in the ER and Golgi.

Figure 8 (legend, partial). Recruitment of ribosomes to ER membranes (tubular structures) and to early ESVs (round, electron-dense structures) is observed. Ribosomes are visible as small, round and highly electron-dense structures arrayed along the cytoplasmic side of ESV and ER membranes. B) Immunofluorescence analysis of cells expressing a C-terminally HA-tagged signal recognition particle component SRP54 (line pendo-15156HA-epi). The micrograph shows punctate localization of SRP54-HA and partial overlap with CWP1 accumulation (arrows). Cytoplasmic and ER membrane associated SRP54-HA generates a high background signal, making a detection of the protein at ESV membranes difficult. Antibodies: anti-HA high affinity from rat, Alexa488-conjugated goat anti-rat (green), and Texas red-conjugated anti-CWP1 (red). Nuclei were labeled with DAPI. pCWP1: inducible CWP1 promoter; epi: episomal maintenance of expression vector; scale bar: 1.5 µm. doi:10.1371/journal.pone.0094089.g008
While the UDP-GlcNAc-transporter Gl15483 in the ER membrane imports UDP-GlcNAc used for N-glycosylation, the presence of an ER-localized UDP-GlcNAc-4'-epimerase converting UDP-GlcNAc into UDP-GalNAc indicates involvement of the putative GALE enzyme (Gl8382) in producing the UDP-GalNAc monomer for the cyst wall glycan in the ER.

Opportunities and Technical Limitations of Dual Organelle Sorting and in Silico Processing of Mass Spectrometry Data
In proteomic studies, the purity of the biological sample is of utmost importance for a successful analysis. One of the most crucial steps is subcellular fractionation. Despite considerable efforts to optimize protocols for purification of Giardia organelles, the levels of contaminating proteins from non-target organelles and cellular structures remain high [32,33]. The most frequently used subcellular fractionation techniques applied in organellar proteomics are density-based gradient centrifugation, affinity-based isolation, free flow electrophoresis, and recently also flow cytometry [68-71]. Fluorescence-based organelle sorting by flow cytometry is challenging because of the small size of organelles which usually results in reduced fluorescence intensity. In the case of Giardia organelles, labeling with a highly expressed luminal GFP-tagged organelle marker (CWP3-GFP) in ESVs or the endocytic uptake of a fluid-phase fluorescent dye by PV organelles created unique opportunities for intense organelle labeling. This was sufficient to clearly detect and enrich the organelles simultaneously by flow cytometry despite their small size and, most importantly, to achieve a 100% relative enrichment (i.e. 100% separation) of ESVs and PVs which was a precondition for the subsequent subtractive approach. Analysis of ESV- and PV-enriched samples by shotgun mass spectrometry revealed high reproducibility between 3 independent experiments.
The limited purity of organelle preparations and the high sensitivity of current mass spectrometers require additional measures to address the large quantities of false-positive hits. Researchers have developed different (in silico) strategies for the elimination of contaminating proteins from organellar datasets. A proteomic study on Giardia mitochondria-relic organelles (mitosomes) using gradient centrifugation took advantage of the organelle distribution into two neighboring fractions [33]. Using isobaric tags for relative and absolute quantitation (iTRAQ) mass spectrometry the relative distribution of mitosomal marker proteins between the two fractions was evaluated, and novel putative proteins were identified based on their similar distribution ratio [33]. Another study focusing on the proteome of Spironucleus hydrogenosomes utilized the distribution of the target organelle into two gradient fractions of distinct densities [72]. After mass spectrometry analysis, putative organelle-specific proteins were identified by their co-purification with organelle marker proteins. Both approaches significantly reduced the incidence of contaminants in the resulting large datasets, but the proportion of organelle-specific proteins remained low.
In our case, simultaneous enrichment of ESV and PV organelles makes subtractive approaches to identify contaminating MS hits in silico possible. Using this approach we removed 1059 hits which were common to both organelle fractions, including a large proportion (86%) of ribosome/translation and ER-derived contaminants which constitute a major challenge in organelle proteomic studies. Accordingly, among the 72 ESV and 82 PV candidates only a single predicted ER protein was identified in each dataset. However, preliminary evaluation revealed a large proportion of hitherto unknown ER proteins.
Two factors may contribute to the low discovery rate of organelle proteins: i) peripherally associated organelle proteins may be partially or completely lost during the purification and sorting process, ii) the subtractive approach to remove contaminating proteins likely also eliminates specific categories of organelle-associated proteins, i.e. those with secondary localizations or with large cytoplasmic pools. Examples are the small GTPases Rab1, Rab11, Arf1 and COPI components which also localize to the ER and other membranes. In addition, some peripherally associated factors are recruited to ESVs only during specific phases of the differentiation process, e.g. Rab1, COPI [26,29], or members of the SNARE family [43]. In summary, the success of a simultaneous sorting approach strongly depends on how cleanly the differentially labeled organelles can be prepared during cell disruption and how similar their cellular context is. The major limitation in our case is the close association of ESVs but not PVs with the ER subdomains. Since the cellular context of the two analyzed organelles differs greatly, subtractive analysis appears to be more efficient for PV candidates but still led to many false-positive candidates in the ESV dataset. Thus, subtractive analysis of datasets derived from simultaneously sorted organelles is a useful strategy to discover organelle-specific factors, but the degree of success depends strongly on the feasibility of clean extraction of target organelles from their subcellular context.

Giardia Cell Culture and in Vitro Encystation
Giardia lamblia WBC6 (ATCC catalog number 50803) trophozoites were grown under microaerophilic conditions in 11 ml culture tubes (Nunc, cat. 156758) or triple flasks (Nunc, cat. 132867) containing TYI-S-33 medium supplemented with 10% fetal bovine serum and bovine bile according to standard protocols [41]. Parasites were harvested by chilling the tubes on ice for 30 minutes (for flasks: 1 hour in ice water) to detach adherent cells and collected by centrifugation (900 × g, 10 minutes, 4 °C). Encystation was induced using the two-step method as described previously [48] by cultivating the trophozoites in bile-free medium for 44 hours and thereafter in medium (pH 7.85) containing porcine bile.

Sample Preparation for Flow Cytometry
ESV-organelle staining. CWP3-GFP expressing cells [28] and WB wild type cells (control) were grown in triple flasks (2 × 800 ml) and kept in encystation conditions for 13 hours as described above. Cells were harvested and resuspended in 20 ml of encystation medium. To allow oxidative chromophore formation without damage to the cells, the cell suspension was chilled on ice and dispersed into a 6-well plate (Sigma, cat. Z707759) and exposed to air on ice overnight. To complete GFP folding, the ice-cold cell suspension was collected from the plate and incubated in microaerophilic conditions for 30 minutes at 37 °C.
PV-organelle labeling. Wild type trophozoites were grown in triple flasks and harvested as described above. Cells were washed twice in 10 ml 1x PBS (900 × g, 10 minutes, 4 °C) and resuspended in 500 µl supplemented PBS (5 mM glucose, 5 mM cysteine, 0.1 mM ascorbic acid) containing 4 mg/ml dextran AlexaFluor-647 (Molecular Probes Inc., cat. D22914). Endocytic uptake of the fluorescent dye by PV organelles was achieved at 37 °C for 30 minutes, protected from light.
All samples were washed twice in 10 ml 1x PBS (900 × g, 10 minutes, 4 °C) and resuspended in 5 ml 1x PBS. After addition of protease inhibitor cocktail (Calbiochem, cat. 539131) and phenylmethanesulfonyl fluoride (PMSF, Sigma, cat. P7626), the cells were disrupted by four rounds of mild sonication (Branson Sonifier 250, Branson Ultrasonics Corporation, 4 × 60 pulses, duty cycle 20%, output control 1.5) on ice. To remove remaining intact cells and cysts, the cell suspensions were passed through a 5 µm filter (MILLEX-SV 5.00 µm, Millipore, cat. SLSV25LS). Prior to the sort, the two cell lysates were mixed at a ratio of 1:4 to obtain similar numbers of target events, i.e. GFP-positive and AF647-positive events, in the mixture.

Flow Cytometry-based Organelle Sorting
Flow cytometry-based sorting of organelles was performed on a BD FACSAria III™ cell sorter. For data acquisition and processing, the BD FACSDiva™ software (version 6.1.3) was used. To achieve maximal speed, the sort was performed using a nozzle with a 70 micrometer orifice diameter at 4.83 bar sheath pressure. GFP was excited by a 488 nm laser and emission was detected using a 525/45 band pass filter. AlexaFluor-647 was excited by a 633 nm laser and emission was detected using a 670/40 band pass filter. Given the small organelle size, a considerable proportion of observed events stemmed from particulate and electronic noise. All the fluorescent and light scatter parameters were estimated by the height of the voltage pulse generated by each event. The detection threshold was defined as a logical combination of green (GFP) and red (AF647) signal value using the "OR" functional operator. The organelle populations were defined by a parent gate (P3) based on FSC-H (forward scatter-height) and SSC-H (side scatter-height) (Figure 1). Yellow fluorescent protein (YFP) control beads (SPHERO™ Fluorescent Nanospheres, Spherotech Inc., cat. FP-0552-2) in the size range of the organelles (400 to 600 nm) were used to estimate the level of particulate and/or electronic noise with selected instrument settings. To select for GFP-positive and AF647-positive events out of the mixed organelle population, gates P1 and P2 were set in a bivariate dot-plot. An unlabeled cell suspension was used as negative control to define the gate positions. To attain maximal purity, a sort precision mode of 0/32/0 was chosen. The sort was performed with an average event rate of 25,000 to 30,000 per second, an average sort rate of 600 events per second and a mean sort efficiency of 80%. Twenty million GFP-positive and AF647-positive events were collected in separate 5 ml polystyrene tubes (BD Biosciences, cat. 352052).
For quality control of the sort, the collected material was analyzed by flow cytometry using the same settings as for the sort (Figure 1).
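The threshold and gating logic described in this section can be illustrated with a minimal Python sketch. It is not the instrument configuration used in this study: the `Event` class, the function name, and every numeric limit are hypothetical placeholders, chosen only to show how an "OR" detection threshold, a scatter-based parent gate, and two mutually exclusive fluorescence gates combine.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One detected event, described by pulse heights (arbitrary units)."""
    fsc_h: float    # forward scatter height
    ssc_h: float    # side scatter height
    gfp_h: float    # green channel (GFP) height
    af647_h: float  # red channel (AF647) height


# All limits below are hypothetical and serve illustration only.
THRESHOLD_GFP = 200.0
THRESHOLD_AF647 = 200.0
PARENT_FSC = (50.0, 5000.0)   # parent gate range on FSC-H
PARENT_SSC = (50.0, 5000.0)   # parent gate range on SSC-H
POSITIVE_MIN = 1000.0         # minimum height to count as positive
OTHER_MAX = 300.0             # maximum height allowed on the other channel


def classify(ev: Event) -> str:
    """Assign an event to a sort decision following the described scheme."""
    # Detection threshold: green OR red signal must exceed its threshold.
    if not (ev.gfp_h >= THRESHOLD_GFP or ev.af647_h >= THRESHOLD_AF647):
        return "below_threshold"
    # Parent gate on forward and side scatter height.
    in_parent = (PARENT_FSC[0] <= ev.fsc_h <= PARENT_FSC[1]
                 and PARENT_SSC[0] <= ev.ssc_h <= PARENT_SSC[1])
    if not in_parent:
        return "outside_parent_gate"
    # Two non-overlapping gates in the bivariate GFP/AF647 plot.
    if ev.gfp_h >= POSITIVE_MIN and ev.af647_h < OTHER_MAX:
        return "GFP_positive"
    if ev.af647_h >= POSITIVE_MIN and ev.gfp_h < OTHER_MAX:
        return "AF647_positive"
    return "unsorted"
```

In this sketch an event that is bright in both channels is left unsorted, which mirrors the purity-oriented intent of the sort but is an assumption about gate placement rather than a reported setting.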

Protein Precipitation
Protein precipitation was performed using the pyrogallol red molybdate (PRM) method [73]. Briefly, PRM reagent was added to the sample in a ratio of 1:4. The samples were mixed and incubated at 25 °C for 25 minutes. Proteins were pelleted at 3800 × g for 30 minutes and dissolved in 1 ml of ddH2O. After addition of 250 µl of PRM reagent, incubation and pelleting were repeated one more time. The final pellet was resuspended in 25 µl of Laemmli buffer containing 0.5% (v/v) β-mercaptoethanol, incubated for 5 minutes in boiling water, and stored overnight at -20 °C.

SDS-PAGE and Immunoblot Analysis
SDS-PAGE on a 12% polyacrylamide gel and subsequent transfer to a nitrocellulose membrane (Protran, Whatman GmbH, cat. 10401396) was performed according to standard protocols.

Mass Spectrometry Analysis and Protein Identification
Protein samples were boiled for 5 minutes and centrifuged for 1 minute at 16,100 × g at room temperature to pellet insoluble material. The samples were separated by 1D-SDS-PAGE using precast 12% Tris-Glycine gels (Invitrogen, cat. IM6000). 4 to 8 µl of supernatant were loaded, depending on the estimated amount of protein in the respective sample determined in a preceding test run. After staining with Roti Blue (CARL ROTH, cat. A152), each gel lane was cut into 21 slices. In-gel digestion of proteins using trypsin and extraction of peptides was performed according to standard protocols. Samples were analyzed on a LTQ-Orbitrap XL mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) coupled to an Eksigent-Nano-HPLC system (Eksigent Technologies, Dublin, CA, USA). A detailed description of sample preparation and mass spectrometry analysis can be found in Text S1.
The raw-files from the mass spectrometer were converted into Mascot generic files (mgf) with Mascot Distiller software 2.4.2.0 (Matrix Science Ltd., London, UK). The peak lists were searched using Mascot Server 2.3 against the G. lamblia database (http://tinyurl.com/37z5zqp) with a concatenated decoy database supplemented with contaminants, The Arabidopsis Information Resource (TAIR9) protein database and the Swissprot database to increase the database's size. The final database included 79141 entries. The identification results were loaded into Scaffold 3.0 (Proteome Software, Portland, US) and filtered for a minimal Mascot score of 20 for peptide probability, a protein probability greater than 80%, and a minimum of 2 unique peptides per protein. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://www.proteomexchange.org) via the PRIDE partner repository [74] with the dataset identifier PXD000694.
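The three acceptance criteria applied to the identification results can be expressed as a simple filter. The Python sketch below is only an illustration of those thresholds: it does not reproduce how Scaffold computes or combines probabilities, and the `ProteinHit` structure, its field names, and the example entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    """Hypothetical container for one identified protein."""
    protein_id: str
    protein_probability: float  # fraction between 0 and 1
    peptide_scores: dict        # unique peptide sequence -> Mascot score


def passes_filters(hit: ProteinHit,
                   min_peptide_score: float = 20.0,
                   min_protein_probability: float = 0.80,
                   min_unique_peptides: int = 2) -> bool:
    """Apply the thresholds stated in the text to a single protein hit."""
    # Count unique peptides that reach the minimal Mascot score.
    accepted = [seq for seq, score in hit.peptide_scores.items()
                if score >= min_peptide_score]
    # Protein probability must be strictly greater than 80%.
    return (hit.protein_probability > min_protein_probability
            and len(accepted) >= min_unique_peptides)


# Made-up example entries (not data from this study).
hit_a = ProteinHit("protein_A", 0.95, {"PEPTIDEA": 35.0, "PEPTIDEB": 22.0})
hit_b = ProteinHit("protein_B", 0.95, {"PEPTIDEA": 35.0, "PEPTIDEB": 12.0})
hit_c = ProteinHit("protein_C", 0.80, {"PEPTIDEA": 35.0, "PEPTIDEB": 22.0})
kept = [h.protein_id for h in (hit_a, hit_b, hit_c) if passes_filters(h)]
```

Only `protein_A` is kept here: `protein_B` has a single peptide above the score cutoff, and `protein_C` does not exceed the 80% protein probability.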

In silico Removal and Functional Annotation Clustering of the Intersection Dataset
The final ESV and PV organelle-specific datasets were compiled by in silico identification and removal of the contaminating proteins. Briefly, the ESV-derived and PV-derived mass spectrometry (MS)-datasets from the three independent experiments were intersected separately. Proteins detected in both ESV and PV MS-datasets were considered as "contaminants" and removed, thus generating three independent subtractive lists enriched for putative ESV-specific hits and PV-specific hits, respectively. To enhance the stringency for detection of putative organelle-specific candidates, we accepted only ESV candidates that were detected in at least two more ESV MS-datasets than PV MS-datasets, and vice-versa. A detailed description of the procedure is provided in Figure S2. The contaminating proteins defined by the data intersection were evaluated and clustered into functional groups using the DAVID bioinformatics tool (http://david.abcc.ncifcrf.gov/home.jsp) [34]. Since analysis in DAVID is restricted to 3000 genes, clustering of the Giardia genome as a control was performed by the generation of 10 independent lists each containing 3000 randomly selected Giardia genes.
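The subtractive filtering described above can be sketched in a few lines of Python. This is an illustration of the stated rules, not the procedure as actually implemented: the function name, its parameters, and the single-letter protein identifiers are hypothetical, and each replicate dataset is assumed to be reducible to a set of protein identifiers.

```python
from collections import Counter


def subtractive_candidates(target_runs, other_runs,
                           min_lists=2, min_excess=2):
    """Return identifiers enriched in target_runs relative to other_runs.

    target_runs, other_runs: lists of sets, one set per replicate,
    paired by position (replicate 1 with replicate 1, and so on).
    """
    # Step 1: per-replicate subtraction of proteins found in both fractions.
    subtractive = [t - o for t, o in zip(target_runs, other_runs)]
    # Step 2: count in how many subtractive lists each protein occurs.
    list_hits = Counter(p for s in subtractive for p in s)
    # Step 3: count raw detections across all replicates of each fraction.
    n_target = Counter(p for run in target_runs for p in run)
    n_other = Counter(p for run in other_runs for p in run)
    # Keep proteins present in enough subtractive lists AND detected in at
    # least `min_excess` more target datasets than other datasets.
    return {p for p, n in list_hits.items()
            if n >= min_lists and n_target[p] - n_other[p] >= min_excess}


# Toy data (hypothetical identifiers): "C" is a shared contaminant.
esv = [{"X", "Y", "Z", "C"}, {"X", "Y", "Z", "C"}, {"Y", "C"}]
pv = [{"Y", "C", "P"}, {"C", "P"}, {"Z", "C", "P"}]
esv_candidates = subtractive_candidates(esv, pv)
pv_candidates = subtractive_candidates(pv, esv)
```

With this toy input, "X" (two ESV datasets, no PV dataset) and "Y" (three ESV datasets, one PV dataset) are retained as ESV candidates, whereas "Z" (two ESV datasets, one PV dataset) is rejected because it exceeds the PV count by only one.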

In silico Analysis Tools
Analysis of primary structure and domain architecture of ESV and PV candidates (i.e., manual annotation) was performed using the following tools and databases: PSORTII (http://psort.hgc.jp/form2.html) for prediction of subcellular localization, TMHMM (http://www.cbs.dtu.dk/services/TMHMM/) for prediction of transmembrane helices, SMART (http://smart.embl-heidelberg.de/) for prediction of patterns and functional domains, pBLAST for protein homology detection (protein blast by NCBI, http://blast.ncbi.nlm.nih.gov/Blast.cgi), HHPred (http://toolkit.tuebingen.mpg.de/hhpred) for protein homology detection based on Hidden Markov Model (HMM-HMM) comparison, and the Giardia genome database (http://giardiadb.org/giardiadb/) for changes in mRNA expression during the Giardia life cycle. For functional domains predicted by SMART we used an e-value of 10e-5 as cutoff, and for protein homologies predicted by pBLAST we accepted alignment scores above 80. Alignment scores between 50 and 80 were accepted only when the pBLAST predictions were consistent with those of HHPred. The latter was used to make pBLAST more robust; only hits with a probability above 95% were accepted.
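The acceptance rule for homology predictions can be written as a small decision function. The Python sketch below is one reading of the rule, under two assumptions that the text does not spell out: the boundaries of the 50 to 80 range are treated as inclusive, and "consistent" is represented by a boolean flag set by the annotator. The function and parameter names are hypothetical.

```python
from typing import Optional


def accept_homology(blast_score: float,
                    hhpred_probability: Optional[float] = None,
                    hhpred_agrees: bool = False) -> bool:
    """Decide whether a pBLAST homology prediction is accepted.

    blast_score: pBLAST alignment score.
    hhpred_probability: HHPred probability in percent, if available.
    hhpred_agrees: True if the HHPred prediction is consistent with pBLAST.
    """
    # An HHPred hit only counts if its probability is above 95%.
    hhpred_ok = (hhpred_probability is not None
                 and hhpred_probability > 95.0
                 and hhpred_agrees)
    if blast_score > 80:
        return True        # accepted on the pBLAST score alone
    if 50 <= blast_score <= 80:
        return hhpred_ok   # accepted only with consistent HHPred support
    return False           # scores below 50 are not accepted
```

For example, `accept_homology(65, 98.0, True)` returns `True`, while the same alignment score without HHPred support, or with an HHPred probability of 90%, is rejected.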

Expression Constructs and Transfection
For cloning of C-terminally HA-tagged proteins in Giardia, a vector PAC-CHA with additional restriction sites was designed on the basis of the previously described vector pPacV-Integ [26]. Additional restriction sites were inserted via oligonucleotide primers. A detailed vector map can be found in Figure S4. For each gene of interest two expression vectors were constructed, one in which expression of the gene of interest is driven by its own promoter (pendo), and another in which the gene of interest is under the control of the inducible cyst wall protein 1 promoter (pCWP1). GenBank accession numbers and a list of primers used for cloning can be found in Tables S2 and S3, respectively. For transfection, 15 µg of plasmid DNA linearized with SwaI was electroporated (Bio-Rad Gene Pulser, 350 V, 960 µF, 800 Ohm). The expression vector is targeted to the Giardia lamblia triose phosphate isomerase (Gl-TPI) locus by homologous recombination [75]; stable transfectants were selected with the antibiotic puromycin (Sigma, cat. 7699111) at a concentration of 77 µM for 5 days. For episomal maintenance, circular plasmid DNA was electroporated and selected with puromycin.

Transmission Electron Microscopy
Transmission electron microscopy and sample preparation were performed as described previously [29].

Supporting Information
Figure S1 Workflow. CWP3-GFP expressing cells at 13 hours p.i. (A) and wild type trophozoites after endocytic uptake of the fluid phase dye Dextran-AlexaFluor-647 (B) were disrupted by sonication and passed through a 5 µm filter. The cleared microsome fractions were mixed (C) and organelles were simultaneously enriched by flow cytometry-assisted organelle sorting (FAOS) (D). Sample preparation and organelle sorting were performed in biological triplicates. Protein precipitates of organelle-enriched fractions were separated by 1D-SDS-PAGE and analyzed by mass spectrometry (MS) (E), resulting in three ESV and three PV mass spectrometry datasets (F). Contaminating proteins were identified by intersecting the ESV and PV MS datasets (G). A detailed description of the intersection can be found in Figure S2. In silico data filtration, i.e. removal of the data intersection (H), revealed ESV organelle-specific (J) and PV organelle-specific (K) datasets. (TIF)
Figure S2 Generation of the MS data intersection. A) Mass spectrometry datasets of ESV-enriched (E1, E2, E3) and PV-enriched (P1, P2, P3) fractions of each replicate were intersected separately (top). The numbers indicate the proteins detected by mass spectrometry. Removal of the intersection revealed proteins exclusively detected in ESV-enriched fractions (middle, left) or PV-enriched fractions (middle, right). From these lists, only proteins occurring in at least two lists were accepted (bottom, blue). The proteins were further analyzed according to their distribution pattern in the six organelle-enriched fractions (B). B) Schematic representation of the protein distribution pattern in ESV- and PV-enriched fractions. ESV candidates (left): proteins of type X were detected exclusively and in at least two of three ESV fractions, proteins of type Y were detected in all ESV fractions and in one PV fraction, and proteins of type Z were detected in only two ESV fractions and in one PV fraction. The same applies vice versa to PV candidates (right). Type Z proteins were removed, resulting in 72 ESV and 82 PV candidate proteins. The respective protein numbers are indicated in brackets. (TIF)
Figure S3 Conserved short-chain dehydrogenase (SDH) motifs in Gl8382, Gl7982 and human GALE. Protein sequences of G. lamblia Gl7982 (cytoplasmic GALE, [53]), G. lamblia Gl8382 (putative ER-GALE), and the human GALE (hGALE) were analyzed manually. All conserved sequences required for hGALE function [52] are present in both Giardia GALEs and listed in the table. A conserved PG motif, which is required for the direction of the reaction, is present only in the Giardia ER-GALE. An N-terminal integral membrane domain in the ER-GALE shifts the conserved motif positions by about 40 amino acids towards the C-terminus, compared to the cytoplasmic GALE and hGALE.
Table S2 ESV and PV candidate list with additional information. For each of the 72 ESV and 82 PV candidates, the following information is provided: protein category (column B), GeneID (column C) and product description (column E) according to the G. lamblia genome database, NCBI reference number (column D), manual re-annotation (column F) and the prediction tools it is based on (column G), number of transmembrane domains (column H), signal peptide (column I), significant stage-specific up-regulation of transcription (column J), localization of HA-tagged variants determined in this study (column K), and literature information (column L). TMD: Transmembrane domains; SP: Signal peptide; ER: Endoplasmic reticulum; ESV: Encystation-specific vesicles; PV: Peripheral vesicles; CYT: Cytosol; PM: Plasma membrane; asterisk: prediction tools return different results. (XLSX)
Text S1 Detailed description of mass spectrometry analysis. Detailed description of SDS-PAGE, sample preparation, mass spectrometry analysis, database search and protein identification. (DOC)
Text S2 DAVID functional annotation clustering of data intersection. Functional annotation clustering of the data intersection using the DAVID bioinformatics tool (http://david.abcc.ncifcrf.gov/home.jsp). For each cluster, the enrichment score is given at the top, and functional groups (left) and protein counts (right) within the respective cluster are listed. The 1059 putative contaminants of the data intersection are categorized into 56 annotation clusters. (PDF)
Text S3 DAVID functional annotation clustering of the G. lamblia genome. Functional annotation clustering of the G. lamblia genome using the DAVID bioinformatics tool (http://david.abcc.ncifcrf.gov/home.jsp). For each cluster, the enrichment score is given at the top, and functional groups (left) and protein counts (right) within the respective cluster are listed. Clustering of all predicted proteins in the G. lamblia genome (5150 validated genes) using 10 random gene lists containing 39000 genes each. (PDF)
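
The in silico filtration summarized in the captions of Figures S1 and S2 can be illustrated with a minimal sketch based on set operations. This is not the pipeline used in the study. It assumes that replicate i of one organelle fraction is intersected with replicate i of the other, and it uses a hypothetical function name (`organelle_candidates`) and invented protein identifiers purely for illustration.

```python
from collections import Counter


def organelle_candidates(own, other):
    """Return {protein: type} for the candidates of one organelle.

    own, other: lists of three sets of protein IDs (one set per replicate)
    for the organelle of interest and for the other organelle.
    Pairing replicate i of `own` with replicate i of `other` is an
    assumption made for this sketch.
    """
    # Per replicate, remove the intersection with the paired fraction.
    exclusive = [o - x for o, x in zip(own, other)]
    # Accept proteins occurring in at least two of the three exclusive lists.
    hits = Counter(p for s in exclusive for p in s)
    accepted = {p for p, n in hits.items() if n >= 2}

    candidates = {}
    for p in accepted:
        n_own = sum(p in s for s in own)      # fractions of this organelle
        n_other = sum(p in s for s in other)  # fractions of the other one
        # An accepted protein is absent from the other organelle in at
        # least two replicates, so n_other is 0 or 1 here.
        if n_other == 0:
            kind = "X"   # exclusive, detected in >= 2 own fractions
        elif n_own == 3:
            kind = "Y"   # all own fractions plus one other fraction
        else:
            kind = "Z"   # two own fractions plus one other fraction
        if kind != "Z":  # type Z proteins are removed
            candidates[p] = kind
    return candidates


# Hypothetical toy data: three replicates per organelle.
esv = [{"a", "b", "c", "k"}, {"a", "b", "c", "k"}, {"a", "b", "k"}]
pv = [{"b", "k", "p", "q"}, {"k", "p", "q"}, {"c", "k", "p"}]

esv_result = organelle_candidates(esv, pv)
pv_result = organelle_candidates(pv, esv)
```

In this toy example, protein "k" is detected in every fraction and is discarded as a contaminant, while "c" follows the type Z pattern and is removed. Calling the same function with the arguments swapped yields the PV candidates, mirroring the "vice versa" rule in the caption.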