Global Gene Expression Profiling through the Complete Life Cycle of Trypanosoma vivax

The parasitic flagellate Trypanosoma vivax is a cause of animal trypanosomiasis across Africa and South America. The parasite has a digenetic life cycle, passing between mammalian hosts and insect vectors, and a series of developmental forms adapted to each life cycle stage. Each point in the life cycle presents radically different challenges to parasite metabolism and physiology and distinct host interactions requiring remodeling of the parasite cell surface. Transcriptomic and proteomic studies of the related parasites T. brucei and T. congolense have shown how gene expression is regulated during their development. New methods for in vitro culture of the T. vivax insect stages have allowed us to describe global gene expression throughout the complete T. vivax life cycle for the first time. We combined transcriptomic and proteomic analysis of each life stage, using RNA-seq and mass spectrometry respectively, to identify genes with patterns of preferential transcription or expression. While T. vivax conforms to a pattern of highly conserved gene expression found in other African trypanosomes (e.g. developmental regulation of energy metabolism, restricted expression of a dominant variant antigen, and expression of ‘Fam50’ proteins in the insect mouthparts), we identified significant differences in gene expression affecting metabolism in the fly and a suite of T. vivax-specific genes with predicted cell-surface expression that are preferentially expressed in the mammal (‘Fam29, 30, 42’) or the vector (‘Fam34, 35, 43’). T. vivax differs significantly from other African trypanosomes in the developmentally regulated proteins likely to be expressed on its cell surface and thus in the structure of the host-parasite interface. These unique features may yet explain the species differences in life cycle and could, in the form of bloodstream-stage proteins that do not undergo antigenic variation, provide targets for therapy.


Introduction
African trypanosomes are unicellular vector-borne hemoparasites of humans, domestic livestock and wild animals. They cause African trypanosomiasis, an endemic disease of sub-Saharan Africa otherwise known as sleeping sickness in humans and nagana in animals, and are transmitted between vertebrate hosts by the bite of tsetse flies (Glossina spp.). This endemic disease causes considerable morbidity in livestock herds and associated losses in animal productivity. The threat of Animal African trypanosomiasis in tsetse-infested areas also prevents effective exploitation of available pasture, thereby impeding economic development in the world's poorest nations.
There are several species of African trypanosome that vary in life cycle, host range and pathology. Trypanosoma brucei is predominantly an animal pathogen that has evolved the ability to infect humans on multiple occasions [1], while T. congolense and T. vivax are exclusively animal pathogens. During their life cycles, T. brucei and T. congolense exist as procyclic forms in the mid-gut of the tsetse fly before migrating into the salivary glands and proventriculus respectively, where they develop into epimastigotes and then metacyclic trypomastigotes that are able to infect vertebrates (see Fig 1). In contrast, T. vivax lacks a procyclic stage in the insect mid-gut and has no complex migration within the insect; rather, T. vivax develops directly into epimastigote forms within the insect proboscis [2] (Fig 1). This difference might explain why T. vivax can be transmitted by other kinds of biting insect [3][4] and has therefore spread beyond the sub-Saharan distribution of the tsetse fly into northern Africa and South America [5][6].
There are compelling reasons for supposing that gene expression in T. vivax will be different to T. brucei in important ways, not least due to differences in life cycle development (Fig 1), but also because the T. vivax genome contains substantially different repertoires of VSG and BARP-like genes (and no procyclin at all), as well as numerous gene families that appear to be unique [26]. As in vitro cultivation of insect stages has not previously been possible, gene expression in T. vivax has only been analyzed in the bloodstream form, and then only through transcriptomic analysis [14]. Moreover, given that gene regulation is achieved largely through post-transcriptional modifications in trypanosomes (reviewed in [27]), differences between transcript and peptide abundances across the life cycle are expected. We recently established in vitro cultures of the insect stages of T. vivax [28], and so a comparison of gene expression across African trypanosome species is now possible.
Fig 1 (caption fragment): (2) migration of parasites to the insect mid-gut with differentiation into procyclic forms (T. vivax lacks this stage); (3) migration anteriorly to the proboscis (T. vivax), proventriculus (T. congolense) or salivary gland (T. brucei) and differentiation into epimastigote forms; and (4) differentiation into metacyclic forms and inoculation into the vertebrate host upon insect feeding.
Using transcriptome sequencing and proteomics, we have analyzed differences in gene expression between T. vivax epimastigote, metacyclic and bloodstream forms. Our results show that the numerous T. vivax-specific genes predicted to function on the parasite cell surface are transcribed and often developmentally regulated. Genome-wide patterns of developmental regulation are conserved across African trypanosome species, with some notable exceptions concerning pyruvate metabolism in T. vivax, which might indicate an important species difference in energy metabolism. Comparative genomics suggests that T. vivax differs quite considerably from the model T. brucei; by illuminating the expression of distinctive features in the T. vivax genome, this study moves us closer to understanding their phenotypic effects.

Ethics statement
All mice were housed in the Institut Pasteur animal care facilities in compliance with European animal welfare regulations (European Convention for the Protection of Vertebrate Animals used for Experimental and other Scientific Purposes, CETS No. 123). Institut Pasteur is a member of Committee #1 of the Comité Régional d'Ethique pour l'Expérimentation Animale (CREEA), Ile de France. Animal housing conditions and the protocols used in the work described herein were approved by the "Direction des Transports et de la Protection du Public, Sous-Direction de la Protection Sanitaire et de l'Environnement, Police Sanitaire des Animaux" (#B 75-15-28), in accordance with the Ethics Charter of animal experimentation that includes appropriate procedures to minimize pain and animal suffering. Authorization (to PM) to perform experiments on vertebrate animals is granted by license #75-846 issued by the Paris Department of Veterinary Services, DDSV.

Cell culture
Trypanosoma (Duttonella) vivax IL 1392 was originally derived from the Zaria Y486 Nigerian isolate. Bloodstream form parasites were maintained in vivo by continuous passage in mice, as previously described [29]. Once parasitemia reached at least 5 × 10⁸ parasites per ml, blood was collected by cardiac puncture onto heparin (2500 IU/kg) and was then diluted 1:10 (v/v) with PBS 0.5% glucose to 5 × 10⁷ parasites per ml. Parasites were separated from red blood cells by differential centrifugation using a swing-out rotor (Jouan GR412, Fisher Bioblock Scientific, Strasbourg, France). Diluted blood was processed by one round of centrifugation (5 min at 200 g) and the supernatant withdrawn with a pipette without disturbing the red blood cell layer and the thin interface containing the white blood cells. The parasite-enriched suspension was submitted to a second round of centrifugation (5 min at 200 g) to eliminate all residual cells. The supernatant was then centrifuged for 10 min at 1800 g, and bloodstream form-containing pellets devoid of host cells were submitted to two further PBS washes under the same centrifugation conditions. Bloodstream form-containing pellets were further treated for RNA or protein extractions.
T. vivax epimastigote cultures have been previously described [28]. Briefly, bloodstream forms purified as described above from infected mice differentiated into epimastigotes in TV3 media: IMDM 50%, DMEM (without glucose) 10% heat-inactivated fetal calf serum (FBS, MP Biomedicals or Invitrogen) and/or 10% heat-inactivated goat serum (GS, Invitrogen), 0.03 mM bathocuproinedisulfonic acid, 0.45 mM L-cysteine, 0.2 mM hypoxanthine, 0.14 mM β-mercaptoethanol, 4 mM L-proline, 0.05 mM thymidine, and 25 mM HEPES pH 7.4. All supplements were obtained from Sigma Aldrich except HEPES (Invitrogen, Cergy Pontoise). Epimastigote cultures were maintained in vitro by serial passages. Epimastigotes attached to the surface of the culture flask formed micro-colonies and covered the entire surface after two weeks; the number of cells in the supernatant increased proportionally to the density of the adherent cell layer. Adherent epimastigotes were recovered from the flask by scraping and washed three times with PBS. As previously described, metacyclic forms are produced during in vitro growth and are found in the cell culture supernatant [28]. Metacyclic forms were isolated from the cell culture using an approach derived from the "bovine plasma aggregation method" [30]. Supernatant from a dense culture (14 days) was removed from the flask, 30% non-inactivated goat serum was added to the cells, and the mixture was incubated at 27°C for 30 min. During the incubation period, epimastigotes aggregate into cell clumps, while metacyclic forms continue to swim freely. The metacyclic forms were then separated from the epimastigote clumps by passing the trypanosome suspension through a 5 μm pore size filter (Millipore Cat. Bedford, MA, USA). The metacyclic forms were then concentrated and washed by centrifugation at 750 g for 15 min in 14 ml conical centrifuge tubes, and RNA or protein was prepared from the resultant cell pellets.

Sample preparation for RNA-seq
Total RNA was isolated using an RNeasy Mini Kit (Qiagen, Courtaboeuf, France) in accordance with the manufacturer's instructions. RNA purity and concentration were evaluated by spectrophotometry using NanoDrop ND-2000 (ThermoFisher). RNA quality and the relative contributions of total and small RNA were assessed by the Agilent 2100 Bioanalyzer microfluidics-based platform (Agilent Technologies, Santa Clara, USA). Four biological replicates were prepared for bloodstream form and metacyclic cells each. Five replicates were produced for epimastigote cells.

RNA sequencing
For each replicate, poly-adenylated RNA (mRNA) was purified from total RNA using an oligo-dT magnetic bead pull-down, using TruSeq RNA Sample Prep v2 kits (Illumina). The mRNA was then fragmented using metal ion-catalyzed hydrolysis. A random-primed cDNA library was synthesized and double-strand cDNA was used as the input to a standard Illumina library preparation, with a fragment size of 400 bp. The libraries were amplified with 10 cycles of PCR using KAPA Hifi Polymerase. Samples were quantified and pooled based on a post-PCR Agilent Bioanalyzer, followed by size-selection using the LabChip XT Caliper. The multiplexed library was sequenced on the Illumina HiSeq 2000 with forward and reverse primers, according to the manufacturer's standard protocol, resulting in 100-nucleotide paired-end reads. Sequence data were analyzed and quality controlled, and individual indexed library BAM files were created.

Transcriptomic data analysis
Paired-end RNA-seq data were mapped to the T. vivax Y486 reference strain [11] (downloaded from TriTrypDB release 6.0) using Bowtie2 [31] under the default parameters and within the Galaxy bioinformatics platform [32]. Transcript abundance for each replicate was estimated across the genome using Cufflinks [33] and measured in Fragments Per Kilobase of transcript per Million mapped fragments (FPKM). The option for quartile normalization within Galaxy was applied to maximize our ability to detect preferential expression of low abundance transcripts against the background of highly abundant species. The option for bias detection and correction was enforced. The option for multi-read correction was applied because some of our genes of interest are multi-copy and may map to multiple locations. Fold change in transcript abundance, and significance of differential expression, was estimated using Cuffdiff [33] for three pairwise comparisons of T. vivax life stages, combining all replicates in each case. Cuffdiff applies the Benjamini-Hochberg correction for multiple testing when assessing the significance of fold changes. To ensure accurate assessment of differential expression, transcript abundance was corroborated using a second method, edgeR [34]. Fold changes in transcript abundance returned by Cufflinks and edgeR were highly congruent when comparing life stages (r² = 0.89-0.91). Significant differences in transcript expression were defined as at least 2-fold enrichment between conditions and q < 0.05, where q is the p value corrected for false discovery rate (FDR).
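The quantities used in this section can be illustrated with a short sketch. It is not a reimplementation of Cufflinks or Cuffdiff (their abundance models and test statistics are not reproduced); it only shows the standard FPKM formula, a textbook Benjamini-Hochberg adjustment, and the significance rule stated above (at least 2-fold enrichment and q < 0.05). All function names are hypothetical, and the handling of zero abundance is an assumption made for illustration.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def is_significant(abundance_a, abundance_b, q, min_fold=2.0, max_q=0.05):
    """True when stage A is at least min_fold higher than stage B and q < max_q."""
    if abundance_b <= 0:
        # Assumption: an undefined fold change is treated conservatively.
        return False
    return abundance_a / abundance_b >= min_fold and q < max_q
```

For example, under these definitions a transcript at 300 FPKM in one stage and 100 FPKM in another would pass the fold-change criterion but would still be rejected if its q-value were 0.2.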

Sample preparation for proteomics
Protein from cell lysates was dispensed into low protein-binding microcentrifuge tubes (Sarstedt, Leicester, UK) and made up to 160 μl by addition of 25 mM ammonium bicarbonate. The proteins were denatured using 10 μl of 1% (w/v) RapiGest (Waters MS Technologies, Manchester, UK) in 25 mM ammonium bicarbonate followed by three cycles of freeze-thaw, and two cycles of 10 min sonication in a water bath. The sample was then incubated at 80°C for 10 min and reduced with 3 mM dithiothreitol (Sigma-Aldrich, Dorset, UK) at 60°C for 10 min then alkylated with 9 mM iodoacetamide (Sigma-Aldrich, Dorset, UK) at room temperature for 30 min in the dark. Proteomic grade trypsin (Sigma-Aldrich, Dorset, UK) was added at a protein:trypsin ratio of 50:1 and samples incubated at 37°C overnight. Three biological replicates were prepared for each cell type.

LC-MS/MS analysis
Peptide mixtures were analyzed by on-line nanoflow liquid chromatography using the nanoACQUITY-nLC system (Waters MS Technologies, Manchester, UK) coupled to an LTQ-Orbitrap Velos (ThermoFisher Scientific, Bremen, Germany) mass spectrometer equipped with the manufacturer's nanospray ion source. The analytical column (nanoACQUITY UPLC BEH130 C18 15 cm x 75 μm, 1.7 μm capillary column) was maintained at 35°C and a flow-rate of 300 nl/min. The gradient consisted of 3-40% acetonitrile in 0.1% formic acid for 90 min then a ramp of 40-85% acetonitrile in 0.1% formic acid for 3 min. Full scan MS spectra (m/z range 300-2000) were acquired by the Orbitrap at a resolution of 30,000. Analysis was performed in data-dependent mode. The top 20 most intense ions from MS1 scan (full MS) were selected for tandem MS by collision induced dissociation (CID) and all product spectra were acquired in the LTQ ion trap. Ion trap and Orbitrap maximal injection times were set to 50 ms and 500 ms, respectively.

Proteomic data analysis
Thermo RAW files were imported into Progenesis LC-MS (version 4.1, Nonlinear Dynamics, UK). Runs were time aligned using default settings and using an auto selected run as reference. Peaks were picked by the software and filtered to include only peaks with a charge state of between +2 and +6. Peptide intensities were normalized against the reference run by Progenesis LC-MS and these intensities were used to highlight differences in protein expression between control and treated samples with supporting statistical analysis (ANOVA and q-values) calculated by the Progenesis LC-MS software. Spectral data were transformed to mgf files with Progenesis LC-MS and exported for peptide identification using the Mascot (version 2.3.02, Matrix Science) search engine. Tandem MS data were searched against a custom database that contained common contaminants and the protein sequences predicted for the T. vivax reference genome (downloaded from TriTrypDB v-6.0). Search parameters were as follows: precursor mass tolerance set to 10 ppm and fragment mass tolerance set to 0.5 Da. One missed tryptic cleavage was permitted. Carbamidomethylation (cysteine) was set as a fixed modification and oxidation (methionine) set as a variable modification. Mascot search results were further processed using the machine learning algorithm Percolator. The false discovery rates were set at 1% and at least two unique peptides were required for reporting protein identifications. Protein abundance (iBAQ) was calculated as the sum of all the peak intensities (from Progenesis output) divided by the number of theoretically observable tryptic peptides [35]. Protein abundance was normalized by dividing the protein iBAQ value by the summed iBAQ values for that sample. The reported abundance is the mean of the biological replicates.
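The iBAQ calculation and normalization described above can be sketched as follows. This is a minimal illustration of the arithmetic only, assuming peak intensities and theoretical peptide counts have already been obtained; the function names and the protein identifiers in the usage note are hypothetical and are not part of the Progenesis or Mascot output.

```python
def ibaq(peak_intensities, n_theoretical_peptides):
    """Summed peak intensity divided by the number of theoretically
    observable tryptic peptides for one protein."""
    return sum(peak_intensities) / n_theoretical_peptides


def normalize_sample(ibaq_by_protein):
    """Divide each protein's iBAQ by the summed iBAQ of the sample."""
    total = sum(ibaq_by_protein.values())
    return {protein: value / total for protein, value in ibaq_by_protein.items()}


def mean_across_replicates(replicates):
    """Average normalized abundance per protein over replicate samples.

    Assumption: every replicate dict contains the same protein keys.
    """
    proteins = replicates[0].keys()
    n = len(replicates)
    return {p: sum(rep[p] for rep in replicates) / n for p in proteins}
```

As a usage example, a sample with iBAQ values of 100 for "protA" and 300 for "protB" would give normalized abundances of 0.25 and 0.75, which sum to one within each sample and are therefore comparable between replicates.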

Data accessibility
All cDNA sequence data are available from the European Nucleotide Archive (http://www.ebi.ac.uk/ena), accession number ERP001753. Details of the transcriptomic experiments are also available from the Array Express website (https://www.ebi.ac.uk/arrayexpress/), accession number E-ERAD-100. The mass spectrometry proteomics data have been deposited with the ProteomeXchange Consortium via the PRIDE partner repository (http://www.ebi.ac.uk/pride/archive/) with the dataset identifier PXD001617.

Estimation of transcript and peptide abundance
By exploiting new protocols for the in vitro cultivation of T. vivax epimastigote and metacyclic forms, we have produced comparative transcriptomic and proteomic data for the whole T. vivax life cycle using RNA-seq and LC-MS/MS approaches respectively. Transcripts were detected for 10116 T. vivax Y486 genes (85.1% of all genes); 8994 of these transcripts (88.9%) were observed with at least 10 FPKM. The abundance of each transcript, as estimated using Cufflinks [33], is described in S1 Table. The most abundant transcripts in the bloodstream form were derived from tubulins, diverse ribosomal proteins and VSG-like sequences TvY486_0009580 (17331 FPKM) and TvY486_0018880 (9669 FPKM), which are assumed to have been the active VSG at the time of sequencing. Besides these, abundant transcripts encoding named proteins include glyceraldehyde 3-phosphate dehydrogenase (TvY486_0603710; 1512 FPKM), a receptor-type adenylate cyclase (TvY486_0029610; 1210 FPKM), cathepsin B-like cysteine peptidase (TvY486_0600060; 794 FPKM), and an uncharacterized gene specific to T. vivax (TvY486_0900440; 1646 FPKM). The most abundant transcripts in the epimastigote and metacyclic cells encoded the same set of highly abundant tubulins and ribosomal proteins, but not the putative VSG, displaying instead an abundance of BARP-like proteins TvY486_0012620 (975 FPKM) and TvY486_1114940 (847 FPKM). Abundance estimates across our independent replicates were consistent, with strong positive correlations between replicates (ranging from 0.94 to 0.99) across all life stages (S1 Fig); and when fold change in transcript abundance is compared between life stages using edgeR, replicates cluster by stage, illustrating their consistency (S2 Fig).
Peptide abundance, as defined by quantitative analysis with MASCOT, is described in S2 Table. In total, 11099 peptides were counted, corresponding to 1952 proteins (16.3% of the predicted proteome). Of these, 1245 were sufficiently abundant to be quantified by iBAQ (i.e. two unique peptides were observed with an FDR of 0.01). Of these, 798 had a q value < 0.05, meaning that differential expression can be reliably inferred. The most abundant peptides in the bloodstream form were alpha and beta tubulin, various histones, putative VSG (TvY486_0009580/TvY486_0018880; i.e. coinciding with the most abundant VSG-like transcripts), and metabolic enzymes such as fructose-bisphosphate aldolase, enolase, glutamate dehydrogenase, arginine kinase, phosphoglycerate mutase, succinyl-coA:3-ketoacid-coenzyme A transferase and glycerol-3-phosphate dehydrogenase. The most abundant peptides in the epimastigote and metacyclic stages largely belonged to the same set of proteins, except that VSG were not observed and glycolytic enzymes were less abundant. As with transcript abundance, the proteome was consistent between independent replicates for each life stage, as illustrated in a principal component analysis in which replicates cluster tightly by stage (S3 Fig).
The degree to which relative abundance of transcripts and peptides concur throughout the life cycle is an important question with implications for regulation of gene expression, especially in trypanosomatids in which regulation is thought to be mostly post-transcriptional [27]. Correlations of transcript and peptide abundance across the genome (a-c) and for differentially expressed genes in each life stage (d-f) are shown in S4 Fig. These graphs show that the correlation is poor for all genes (r² between 0.22 and 0.36) but improved for genes with evidence of developmental regulation (r² between 0.38 and 0.65).
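A coefficient of determination of the kind reported here can be computed as the squared Pearson correlation between paired transcript and protein abundances. The sketch below is an illustration only: the log10 transformation is an assumption (the scale used for S4 Fig is not stated in the text), and the function names are hypothetical.

```python
import math


def log10_all(values):
    """Log-transform strictly positive abundances (assumed preprocessing step)."""
    return [math.log10(v) for v in values]


def pearson_r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)
```

In use, one list would hold the FPKM value of each gene and the other the normalized iBAQ value of its protein, restricted to genes observed in both datasets.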

Developmental regulation of transcripts
Differential expression is defined as significant where transcript abundance displays at least two-fold enrichment and where q < 0.05. We found that 11.2% (1137) of transcripts showed significant differential expression in one or more stage comparison; we refer to these as 'developmentally regulated' and they are listed in S3 Table. In bloodstream forms, 518 transcripts were significantly more abundant relative to epimastigotes, and 382 transcripts were significantly more abundant in bloodstream forms relative to metacyclics. The greatest enrichment in favor of bloodstream forms concerned the putative active VSG (TvY486_0018880; fold-change (FC) = 110.9); other large fold-changes that implicated named sequences concerned three receptor-type adenylate cyclases (TvY486_0026190, TvY486_0003180 and TvY486_0029610; FC = 49.8, 18.9 and 10.8 respectively), a glycerol-3-phosphate dehydrogenase (TvY486_0802930; FC = 9.4) and a phospholipase A1 (TvY486_0102170; FC = 15.6). Besides these instances, the majority of transcripts (83%) preferentially expressed in bloodstream forms encode hypothetical proteins. Among these are hypothetical proteins belonging to T. vivax-specific families that are included in a Cell-Surface Phylome (CSP) that we published previously [26] for gene families predicted to be expressed on the cell surfaces of the three principal African trypanosome species; i.e. Fam30 (e.g. TvY486_0003670; FC = 29.4), Fam28 (e.g. TvY486_0030920; FC = 27.5), Fam34 (e.g. TvY486_0009950; FC = 26.5) and Fam31 (e.g. TvY486_0000210; FC = 23.4). Yet another family of uncharacterized genes, unique to T. vivax but not included in the CSP presently, show greater differential expression in bloodstream forms than any other family except for VSG. This gene family occurs 25 times among transcripts up-regulated in bloodstream forms relative to epimastigotes (S3 Table) and provides four of the 20 largest fold-changes in favor of bloodstream forms (e.g. TvY486_0033680, FC = 38.1). 
A BLASTp analysis shows that this family has at least 44 members across the T. vivax genome but none of these paralogs were observed to be preferentially expressed in either epimastigote or metacyclic form.
In epimastigotes, we identified 393 transcripts that were developmentally regulated, 387 of which are significantly more abundant in epimastigotes relative to bloodstream forms, while 8 transcripts are significantly more abundant in epimastigotes relative to metacyclic forms (see S3 Table). The dearth of preferential expression in epimastigotes relative to metacyclics was only slightly relieved by analysis with edgeR, which reported 23 cases. Since it was necessary to grow epimastigote cultures to high density in order to achieve a high proportion of metacyclic cells, it is possible that the lack of significant differences between these cell types is due to the effects of high density on growth. As with bloodstream forms, most developmentally regulated transcripts encode hypothetical proteins (65.1%), including those with the greatest fold changes in expression, e.g. TvY486_1110640 (FC = 73.1). Transcripts preferentially expressed in epimastigotes also include further T. vivax-specific CSP gene families, namely Fam35 (11 paralogs; FC = 5.4-43.7) and Fam43 (three paralogs; FC = 35.0-53.8). However, these gene families were more abundant still in metacyclic forms (see below), indicating that their main focus of expression was not the epimastigote. Aside from these uncharacterized gene families, transcripts implicated in cellular respiration were also seen, for example, components of the electron transfer chain such as cytochrome c1 (TvY486_0801280; FC = 13.0), cytochrome c (TvY486_0804690; FC = 13.2) and cytochrome c oxidase subunits (FC between 4.3 and 20.9). Transcripts for multiple cation transporters (FC between 3.8 and 24.7) and a meiotic recombination protein DMC1 (TvY486_0904120; FC = 10.5) were also preferentially expressed.
In the final, metacyclic life stage, 357 transcripts were significantly more abundant relative to bloodstream forms, and these gave a very similar picture to the enriched transcripts in epimastigotes. A further 136 transcripts were significantly more abundant relative to epimastigotes (see S3 Table) including the T. vivax-specific, CSP families 34, 35 and 43 (see above), and other transcripts encoding DNA polymerase kappa (TvY486_1109280; FC = 2.8), an adenylate cyclase (TvY486_0029610; FC = 2.5), and various reverse transcriptases derived from SLACS elements (average FC = 6.8).
With respect to these observations, it should be noted that further analysis using edgeR produced very similar results to Cufflinks, with only 8.6% of the gene set displaying significant differential expression. Also, there was substantial overlap in the identities of developmentally regulated transcripts between comparisons; thus, of 518 transcripts significantly enriched in bloodstream forms relative to epimastigotes, 372 were also enriched relative to metacyclics; similarly, of 387 transcripts significantly enriched in epimastigotes relative to bloodstream forms, 285 were also enriched in metacyclics relative to bloodstream forms. It follows that, of the three life stages, the epimastigote and metacyclic transcriptomes were most alike.
We examined the developmentally regulated transcripts for Gene Ontology (GO) terms that were significantly enriched using a Fisher's exact test in BLAST2GO [36]. This confirmed that transcripts preferentially expressed in bloodstream forms are enriched for terms associated with glycolysis (GO:0006096; q = 3.38e-04) and glycosome (GO:0020015; p = 1.83e-03), while those preferentially expressed in epimastigotes are enriched for cytochrome-c oxidase activity (GO:0004129; p = 2.01e-13) and ATP synthesis coupled proton transport (GO:0015986; p = 1.43e-03). While suggestive of consistent differences in energy metabolism between life stages, differences in transcript abundance do not guarantee disparity in protein expression; thus, we sought to corroborate these observations with our proteomic data.
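The logic of a GO term enrichment test of this kind can be sketched with a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. This is a generic illustration and not the BLAST2GO implementation; the one-sided form, the function name and the counts in the usage note are assumptions.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided p-value for over-representation of a GO term.

    N: genes in the background set
    K: background genes annotated with the term
    n: genes in the test set (e.g. stage-enriched transcripts)
    k: test-set genes annotated with the term
    Returns P(X >= k) under the hypergeometric distribution.
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

With hypothetical counts of N = 10, K = 3, n = 4 and k = 2, the function returns one third, i.e. no evidence of enrichment; in a real analysis the p-values for all tested terms would then be corrected for multiple testing.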

Developmental regulation of proteins
A protein shows significant differential expression if a constituent peptide displays at least 2-fold enrichment and q < 0.05. Under these criteria, 595 or 30.5% of observed proteins (74.6% of quantifiable peptides) were developmentally regulated and these are listed in S4 Table. In the bloodstream form, 131 proteins were significantly more abundant relative to epimastigotes, while 63 proteins were significantly more abundant relative to metacyclics. Compared to differentially expressed transcripts, there is a smaller proportion of hypothetical proteins (29%), suggesting that our proteomic analysis has captured the most abundant components of cellular physiology that are relatively well understood. Accordingly, various structural proteins and diverse enzymes are listed, most notably components of the glycolytic pathway such as phosphoglycerate mutase (TvY486_0302920; FC = 6.6), fructose-bisphosphate aldolase (TvY486_1005670; FC = 3.5) and glycerol-3-phosphate dehydrogenase (TvY486_0802930; FC = 5.5). Analysis of functional terms associated with peptides preferentially expressed in bloodstream forms shows that glucose metabolism is significantly enriched (GO:0006006; FDR = 5.6e-05) relative to epimastigotes, while the citric acid cycle is significantly enriched (KEGG; FDR = 4.6e-04) relative to metacyclics.
The multi-copy, T. vivax-specific transcripts described above as being the most highly abundant family in bloodstream forms were also identified among our peptides. Four members of this family are preferentially expressed in bloodstream forms, one relative to epimastigotes (TvY486_0008730; FC = 15.9) and three more relative to metacyclics. None were expressed in either insect stage, further indicating that this is a novel and very prominent feature of bloodstream forms.
In the epimastigote form, 147 and 104 proteins were significantly more abundant relative to bloodstream forms and metacyclics respectively. GO terms associated with oxidation-reduction processes (GO:0055114; FDR = 1.1e-03) and amino acid metabolism (GO:0006520; FDR = 2.3e-02) were found to be significantly enriched. Peptides suggestive of oxidative phosphorylation, although no other elements of mitochondrial energy metabolism, were also significantly more abundant relative to metacyclics.
In the metacyclic stage, 62 peptides were significantly more abundant relative to bloodstream forms, and 89 peptides were significantly more abundant relative to epimastigotes. In both cases, the greatest fold changes pertain to hypothetical proteins encoded by members of CSP gene families (see below). Among the latter peptides, various proteins with roles in intracellular trafficking were implicated, for example, the vesicle formation protein Sec24C (TvY486_0300580; FC = 11.5; [37]) and signal recognition particle receptor (TvY486_1110560; FC = 4.9), as well as several dynein heavy chains (FC between 2.2 and 7.1). A Fisher's exact test shows that functions associated with intracellular transport (GO:0046907; FDR = 2.0e-02) are significantly enriched. The same test shows that purine ribonucleotide catabolism (GO:0009154; FDR = 2.1e-03), which refers here to the ATP-binding requirements of the same intracellular transport processes, is also significantly enriched.
It should be noted that, unlike the cohorts of stage-specific transcripts, there was no overlap in the membership of these various stage-specific peptide groups, i.e. there were no peptides significantly enriched in both epimastigotes and metacyclics relative to bloodstream forms.

Interspecies comparison of developmental regulation
Transcriptomic and proteomic data from diverse African trypanosomes indicate consistently that life stages differ with respect to primary energy metabolism. Previous studies of global gene expression T. brucei and T. congolense have shown developmental regulation of such genes, as well as those encoding the principal cell surface glycoproteins [12-13, 16-19, 21-22]. We would like to know how conserved such developmental regulation is through evolutionary time, and thus, how regulatory evolution might have contributed to phenotypic differences between species. Comparative genomics predicts that T. vivax lacks certain enigmatic components of T. brucei and T. congolense cell surfaces, such as the VSG-related transferrin receptor of bloodstream forms and procyclin, as well as some elements of Fam50 (see below) [26]. To relate global gene expression across the African trypanosomes, we compared our proteome with existing data sets for T. brucei [21][22] and T. congolense [13], calculating the fold change in abundance from the non-infective insect stage (i.e. epimastigote for T. vivax and procyclic form for T. brucei and T. congolense) to the bloodstream form, for each protein observed in all species (N = 128).  Subset a contains proteins that are preferentially expressed in the bloodstream form in all species; the GO term for glycolysis (GO: 0006096; FDR = 6.4e -06 ) is enriched among these proteins. This indicates that the use of substrate-level phosphorylation as the dominant process for ATP generation in the bloodstream form is a consistent feature of African trypanosomes. Subset b contains proteins with the opposite expression profile, i.e. preferentially expressed in the non-infective insect stage in all species. 
Analysis of GO terms associated with these proteins shows that proton-transporting ATP synthase activity (GO:0046933; FDR = 1.8e-03), ATP synthesis coupled proton transport (GO:0015986; FDR = 1.8e-03) and oxidation-reduction process (GO:0055114; FDR = 3.1e-03) are enriched. This is consistent with the widespread use of oxidative phosphorylation in the low-glucose environment of the insect vector to generate ATP via a proton-motive force across the mitochondrial membrane. Hence, at the broadest level, developmental regulation is conserved across all species, as would be predicted from their shared insect host. However, there are also obvious differences.
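The GO term enrichment values quoted above can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test for each term followed by Benjamini-Hochberg correction; the text does not state which test or correction was actually used, so both choices, and the function names, are illustrative assumptions only.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of seeing at least k annotated proteins in a subset.

    N = proteins in the background, K = background proteins carrying the GO
    term, n = proteins in the subset, k = subset proteins carrying the term.
    (Assumed test; not stated in the text.)
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one p-value would be computed per GO term with `hypergeom_sf`, and the full list passed to `benjamini_hochberg` to obtain the FDR for each term.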
Against this background of conserved developmental regulation, we are interested in genes that are regulated differently in T. vivax, and which might contribute to its unique phenotypes. Subset c contains proteins that are significantly more abundant in the insect stages of T. brucei and T. congolense than the vertebrate stage, but preferentially expressed in T. vivax bloodstream forms. In the larger dataset of Fig 3, this cohort is expanded to 714 proteins by excluding T. congolense (for which proteome coverage is lowest); the number of proteins falling into subset c is increased to 27 and these are listed in Table 1. The expression profile of these proteins in T. vivax is not simply a lack of regulation or low expression generally, since many are highly abundant. Ten of the proteins in Table 1 appear in the top 10% of our proteome when ranked by abundance. Analysis of the GO terms associated with these proteins shows enrichment for succinate-CoA ligase (GDP-forming) activity (GO:0004776; FDR = 1.8e-02), pyruvate dehydrogenase (acetyl-transferring) activity (GO:0004739; FDR = 1.8e-02) and glucose metabolic process (GO:0006006; FDR = 2.8e-02). Hence, comparison of differentially expressed genes across African trypanosomes shows that much is conserved at the regulatory level but that important differences exist, even in the most essential physiology.
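The assignment of proteins to subsets a, b and c can be sketched as follows. This is a minimal illustration, not the procedure used in the analysis: it assumes that each protein has a log2 fold change from the insect stage to the bloodstream form in every species, and it substitutes a simple two-fold threshold for the significance criteria, which are not restated here. The function names and species keys are hypothetical.

```python
from math import log2

def log2_fc(bloodstream, insect):
    """log2 fold change in abundance from the insect stage to the bloodstream form."""
    return log2(bloodstream / insect)

def assign_subset(fc_by_species, threshold=1.0):
    """Classify one protein from its per-species log2 fold changes.

    fc_by_species maps a species key (e.g. 'vivax', 'brucei', 'congolense')
    to a log2 fold change. The default threshold (a two-fold change) is an
    illustrative assumption standing in for a significance test.
    """
    species = set(fc_by_species)
    if not species:
        return None
    up = {s for s, fc in fc_by_species.items() if fc >= threshold}
    down = {s for s, fc in fc_by_species.items() if fc <= -threshold}
    if up == species:
        return "a"  # bloodstream-enriched in every species
    if down == species:
        return "b"  # insect-stage-enriched in every species
    if up == {"vivax"} and down == species - {"vivax"}:
        return "c"  # bloodstream-enriched in T. vivax only, reversed elsewhere
    return None  # no consistent pattern

# Hypothetical abundances for a single protein (bloodstream, insect stage).
example = {
    "vivax": log2_fc(8.0, 1.0),
    "brucei": log2_fc(1.0, 4.0),
    "congolense": log2_fc(1.0, 4.0),
}
subset = assign_subset(example)
```

Dropping the T. congolense key from the input mirrors the expanded two-species comparison described above, since the classification only considers the species that are supplied.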

Expression of T. vivax-specific, cell surface-expressed gene families
We have previously identified several gene families, known as Fam27-45, that are predicted to encode cell surface proteins and which, being unique to T. vivax, might distinguish the parasite from T. brucei and T. congolense [26]. Fam27-45 are among the most highly expressed genes and these data are extracted in Table 2. Many of the CSP families unique to T. vivax also appear to be developmentally regulated at the transcript level. For example, Fam27 (five paralogs), Fam35 (11 paralogs) and Fam43 (five paralogs) are preferentially transcribed in insect stages. Conversely, Fam29 (13 paralogs), Fam30 (38 paralogs) and Fam32 (seven paralogs) are preferentially transcribed in bloodstream forms. Indeed, rarely are transcripts belonging to one of the families found throughout the life cycle; Fam34 (25 paralogs) being one such case. Fig 4 summarizes the evidence for differential expression of Fam27-45 at the transcript and peptide level. In four cases (Fam33, 40, 41 and 45), both transcriptomic and proteomic evidence for expression is lacking and we conclude that these sequence families do not encode protein-coding genes (and so should be removed from the CSP). As expected, proteomic evidence for gene expression is not as prevalent as transcriptomic data, although it generally corroborates the latter when it is available.
The best supported cases for developmental regulation concern the metacyclic-specific expression of Fam34, 35 and 43. We observed 34 distinct Fam34 transcripts and 24 of these were differentially expressed; 12 in the metacyclic and another 12 in the bloodstream form. However, the proteomic evidence indicates more selective developmental regulation; of 11 Fam34 proteins that were observed, five were preferentially expressed and all in the metacyclic stage (FC between 2.2 and 10.8).
Of 17 distinct Fam35 transcripts, 11 were differentially expressed; all were significantly more abundant in the metacyclic stage (FC between 13.2-92.0). Proteomic data support this view; of the six Fam35 peptides observed, all were most abundant in the metacyclic stage and two significantly so (TvY486_0041300 (FC = 6.55) and TvY486_0039920 (FC = 22.19)).
We observed seven distinct Fam43 transcripts and of these five were differentially expressed, all most abundant in the metacyclic (FC between 19.7-86.3). All three Fam43 peptides that were observed were preferentially expressed in the metacyclic stage (FC between 19.8-30.9). Hence, these results indicate that the putative T. vivax-specific gene families are (mostly)

Expression of 'Fam50' genes
Fam50 is a CSP gene family that includes the BARP genes of T. brucei and the GARP and CESP genes of T. congolense, known to be preferentially expressed on their respective cell surfaces during the insect stages [25,38-40], as well as various, currently uncharacterized, genes that may also be transcribed preferentially during insect stages [41]. The genomic complement of Fam50 genes in T. vivax is smaller and less diverse than those of the other species, which may reflect the simpler existence of T. vivax in the tsetse fly [26]. Our transcriptomic data include all 17 T. vivax Fam50 genes, 13 of which are transcribed most abundantly in the insect stages (see S1 Table). Six transcripts are significantly more abundant in the epimastigote or metacyclic stage relative to bloodstream forms (Table 3a) and one of these was confirmed by the proteomic data (i.e. TvY486_0012620). In total, five proteins were detected and all were differentially expressed (Table 3b); four were most abundant in the insect (FC between 2.2 and 4.2). A single protein (corresponding to TvY486_0001140) was significantly more abundant in the bloodstream form. Thus, while transcriptomic data seem to be a poor predictor of differential expression of Fam50 proteins, perhaps suggesting the highly dynamic promotion and repression of Fam50 variants, there is good evidence for developmental regulation of both Fam50 transcripts and peptides, largely in preference for the insect stages and so consistent with observations in other species.

Expression of VSG-like genes
The bloodstream forms of African trypanosomes are defined partly by the expression of a VSG coat on the cell surface. In our analysis, we observed 89 distinct VSG transcripts with q < 0.05 (S1 Table), of which 61 were most abundant in bloodstream forms; however, most of these were observed at very low levels. There were 12 transcripts displaying significant preferential expression in the bloodstream stage (FC between 2.5-110.9; Table 4a). In our proteomic analysis we recorded nine distinct VSG sequences, of which three are represented by a single peptide and so unquantified (S2 Table). Of the remaining six (Table 4b), three were most abundant in bloodstream forms, including the two putative active VSG and a third sequence (TvY486_0000810, or identical paralog) that was expressed at a much lower level but still preferentially in bloodstream forms (FC = 5.7). A fourth VSG was expressed preferentially in the metacyclic stage (TvY486_0027560; FC = 2.8). Finally, two VSG were most abundant in the epimastigote; one of these (TvY486_0001860; FC = 4.3) was differentially expressed and the second nearly so (TvY486_0041140; FC = 2.1; q = 0.066). Notably, these low abundance VSG expressed in epimastigotes belong to a T. vivax-specific VSG-like family (Fam25), which is not seen in other African trypanosomes.

Discussion
We have produced transcriptomes and proteomes for three different developmental forms of T. vivax, and identified the transcripts and peptides that are significantly enriched in each. These data provide the first profile of global gene expression and developmental regulation throughout the complete T. vivax life cycle. The profile suggests a situation broadly similar to that already observed in T. brucei and T. congolense, though with significant distinctions, not least further evidence for developmental regulation of species-specific cell surface glycoproteins in both the vertebrate and insect stages of T. vivax. Being more sensitive, transcriptomic methods typically provide much greater coverage of the genome than do proteomic methods. This might be more pronounced for trypanosomatids since they constitutively express all genes within polycistronic transcripts and regulate protein expression post-transcriptionally [27]. Thus, 85.1% of T. vivax genes are represented in our transcriptome, but few transcripts are unique to a particular life stage and the proportion of differentially expressed transcripts is only 11.2%. By contrast, the proteome represents only 16.3% of all genes, but in 798 cases where differential expression could be assessed, 74.6% of peptides show significant differential expression and these are unique to one life stage. It may be that developmentally regulated proteins are also particularly abundant (certainly this is true for the components of stage-specific cell surface coats), and this would cause differentially expressed peptides to be overrepresented within the proteome. Previous proteomic studies for T. brucei have reported more proteins than we have found here for T. vivax (i.e. 3553 [21] and 3458 [22]) but a smaller proportion with significant differential expression, i.e. 24.8% and 39.2% respectively. A proteomic analysis of T. congolense identified 1291 proteins, of which 21.5% displayed significant differential expression [13].
Hence, it may be that greater sensitivity would reduce the proportion of cases showing differential expression if low abundance proteins are more likely to be constitutively expressed.
Previously, comparison of the T. vivax genome sequence with those of T. brucei and T. congolense demonstrated that there are more than 2000 genes that are only present in T. vivax [11]. Their specificity, and typically the absence of any recognizable protein domains, make these genes obscure. Nonetheless, they appear to be genuine, since we found a lack of protein-coding evidence in only a few cases. One defining feature is that they comprise multi-copy gene families (Fam27-45 of the CSP) that are thought to be expressed on the cell surface based on the presence of a putative signal peptide, transmembrane domain and/or glycosylphosphatidylinositol anchor in their predicted protein sequence [26]. Comparative genomics also showed that T. vivax lacked procyclin, the canonical cell surface glycoprotein of T. brucei and T. congolense insect stages [42]. This is consistent with T. vivax lacking a procyclic stage in the insect mid-gut and raises the question of what coats the T. vivax surface if not procyclin. Clearly, the abundant T. vivax-specific gene families offer plausible candidates for the role, but it could also be filled by Fam50, which has been shown to include various surface glycoproteins expressed during the insect stages of T. brucei and T. congolense [26].
Certainly, our data indicate that multiple Fam50 proteins are expressed in T. vivax and preferentially expressed in the epimastigote, while several transcripts (belonging to different loci) were significantly enriched in both epimastigotes and metacyclics. The fact that the transcripts and peptides are not derived from the same loci may suggest that different Fam50 genes became activated in the period between our preparation of RNA and protein. This is not the situation we observe for VSG, for which the identity of enriched transcripts and peptides largely matches, suggesting that regulation of Fam50 gene expression is highly dynamic with multiple isoforms being promoted and repressed over short intervals. The presence of transcripts in the metacyclic stage at levels comparable to the epimastigote levels may be an artefact (i.e. residual epimastigotes in the metacyclic preparation), since Fam50 peptides in metacyclics are sparse and comparable in abundance to bloodstream forms. In short, the expression of multiple Fam50 proteins in the T. vivax epimastigote supports the view derived from T. brucei and T. congolense that this is a conserved family of glycoproteins performing diverse roles in the insect stages of the life cycle. Given that BARP in T. brucei and CESP in T. congolense are cell-surface glycoproteins [25,40], Fam50 homologs are therefore probably a prominent component of the epimastigote cell surface in T. vivax. Although Fam50 is preferentially expressed in epimastigotes, other multi-copy families, including the various T. vivax-specific cases, seldom are. This study has confirmed that most T. vivax-specific gene families are expressed. In the cases of Fam33, 40, 41 and 45, which should now be discounted from the CSP, the apparent lack of transcription raises the question of what function these repetitive non-coding sequences might perform.
Three T. vivax-specific gene families (Fam34, 35 and 43) are very strongly enriched in the metacyclic stage, which is intriguing because in T. brucei and T. congolense the metacyclic coat is characterized by VSG. While metacyclic VSG are replaced by other VSG upon differentiation into bloodstream forms, and so are temporally distinct, there is no metacyclic-specific cohort of VSG sequences [8,11]. VSG may also be present on T. vivax metacyclics, since we observed a low abundance VSG protein preferentially expressed in metacyclics (TvY486_0027560). However, assuming that Fam34, 35 and 43 are expressed on the cell surface as predicted, it is clear that the infective form of T. vivax has a qualitatively different surface architecture from the other species, with a considerable non-VSG component. The same could be claimed for bloodstream forms also. Three families show exclusive enrichment in bloodstream forms at the transcript level (Fam28-30), though without proteomic support. This could indicate that our analysis lacked the sensitivity to detect them, perhaps because in bloodstream forms the superabundant VSG dominates the sequencing effort, making the detection of lower abundance proteins less effective than for either metacyclic or epimastigote. The presence of numerous non-VSG surface proteins might account for the observation that the T. vivax VSG coat is less dense than that of T. brucei [43], and VSG comprise a smaller proportion of cell surface-expressed T. vivax transcripts [14]. Assuming that their surface role is correct and they are confirmed as having preferential expression in bloodstream forms, these families are particularly interesting because they have properties as surface antigens that could be targeted for vaccine development. Immune responses to the VSG do not provide lasting and comprehensive protection because of antigenic variation and the considerable structural diversity of the VSG repertoire.
By contrast, Fam28-30 number not more than 30 paralogs, are less structurally diverse, and multiple transcripts are expressed at relatively high abundance, indicating that these families are not subject to antigenic variation, if monoallelic expression is a diagnostic feature of that process.
The VSG genes themselves present an expression profile typical of other African trypanosomes. VSG expression is regulated to produce a succession of structural variants that can evade specific immune responses but also prevent exposure of the total VSG structural repertoire to the host immune system, which would lead to a comprehensive immune response. Thus, VSG genes are expressed in a monoallelic fashion from the highly regulated context of a dedicated VSG expression site [44]. In their analysis of expressed sequence tags (EST) from different T. congolense life stages, Helm et al. (2009) recorded 13 distinct VSG transcripts in metacyclic cells, with the most abundant comprising 24% of the total number, and 26 distinct VSG transcripts in bloodstream forms, with the most abundant contributing 62% of the total [12]. This supports the established experimental model in which most individuals of a T. congolense population express the same active VSG, while a few individuals express a range of low abundance alternatives. In fact, when combined, the 12 least abundant VSG EST were only 0.5% of all VSG transcripts in bloodstream forms [12]. Similarly in T. brucei, Jensen et al. (2009) identified in a microarray-based study cohorts of less abundant transcripts in addition to the known, active VSG, some of which were expressed most abundantly in the insect stages [16]. In contrast, a previous analysis of VSG transcripts in T. vivax using 454 sequencing technology concluded that only one VSG was expressed [14].
Proteomic analyses have presented a similar picture. In T. congolense, 11 different VSGs were identified across all life stages [13]. Four were confidently associated with the metacyclic stage while two others were significantly enriched in bloodstream forms (including the known, active VSG). In proteomic comparisons of procyclic and bloodstream forms of T. brucei, one proteome identified 10 canonical VSGs [22], while another identified only two [21], although different T. brucei strains were used. These did not include the active VSG because neither study used the reference strain (927), and so the active VSG did not map to a VSG gene in the reference genome. Consequently, the 10 VSGs identified by Butter et al. (2013) are all low abundance alternatives, represented by few peptides (< 7) and achieving poor coverage (< 9%) [22]. Three of these low abundance VSGs were preferentially expressed in procyclic forms [22]. Taking these previous data together, their obvious methodological variations notwithstanding, low abundance alternatives to the dominant VSG are observed at both the transcript and protein levels in both T. brucei and T. congolense. The role, if any, of these 'accessory' VSGs is unclear; some are very likely metacyclic VSGs and it is known that expression of these can continue several days after transmission [45] and so could be present in bloodstream forms. Alternatively, 'accessory' VSGs may simply be rare antigens expressed by low frequency subpopulations, or produced by inefficiency in the mechanism silencing inactive VSG expression sites.
In contrast to the previous result [14], we observed several low abundance VSGs in T. vivax, consistent with the expression profiles of VSG observed in T. brucei and T. congolense. The two dominant VSG sequences were superabundant at both transcript and peptide levels. Therefore the identity of the active VSG remained constant in the period between RNA and protein preparation, meaning that this is unlikely to represent a transition between two VSGs and that T. vivax strain IL1392 is probably a mixture of parasites expressing one of two different VSGs. Both of these active VSG belong to Fam24, the subtype homologous to canonical b-type VSG in T. brucei and T. congolense [11]. In a similar fashion to T. congolense, the less abundant VSG in T. vivax may represent metacyclic VSGs. One VSG, TvY486_0027560, may be a metacyclic VSG in this strain as it was preferentially expressed in the metacyclic form (its transcript was not recorded). Finally, two VSG-like sequences belonging to Fam25, a T. vivax-specific subtype [11], are preferentially expressed at low levels in epimastigotes. The roles of Fam25 and 26 genes remain mysterious, and there is no definitive evidence that they encode variant antigens.
Beyond the differences in genetic repertoire that are evident from comparative genomics, it is presumed that differences in the regulation of conserved genes will contribute to phenotypic differences between African trypanosomes. The cohort of conserved genes identified in Figs 2 and 3 that are regulated conversely in T. vivax relative to T. brucei (and probably T. congolense) indicates that this is so. In the bloodstream stage, African trypanosomes exclusively employ glycolysis to exploit abundant glucose in host plasma to generate ATP via substrate-level phosphorylation in the glycosome [46]. In the tsetse fly, where glucose is limited but amino acids such as proline are present in the host hemolymph, the parasites generate ATP through gluconeogenesis and oxidative phosphorylation in the mitochondrion [46]. In this regard, T. vivax is consistent; all glycolytic enzymes are preferentially expressed in the bloodstream form, where they are among the most abundant transcripts and peptides, and all components of the electron transfer chain are preferentially expressed in the epimastigote (S3 and S4 Tables). The species differences highlighted in Fig 3 concern the metabolic steps linking glycolysis and events in the mitochondrion, i.e. pyruvate metabolism (Fig 5).
Experimental evidence indicates that T. brucei produces ATP during its insect stage by further substrate-level phosphorylation in the glycosome, by catabolizing phosphoenolpyruvate (PEP), and in the mitochondrion by catabolizing pyruvate. This results in insect forms excreting succinate and acetate, while bloodstream forms excrete pyruvate [47]. Accordingly, the enzymes for converting PEP into succinate and pyruvate into acetate are preferentially expressed in the procyclic form of T. brucei [21,22]. Fig 5 describes the points in this pathway where differential expression is reversed in T. vivax. We see that enzymes for the catabolism of PEP, such as glycosomal malate dehydrogenase and glycosomal phosphoenolpyruvate carboxykinase, and for the conversion of pyruvate to acetate, i.e. multiple components of the pyruvate dehydrogenase complex and of the succinyl-CoA synthetase complex, are significantly more abundant in the bloodstream form than in the epimastigote. Additionally, the fumarase responsible for reaction 13 in Fig 5, while upregulated in procyclic form T. brucei, is constitutively expressed in T. vivax (TvY486_1105200; FC = 0.04). However, the final enzyme in the pathway (NADH-dependent fumarate reductase; reaction 14) is preferentially expressed in the insect stages in both species. We speculate that some other genes in Table 1 support this function; for example, TvY486_0901260, which possesses a mitochondrial pyruvate carrier protein domain homologous to mt1 in humans, which is required to import pyruvate across the inner mitochondrial membrane [48], and TvY486_0702860, which encodes a bacterial-type nitro-FMN oxidoreductase that might serve to regenerate NAD+ [49]. Thus, we would predict that T. vivax excretes fumarate, acetate and perhaps succinate in its bloodstream stage rather than in the insect. It is not clear why T. vivax would benefit from pyruvate metabolism in the bloodstream when substrate-level phosphorylation using glucose should suffice. However, in the insect stage, when the parasite remains in the proboscis and without access to the hemolymph, it could be that such metabolism serves little purpose. Therefore, this may reflect a lack of upregulation in the epimastigote rather than adaptive upregulation in the bloodstream form, illustrating how life cycle variation has affected the regulation of energy metabolism in these organisms.

Fig 5. Energy metabolism in African trypanosomes, noting the position of enzymes with T. vivax-specific developmental regulation. Glycolysis takes place within a specialized organelle, the glycosome, after which further substrate-level phosphorylation takes place through the conversion of phosphoenolpyruvate ultimately to succinate in the glycosome, and through the conversion of pyruvate into acetate in the mitochondrion. Points marked with red dots and labels shaded red refer to proteins that are preferentially expressed during the vertebrate stage of T. vivax but in the insect stages of T. brucei (after Besteiro et al. 2005). Note that we could not differentiate between cytosolic and glycosomal phosphoglycerate kinase isoforms using our proteomic data. Abbreviations: 1,3BPGA, 1,3-bisphosphoglycerate; CoASH, coenzyme A; DHAP, dihydroxyacetone phosphate; F-6-P, fructose 6-phosphate; FBP, fructose 1,6-bisphosphate; G-3-P, glyceraldehyde 3-phosphate; G-6-P, glucose 6-phosphate; GLU, glutamate; Gly-3-P, glycerol 3-phosphate; Oxac, oxaloacetate; PEP, phosphoenolpyruvate; 3-PGA, 3-phosphoglycerate; SucCoA, succinyl-CoA. Enzymes are: 1) hexokinase; 2) glucose-6-phosphate isomerase; 3) phosphofructokinase; 4) aldolase; 5) triose-phosphate isomerase; 6) glycerol-3-phosphate dehydrogenase; 7) glycerol kinase; 8) glyceraldehyde-3-phosphate dehydrogenase; 9) phosphoglycerate mutase; 10) enolase; 11) pyruvate kinase; 12) pyruvate phosphate dikinase; 13) glycosomal fumarase; 14) NADH-dependent fumarate reductase; 15) acetate:succinate CoA-transferase; 16) possibly acetyl-CoA synthetase.
The first global perspective on gene expression in T. vivax has confirmed that a broadly similar process of developmental regulation occurs in all African trypanosome species. However, subtle differences, for instance in energy metabolism and putative cell surface molecules, offer new insights into the molecular basis for the life cycle differences that exist between species. Beyond the background of conservation, this study has confirmed the presence of numerous T. vivax-specific gene families and shown that these are developmentally regulated, indicating that the surface of T. vivax differs quite substantially from the model derived from other African trypanosomes.