Skip to main content
  • Loading metrics

Four layer multi-omics reveals molecular responses to aneuploidy in Leishmania

  • Bart Cuypers,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Molecular Parasitology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine, Antwerp, Belgium, Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium

  • Pieter Meysman,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliation Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium

  • Ionas Erb,

    Roles Investigation, Methodology, Writing – review & editing

    Affiliation Centre for Genomic Regulation (CRG), The Barcelona Centre of Science and Technology, Barcelona, Spain

  • Wout Bittremieux,

    Roles Investigation, Methodology, Writing – review & editing

    Affiliation Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, United States of America

  • Dirk Valkenborg,

    Roles Investigation, Methodology, Resources, Writing – review & editing

    Affiliations Center for Statistics - University of Hasselt, Hasselt, Belgium, Center for Proteomics, University of Antwerp, Antwerp, Belgium

  • Geert Baggerman,

    Roles Investigation, Methodology, Resources, Writing – review & editing

    Affiliations Center for Proteomics, University of Antwerp, Antwerp, Belgium, VITO, Mol, Belgium

  • Inge Mertens,

    Roles Investigation, Methodology, Resources, Writing – review & editing

    Affiliations Center for Proteomics, University of Antwerp, Antwerp, Belgium, VITO, Mol, Belgium

  • Shyam Sundar,

    Roles Resources, Writing – review & editing

    Affiliation Institute of Medical Sciences, Banaras Hindu University, Varanasi, India

  • Basudha Khanal,

    Roles Resources, Writing – review & editing

    Affiliation BP Koirala Institute of Health Sciences, Dharan, Nepal

  • Cedric Notredame,

    Roles Conceptualization, Investigation, Methodology, Writing – review & editing

    Affiliation Centre for Genomic Regulation (CRG), The Barcelona Centre of Science and Technology, Barcelona, Spain

  • Jean-Claude Dujardin ,

    Roles Conceptualization, Funding acquisition, Investigation, Methodology, Supervision, Writing – original draft, Writing – review & editing (J-CD); (MD); (KL)

    Affiliations Molecular Parasitology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine, Antwerp, Belgium, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium

  • Malgorzata A. Domagalska ,

    Contributed equally to this work with: Malgorzata A. Domagalska, Kris Laukens

    Roles Conceptualization, Investigation, Methodology, Supervision, Writing – original draft, Writing – review & editing (J-CD); (MD); (KL)

    Affiliation Molecular Parasitology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine, Antwerp, Belgium

  • Kris Laukens

    Contributed equally to this work with: Malgorzata A. Domagalska, Kris Laukens

    Roles Conceptualization, Funding acquisition, Investigation, Methodology, Supervision, Writing – original draft, Writing – review & editing (J-CD); (MD); (KL)

    Affiliation Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium


Aneuploidy causes system-wide disruptions in the stochiometric balances of transcripts, proteins, and metabolites, often resulting in detrimental effects for the organism. The protozoan parasite Leishmania has an unusually high tolerance for aneuploidy, but the molecular and functional consequences for the pathogen remain poorly understood. Here, we addressed this question in vitro and present the first integrated analysis of the genome, transcriptome, proteome, and metabolome of highly aneuploid Leishmania donovani strains. Our analyses unambiguously establish that aneuploidy in Leishmania proportionally impacts the average transcript- and protein abundance levels of affected chromosomes, ultimately correlating with the degree of metabolic differences between closely related aneuploid strains. This proportionality was present in both proliferative and non-proliferative in vitro promastigotes. However, as in other Eukaryotes, we observed attenuation of dosage effects for protein complex subunits and in addition, non-cytoplasmic proteins. Differentially expressed transcripts and proteins between aneuploid Leishmania strains also originated from non-aneuploid chromosomes. At protein level, these were enriched for proteins involved in protein metabolism, such as chaperones and chaperonins, peptidases, and heat-shock proteins. In conclusion, our results further support the view that aneuploidy in Leishmania can be adaptive. Additionally, we believe that the high karyotype diversity in vitro and absence of classical transcriptional regulation make Leishmania an attractive model to study processes of protein homeostasis in the context of aneuploidy and beyond.

Author summary

Leishmania are protozoan parasites causing severe and stigmatizing diseases in humans and animals, worldwide. The parasites show several unique molecular features, one of them being an unusually high tolerance to aneuploidy, i.e. the presence of an abnormal number of chromosomes in the cell. While this is generally deleterious for higher Eukaryotes, this seems to be advantageous for Leishmania in certain conditions. However, the molecular and functional consequences of aneuploidy for the pathogen remained poorly understood. Here we studied the four main molecular layers of Eukaryotic cells: the genome, transcriptome, proteome, and metabolome. We present their first integrated analysis in aneuploid Leishmania. Our analyses show a strong impact of chromosome copy number on transcript- and protein abundance levels, ultimately correlating with the degree of observed metabolic differences. However, some specific proteins (subunits of protein-complexes and non-cytoplasmic proteins) encoded by aneuploid chromosomes showed reduced dosage effects. Reciprocally, we also found that some differential transcripts and proteins (chaperones, chaperonins and heat shock proteins) of aneuploid strains originated from non-aneuploid chromosomes. Our study provides new insights into mechanisms of molecular adaptation and regulation in Leishmania and the role of aneuploidy in this process.


Aneuploidy is the presence of an abnormal number of chromosomes in a cell. All genes encoded by aneuploid chromosomes undergo a shift in their copy number or “gene dosage”. This large number of unbalanced dosage changes typically leads to broad-scale disruptions in the stoichiometric balances of transcripts, proteins, and metabolites, making aneuploidy most widely known for its detrimental effects. Indeed, aneuploidy can cause oxidative stress, misfolded protein stress (proteotoxicity), and metabolic stress at the cellular level [1]. In multicellular organisms, where cells are closely fine-tuned to work together, this often leads to growth defects or lethality [2,3].

Contrastingly, aneuploidy can also result in fitness gains. In fungi, specific karyotypes confer resistance against drugs, toxicants, or increase survival under nutrient limiting conditions [47]. Likewise, in cancer cells, aneuploidy can be tumor growth-promoting [8]. By altering the copy number of many genes at once, aneuploidy enables cells to explore a wide fitness landscape [9]. Particularly when facing very novel or competitive conditions, aneuploidy might allow the rapid selection of a beneficial phenotype from a pool of many divergent phenotypes [10]. Fully understanding this complex trade-off between aneuploidy fitness gains and losses requires a systems biological comprehension of its molecular impact.

In this context, a new and unique Eukaryote model is emerging, Leishmania. This protozoan parasite (Euglenozoa, Kinetoplastida) that is transmitted by sand flies and infects humans and various vertebrate animals shows a remarkably high tolerance for aneuploidy. A genomic analysis of 204 in vitro cultured Leishmania donovani strains uncovered up to 22 aneuploid chromosomes (out of 36) within a single strain [11], while in vivo, aneuploidy is less pronounced [12]. All 36 chromosomes have the capacity to become aneuploid and lead to viable parasites [13]. Strikingly, clonal populations of the parasite do not have uniform karyotypes but display a mosaic of them [14]. Single-cell sequencing of two clonal promastigote populations of L. donovani identified respectively 208 different karyotypes in 1516 sequenced cells (that were generated in approximately 126 generations) and 117 karyotypes in 2378 cells (56 generations). This high degree of genomic instability and mosaicism could explain the parasite’s ability to rapidly select more advantageous karyotypes upon encountering novel in vivo or in vitro environments [12].

Leishmania also features an unusual gene expression system that is markedly different from most other Eukaryotes. Transcription of protein-coding genes cannot be controlled individually as these lack separate RNA polymerase II promoters. Instead, they are transcribed constitutively in long polycistronic units that span 10’s to 100’s of functionally unrelated genes [15]. The abundance of individual mRNAs and proteins is still regulated, but essentially post-transcriptionally [12,16]. Thus, the most straightforward evolutionary path towards altered gene expression might be altered gene dosage, which could explain the high incidence of aneuploidy in this organism [12]. While aneuploidy in Leishmania is known to directly affect the transcription of encoded transcripts, its molecular impact further downstream remains poorly understood [12,17].

Here we present the first study that generated and integrated genome, transcriptome, proteome, and metabolome data of highly aneuploid L. donovani strains, in vitro. This multi-omics analysis unambiguously demonstrates that aneuploidy in Leishmania globally and drastically impacts the transcriptome and proteome of affected chromosomes, ultimately correlating with the degree of metabolic differences between strains. This impact is not restricted to a single life stage as we observed it in vitro in both the parasite’s proliferative and non-proliferative/infectious promastigote life stages. Interestingly, not all proteins responded equally to chromosome dosage changes; protein complex subunits and non-cytoplasmic proteins responded less or even not at all to aneuploidy. In parallel, the aneuploid Leishmania strains here studied also feature many differentially expressed proteins from non-aneuploid chromosomes. These were enriched for aneuploidy-related protein classes, including metabolism-related chaperones and chaperonins, peptidases, and heat-shock proteins. Through this integrated four ‘omic approach, we provide a systems-wide view of the molecular changes associated with aneuploidy in Leishmania.



The experiments in this study were performed on six cloned strains of Leishmania donovani: BPK275/0 cl18 (BPK275), BPK173/0 cl3 (BPK173), BPK282 cl4 (BPK282), BPK288/0 cl7 (BPK288), BPK178/0 cl3 (BPK178), BHU575/0 cl2 (BHU575) (S1A Table). A strain is a parasite population derived by serial passage in vitro from a primary isolate (from patient) with some degree of characterization, in our case bulk genome sequencing (definition modified from [18]). These strains were cloned to ensure genomic homogeneity. Parasites were cultured in HOMEM (Gibco) supplemented with 20% heat-inactivated fetal bovine serum (Biochrom), three days for replicative forms (LOG growth phase, 4–6 replicates) and six days for non-replicative forms (STAT, 4 replicates). Parasites from all replicates were washed twice by centrifugation at 18,000 x g at 0 °C, resuspension in ice-cold PBS buffer (PBS; pH 7.4 –Invitrogen), and pelleted at 18,000 x g at 0 °C. DNA, RNA, protein, and metabolite extractions were carried out in parallel (see further).

DNA/RNA extraction, library preparation, and sequencing

DNA and RNA were extracted from the parasite pellets with the AllPrep DNA/RNA Mini Kit (Qiagen). The RNA integrity was then verified with the RNA 6000 Nano Kit using a Bioanalyzer 2100 (Agilent). DNA was quantified with the Qubit dsDNA BR assay (Thermo Fisher Scientific) and RNA with the Qubit RNA HS Assay.

Library preparation and sequencing of the DNA samples were performed at the Beijing Genomics Institute (BGI). Libraries were prepared with the TruSeq Nano DNA HT sample prep kit (Illumina) and 2 x 151 bp sequenced on the Illumina HiSeq 4000 platform. RNA sequencing libraries were prepared using the Spliced leader sequencing (SL-Seq) method as described in Cuypers et al. (2017) [16]. This protocol makes use of the presence of the affixed 39 nucleotide sequence spliced-leader (SL) that is present at the 5’ end of all functional Leishmania mRNAs. RNAs containing a SL are selectively amplified with the protocol, and adapters required for Illumina sequencing are ligated. The SL-Seq libraries were 1 X 50 bp sequenced with the HiSeq 1500 platform of the Center of Medical Genetics Antwerp (Belgium).

Genome and transcriptome data analysis

The quality of the raw sequencing data was verified with FASTQC v0.11.4 [19]. Reads were quality trimmed using Trimmomatic v035 [20] with the settings ‘LEADING:20 TRAILING:20 ILLUMINACLIP:adapter.fa:2:40:15 SLIDINGWINDOW:4:20 MINLEN:36’. Reads were subsequently aligned to the L. donovani LdBPKv2 [21] reference genome using BWA Mem version 0.7.15 [22] with default parameters. The resulting bam files were sorted and indexed with samtools [23]. Mapping reports were generated using samtools flagstat.

Somy estimations and gene copy numbers were calculated as described in Downing (2011) [24]. Briefly, the somy value of a chromosome (S) was calculated by dividing its median sequencing depth (i.e. the number of times a nucleotide is read during sequencing, dch) by the median dch of all 36 chromosomes (dmch) and multiplying this value by two (for a predominantly diploid organism). Gene copy number per haploid genome (d) was defined as a raw depth for that gene (dr), divided by the median depth of its chromosome (dch): so that d = dr/dch. Somy values in our bulk-sequencing analyses are often not discrete due to mosaic aneuploidy (i.e. presence of different kayotypes among individual cells present in 1 strain) in Leishmania. Gene copy number estimations per cell (full depth, gene dosage) were then calculated by multiplying d with S. SNPs were previously characterized in [11]. RNA-Seq gene read count tables were generated with the HTSeq tool version 0.6.1 [25]. Count data were normalized for library size by dividing each transcript count by the geometric mean of all counts for that library [26].


Sample preparation.

Cell pellets were efficiently lysed using 200 μl RIPA buffer and 1x HALT protease inhibitor (Thermo Scientific), combined with a 30 sec sonication (Branson Sonifier SLPe ultrasonic homogenizer, Labequip, Ontario, Canada) with an amplitude of 50% on ice. After centrifugation of the samples for 15 min at 10,000 g at 4 °C, the cell pellet was discarded. Next, the protein concentration was determined using the Pierce BCA protein Assay kit in combination with a NanoDrop 2000 photospectrometer (Thermo Scientific).

For each sample, 25 μg of proteins were reduced using 2 μl of 200 mM tris(2-carboxyethyl) phosphine, in a volume of 20 μl 200 mM triethylammonium bicarbonate (TEAB), and incubated for 1 h at 55°C. After alkylation of the proteins with 2μL of 375 mM iodoacetamide for 30 min protected from light, 6 volumes of ice-cold acetone were added, and the samples were incubated overnight at -20°C. The next day, samples were centrifuged for 10 min at 10.000 g at 4°C, the acetone was removed, and pellets were resolved in 20 μl of 200 mM TEAB. Proteins were then digested with trypsin (Promega) overnight at 37°C with an enzyme trypsin ratio of 1:50. Before LC-MS/MS analysis, the samples were desalted with Pierce C18 spin columns according to the manufacturer’s instructions (Thermo Scientific).

Nano reverse-phase liquid chromatography and mass spectrometry.

Each of the digested samples was separated by nano reverse phase C18 (RP-C18) chromatography on an Easy-nLC 1000 system (Thermo Scientific, San Jose, CA) using an Acclaim C18 PepMap 100 column (75 μm x 2 cm, 3 μm particle size) connected to an Acclaim PepMap RSLC C18 analytical column (50 μm x 15 cm, 2 μm particle size) (Thermo Scientific, San Jose, CA). Of each sample, a total of 1μg of peptides were loaded on the column. Before loading, digests were dissolved in mobile phase A, containing 2% acetonitrile and 0.1% formic acid, at a concentration of 1μg/10μL and spiked with 20 fmol Glu-1-fibrinopeptide B (Glu-fib, Protea biosciences, Morgantown, WV). A linear gradient of mobile phase B (0.1% formic acid in 100% acetonitrile) from 0 to 45% in 90 min, followed by a steep increase to 100% mobile phase B in 10 min, was used at a flow rate of 300 nL/min. Liquid Chromatography was followed by MS (LC-MS/MS) and was performed on a Q-Exactive Plus mass spectrometer equipped with a nanospray ion source (Thermo Fisher, Waltham, MA, USA). The high-resolution mass spectrometer was set up in an MS/MS mode where a full scan spectrum (350–1850 m/z, resolution 70,000) was followed by a high energy collision activated dissociation (HCD) tandem mass spectra (100–2000 m/z, resolution 17,500). Peptide ions were selected for further interrogation by tandem MS as the twenty most intense peaks of a full scan mass spectrum. The normalized collision energy used was set at 28%. A dynamic exclusion list of 20 sec for the data-dependent acquisition was applied.

Proteomic data analysis.

Thermo raw files were converted to mzML files using MSConvert v3.0. Label-free protein quantification (LFQ) was carried out with MaxQuant version [27] using the following settings (other settings were kept default): Oxidation (M) and acetyl (Protein N-term) were indicated as variable modifications, carbamidomethyl (C) was indicated as a fixed modification, digestion with trypsin with maximum two missed cleavages, match between runs = yes, dependent peptides = yes. The search database was the LdBPKV2 proteome previously published by our group [12], and reversed proteins were used as decoys.


Sample preparation.

Sample pellets were extracted using a 1:3:1 (v:v:v) chloroform/methanol/water extraction, subjected to liquid chromatography using a 2.1 mm ZIC-HILIC column (Sequant), and analyzed using the Orbitrap Exactive (Thermo Fisher Scientific) platform at the Glasgow Polyomics Center (University of Glasgow, Glasgow, Scotland) exactly as described elsewhere [28,29]. Both negative and positive ion modes were run in parallel with rapid polarity switching. Samples were analyzed in randomized order and the same analytical batch. Additionally, to aid accurate metabolite identification, verify LC-MS stability, and identify contaminants, the following control samples were included as well: (1) solvents blanks, (2) serial dilutions of a pooled sample (Undiluted, 1/2, 1/4, 1/8, and 1/16), (3) Authentic standard mixes containing in total 217 metabolites (50–400 Da) and representing a wide array of metabolic classes and pathways, (4) An amino acid standard mix (Sigma Product No. A9906).

Metabolomic data analysis.

Data preprocessing was carried out with the XCMS 1.42.0 [30] and mzMatch 1.0.1 [31] packages in R, exactly as described elsewhere [32]. Briefly, raw spectra (mzXML format) were first subjected to retention time alignment with ObiWarp [33], after which peak detection was carried out with the centWave algorithm [34]. Peaks were then filtered: 1) Corresponding peaks from the four biological replicates were allowed a maximum reproducibility standard deviation (RSD) of 0.5, 2) Peak shape (CoDA-DW > 0.8), 3) Detection in at least 3 out of 4 replicates of a single strain and 4) Minimal peak intensity was set to 3000. The peaks were putatively identified using sequentially the LeishCyc database [35], LIPID MAPS [36], KEGG [37], the peptide database included in mzMatch, and the Human Metabolome Database [38]. Finally, the filtered peaks were normalized for total ion count and subjected to manual quality screening.

Integrated multi -omic analysis

Pairwise comparisons between aneuploid strains: For each comparison between two strains, we calculated the Log2 fold change (Log2FC) for genes (gene dosage), transcripts, proteins and metabolites. These Log2FC values were obtained by taking the Log2 ratio of the average abundance between two strains. Associated p-values were obtained by performing a Student t-test for each comparison and subsequently correcting it with the Benjamini-Hochberg algorithm to limit the False Discovery Rate (FDR) to 5%. A Log2FC cutoff of 1 and adjusted p-value cutoff of 0.05 was used for all ‘omic layers, except for gene dosage, for which only copy number differences of at least 0.5/haploid genome were considered as biologically significant [39].

Average transcript and protein abundance levels per chromosome per strain: First, the median was taken of the Log2 normalized transcript counts (see respective section) and Log2 LFQ protein values, per chromosome per strain. This yielded an expression value for each chromosome in each strain, for both transcript and protein expression. Then, for each chromosome, the median of the expression values of all strains that were disomic (determined based on DNA sequencing data) for that chromosome was selected as the ‘disomic reference value’ for that chromosome (i.e., the expression value that corresponds to the median expression of a transcript/protein on a disomic chromosome). The remaining per-strain per-chromosome expression were then divided by this disomic reference value, multiplied by 2. The code for these normalizations and analyses is available at (

All plots were generated using the ggplot v 2_2.1.0 package [40], except the Circos plots which were created with Circos v0.69 [41]. Gene ontology enrichments were calculated with the Fisher Exact test using Python 3.8.5 and the matplotlib v3.3.4 [42] and SciPy v1.6.1 [43] libraries. GO annotations were obtained from the gene ontology consortium (

Dosage attenuation analysis

Dosage attenuation analysis: This analysis was carried out on the only three chromosomes (Ld05, Ld08, and Ld33) that were at least once disomic (somy of 1.7–2.3), trisomic (somy of 2.7–3.3), and tetrasomic (somy of 3.5 or more) in our six study strains and had more than 10 detected proteins. The abundance values of transcripts (corrected for library size, see above) and proteins (LFQ) were divided by their abundance in the strain that was the most disomic for their encoding chromosome, according to the genome sequencing data. These expression values were then multiplied by 2 so that the normalized abundance at disomy = 2. Strains were always normalized by another strain than themselves to avoid underestimation of the standard deviation. Therefore, we do not have disomic distributions for Ld08 since Ld08 was exclusively disomic in BPK282, and thus already was used as reference strain.

Functional dosage attenuation analysis: For this analysis, we defined a protein’s attenuation level as its Log2 transcript fold change (i.e., its transcript level on the aneuploid chromosome divided by its transcript level on disomic chromosome), minus its Log2 protein fold change. We included only 85 transcripts and proteins for this analysis, which had very accurate protein measurements (SD between biological replicates < 0.25 Log2FC units). The following protein features were checked for their relation with protein dosage attenuation: degree (i.e., number of protein interactions as transferred from STRING), length (as a proxy for transcript and protein molecular size), GRAVY (grand average of hydropathy), solubility (soluble versus membrane protein) and subcellular location. We added transcript abundance as a covariate to the model, as the compensatory effect can be expected to become larger at higher differential transcript levels. The code these normalizations and analyses is available at (


Aneuploidy globally impacts transcriptome and proteome of affected chromosomes

We examined the impact of aneuploidy in L. donovani on transcriptome and proteome by integrated multi-omic analysis of 6 L. donovani strains from India and Nepal (S1A Table). These closely related strains are almost identical at the sequence level but differ greatly in their aneuploidy [44]. Specifically, the entire genome between any two strains of this set differs at most by 64 non-synonymous SNPs and one indel in coding regions (S1B Table). Thus, they are attractive model strains for studying the impact of aneuploidy on transcriptome and proteome with minimal interference of DNA sequence variation.

Genome, transcriptome, and proteome were obtained from the in vitro replicative promastigote stage (LOG) of the parasite. All genomes reached an average sequencing coverage of at least 26.5 fold, and 97.9% of the sequenced sites reached 10 X coverage or more. Transcript samples had on average 4.71 million reads and obtained an average mapping efficiency of 97%. Detailed mapping and coverage statistics for both genome and transcriptome data are available in S1C and S1D Table. Identified indels and estimations of chromosomal somy (i.e., average chromosome copy number in the population of cells) and dosage of each gene are available in S2 Table. Identified transcripts and their counts are available in S3 Table. Proteome analysis identified in total 3145 proteins across all samples (S4 Table).

Our 6 L. donovani strains showed high but varying degrees of aneuploidy (Fig 1). The most aneuploid strain was BPK288 with 12 aneuploid chromosomes, followed by BPK275 (10), BPK173 (10), BHU575 (10), BPK178 (9), and BPK282 (6). In L. donovani, chromosomes are numbered by increasing size. The baseline somy of chromosomes in L. donovani is 2N, except for chromosome 31, which is always 4N or more. Smaller chromosomes were more frequently aneuploid (Ld01, Ld05-Ld09, Ld11-Ld16), than larger chromosomes (Ld20, Ld23, Ld26, Ld31, Ld33, Ld35). Chromosome Ld31 was confirmed tetrasomic in all strains, and chromosome Ld23 always trisomic. Other aneuploid chromosomes had a variable somy. In bulk genomic analysis, somy values represent the average of the somy of individual cells present in a given population. Because of mosaicism, i.e. the presence of different karyotypes among individual cells of the parasite population in culture, many somy values were not discrete. The highest somy values (apart from Ld31) were observed for chromosomes Ld02 (somy = 3.9), Ld08 (3.9), and Ld33 (3.5) in BPK275, and for chromosome Ld05 (3.6) in BPK288.

Fig 1. Heatmap showing aneuploidy patterns in the six L. donovani strains.

The transcript abundance of aneuploid chromosomes mirrored almost perfectly their underlying somy (Fig 2, S1A and S1B Fig). Specifically, the average abundance of a transcript encoded by a chromosome was proportional with a factor of 0.94 with the chromosomal somy (Fig 3: 95% CI: 0.92–0.96 and p < 2·10−16, R2: 0.98). The average abundance of the proteins encoded by a chromosome was also highly proportional to the aneuploidy of the chromosome (p < 2 · 10−16), albeit with a decreased slope of 0.76 (95% CI: 0.70–0.77, R2: 0.92). Thus, aneuploidy results in a linear, but not equivalent, increase in protein abundance. Circos plots clearly showed that transcript- and protein abundance of aneuploid chromosomes was affected globally and not restricted to specific chromosomal regions (Fig 2, S1A and S1B Fig).

Fig 2. Comparative Circos plot between BPK275 and BPK178, showing their relative (expressed in fold change) gene dosage (blue), transcript abundance (orange), and protein abundance (purple) across the 36 L. donovani chromosomes.

20 point moving averages are indicated with a black line for the three largest aneuploid chromosomes. We show this strain pair because of the partial chromosomal amplification on chromosome Ld35. However, Circos plots between BPK173 and BPK282, and BHU575 and BPK288 show similar agreement between genome, transcriptome, and proteome and can be found in S1A and S1B Fig).

Fig 3. Impact of chromosome aneuploidy on average chromosome transcript (red) and protein (blue) abundance levels.

Each dot represents 1 chromosome in one of the six studied aneuploid L. donovani strains in the replicative (LOG) phase. Abundance levels were normalized so that two transcript/protein abundance units match the abundance level of a disomic chromosome. Somy values are often not discrete due to mosaic aneuploidy (i.e. presence of different kayotypes among individual cells present in 1 strain) in Leishmania.

Interestingly, chromosome 35 exhibited only a partial amplification in strain BPK275, the first part being disomic (somy = 2.2) and the second part, starting from gene LdBPK_350040900, being trisomic (somy = 3.1) (Fig 2). This phenomenon of partial chromosomal amplification has previously been observed for chromosome 23 as well [45,46]. The difference in both average transcript and protein abundance between the first and the second chromosome part was significant (t-test p transcript = 4.0x10-82, protein = 7.0x10-7), the second part having on average 1.43x more transcript abundance and 1.22x more protein abundance. This demonstrates that also partial chromosomal amplifications affect transcript and protein abundance proportionally in the amplified region.

Aneuploidy does not impact all proteins equally

Our observation that a somy increase does not lead to an equivalent increase in protein abundance, but 0.76x the somy change, suggested either of two possible explanations. A first possibility is that all proteins encoded by an aneuploid chromosome are modulated equally and exhibit only a partial protein abundance increase. We further refer to this explanation as the ‘dampening hypothesis’. Such a general ‘dampening’ effect would suggest an underlying process related to intrinsic mechanisms or limitations in the parasite’s translation system. Alternatively, different proteins could respond differently to increased gene dosage and transcript abundance, pointing to attenuation effects linked to the identity, function, and physicochemical properties of those specific proteins or certain functional protein groups. We refer to this explanation as the ‘attenuated dosage response hypothesis’.

To evaluate both hypotheses, the differential abundance patterns of proteins were compared to their encoding chromosome differences from disomy to trisomy and tetrasomy. Following the ‘dampening hypothesis’, the average of the protein abundance distribution should increase with increased somy, but the distribution shape should remain unaltered. In our study, three chromosomes (Ld05, Ld08, and Ld33) were at least once disomic, trisomic, and tetrasomic in one of our six study strains (Fig 1), allowing for such comparative analysis. Ld02 also matched these criteria but was excluded from the analysis as only nine Ld02 proteins were detected. For reference, the same analysis was conducted for transcripts.

The average abundance of transcripts increased with somy, and the shape of the abundance distribution was found to be similar (Fig 4A). Specifically, the transcript abundance standard deviation did not significantly change with somy (Regression p: 0.10, Fig 4B), and the distributions were symmetric. This suggests that a somy change affects most transcripts equally, and attenuated dosage responses at the transcript level are minimal or absent. Chromosome Ld08 had a higher variance in transcript abundance than the two other chromosomes. Our data is not conclusive as to whether this is due to the shift in somy or due to an intrinsically higher transcript abundance variance for this chromosome.

Fig 4.

A: Normalized transcript (left) and protein (right) abundance distributions of chromosomes Ld05, Ld08, and Ld33 in their disomic (excl. chr 8), trisomic and tetrasomic states, each time observed in one of our 6 study strains. Vertical dotted lines indicate the corresponding somy, derived from the genomic data. The small vertical lines at the base of the plot indicate individual datapoints for transcripts and proteins, from which the distributions were generated. The abundance values of transcripts and proteins were normalized so that the normalized abundance at disomy = 2. This normalization was carried out with a strain that was disomic for the chromosome under investigation. We do not have disomic distributions for Ld08 since Ld08 was exclusively disomic in BPK282, and thus already was used as reference strain. B. Regression between the somy and transcript (left) or protein (right) standard deviation for chromosomes Ld05, Ld08, and Ld33 in different somy states.

The average abundance of proteins also increased with somy, but the distribution shape was not consistent. This was particularly evident when comparing the disomic versus the tetrasomic chromosomes. We observed a clear left-tailing of the protein abundance distribution on chromosomes Ld08 and Ld33 chromosomes towards the lower, more conservative fold change values. Thus, while a subset of proteins follows the aneuploidy-induced gene dosage changes, another subset is maintained at conservatively lower levels. This reduced dosage response was also reflected in a general and significant increase in protein abundance variance for all three chromosomes (Regression p: 0.004).

Proteins with a reduced dosage response are primarily protein complex subunits or non-cytoplasmic

We investigated if specific protein properties were associated with a reduced dosage response in our dataset. For this analysis, we defined a protein’s reduced dosage effect or ‘attenuation level’ as its Log2 transcript fold change (i.e., its transcript level on the aneuploid chromosome divided by its transcript level on disomic chromosome), minus its Log2 protein fold change. A particular challenge in this analysis was that protein fold changes, and hence, the observed attenuation levels, were generally very small and close to measurement noise levels. For example, for a fully attenuated protein (to disomic levels) from a trisomic chromosome, the attenuation level would only be 0.5 Log2FC units (Log2FC transcript ≈ 0.5, Log2FC protein = 0). Therefore, we focused on a set of 85 transcripts and proteins, which had very accurate protein measurements (SD between biological replicates < 0.25 Log2FC units). These covered a broad range of attenuation levels ranging from 1.18 Log2FC units, to values close to zero, or even some proteins that had a slight overamplification compared to the transcript level (-0.40 Log2FC units) (S1E Table).

We noted that within the top 5 attenuated proteins in this high-accuracy set, 2 coded for subunits of protein complexes: i.e., the cleavage and Polyadenylation Specificity Factor (CPSF) subunit A and the dynein light chain had respectively 1.12 and 0.96 Log2FC units lower protein than transcript abundance. Accordingly, we hypothesized that protein complexes might be more likely to be attenuated (> = 0.30 Log2FC units). To test this systematically, all 85 proteins were screened for their involvement in heteromeric protein complexes (using annotation keywords ‘subunit’ and ‘chain’, S1E Table). Strikingly, complex subunits accounted for 15.6% (5 out of 32) of the attenuated proteins which was significantly more than the 1.8% (1 out of 53) we observed for the unattenuated proteins (Fischer exact p = 0.0265).

Next, we looked if there was a relation between a proteins’ attenuation level and a set of biological and physicochemical protein properties to identify more subtle patterns. Specifically, we checked 5 protein properties for their relationship with protein attenuation using AN(C)OVA (S1E Table). Transcript abundance was added as a covariate to the model, as attenuation was larger at higher differential transcript levels (Adj p = 2.4x10-4). The variables ‘membrane protein’ (yes or no) (FDR adjusted p = 0.696), the number of protein interactions (Adj. p = 1), protein length (a proxy for transcript and protein size) (Adj. p = 1) and hydrophobicity (Adj. p = 1) had no significant relationship with a protein’s attenuation level. In contrast, the subcellular location (Adj. p = 0.032) of a protein was significantly linked to its degree of dosage attenuation. To investigate this further, we investigated the relation between each subcellular compartment and the attenuation level of its proteins (S1E Table). Cytoplasmatic proteins (n = 39) were significantly less attenuated compared to all other proteins (Adj. p = 0.011). Thus, cytoplasmatic proteins follow aneuploidy-induced dosage changes more closely than proteins from other cellular compartments. In contrast, proteins associated with the Golgi apparatus (n = 2) had a significantly higher attenuation levels than all other proteins (Adj. p = 0.0065). However, this result needs to be interpreted with caution as the sample size was only 2 proteins. These two proteins were a vesicle-associated membrane protein and a small GTP-binding protein Rab18, both known to be secreted. Proteins from mitochondria (Adj. p = 0.32, n = 18), nucleus (Adj. p = 0.50, n = 13), peroxisome (Adj. p = 0.81, n = 6) and ER (Adj. p = 0.74, n = 2) did not have significantly attenuation levels compared to proteins from all other compartments.

The impact of aneuploid chromosomes on the gene products of their euploid counterparts

In the previous sections, we reported how the aneuploidy of a chromosome affects the abundance of its encoded transcripts and proteins. However, through perturbation of regulatory networks, these changes might also affect transcripts and proteins encoded by genomic regions unaffected by aneuploidy. These were previously defined as ‘trans-effects’ and have been observed on the transcriptional level in diverse species, including Arabidopsis and Drosophila, as well as in humans with sex chromosome aneuploidies, Turner or Klinefelter syndrome [4750]. Here we investigated the extent of trans-effects in Leishmania by comparing and characterizing the fractions of differential transcripts and proteins coming from aneuploid chromosomes, i.e. primary or cis-transcripts and cis-proteins, versus those coming from euploid chromosomes, i.e. trans-transcripts and trans-proteins.

We carried out differential transcript and protein abundance analyses (FDR adjusted p < 0.05) between all possible pairs of the six L. donovani study strains (Table 1). On average, 71.8% of differentially expressed transcripts were cis-transcripts, while cis-proteins constituted only 40.7%. The remaining 28.2% of differentially expressed transcripts and 59.3% of proteins were thus encoded by euploid chromosomes (trans-) and have therefore been regulated by another mechanism than gene dosage. Interestingly, only 20.0% of these trans-proteins had a corresponding transcript change (|Log2FC| > 0.5), indicating the vast majority is regulated at protein level directly and not through modulation of mRNA levels. Importantly, none of these trans-transcripts nor trans-proteins were associated with local copy number variants: these were rare and excluded from this analysis. Also sequence variation between strain pairs is unlikely to have resulted in the observed proportions of differential trans-transcripts and trans-proteins. Only 11 trans-transcripts contained SNPs between our strains (6 missense, 1 synonymous and 4 upstream SNPs) and 2 proteins (1 upstream and 1 downstream SNP) (S1G and S1H Table). Therefore, our hypothesis is that aneuploidy induces direct changes to cis-transcripts and cis-proteins, which directly (e.g., protein-protein interaction) or indirectly drive abundance changes in trans-transcripts and mostly, trans-proteins.

Table 1. Total numbers (N) of differential (FDR adjusted p < 0.05) transcripts and proteins between all possible pairs of our 6 L.donovani strains.

Each time, it is indicated what proportion of these transcripts or proteins originate from chromosomes that have somy changes (because of aneuploidy) in the strain pair (Cis) or chromosomes that have identical somies between the two strains (Trans). Five comparisons are not shown because they contained less than 15 differential transcripts or proteins. These are shown in S1F Table.

Next, we looked deeper into the functions of trans-transcripts and proteins. After pooling the trans-transcripts and trans-proteins from all comparisons, we identified a set of 541 unique trans-transcripts and 203 unique trans-proteins (duplicates and hypothetical proteins removed, S1G and S1H Table). GO-overrepresentation analysis showed that no GO terms were significantly enriched in the pooled list of trans-transcripts or trans-proteins from all comparisons (FDR adjusted p threshold 0.05). However, it must be noted here that GO- and other annotations in Leishmania are often relatively unspecific due to the parasite’s distinct evolutionary position, and therefore low gene homology with better studied model organisms. For example, only 48.7% of the L. donovani genes could be annotated with at least one GO term, and even these are often of the highest hierarchies (low specificity)12. As an alternative enrichment analysis approach using the full genomic annotations directly, we first manually checked if any frequent keywords could be observed occurring in annotations of the trans-proteins and identified 10. We then compared if these keywords were significantly enriched versus all detected proteins with a Fisher Exact test. 7 of these were indeed significantly enriched of which 3 related to protein metabolism: peptidase activity (FDR adjusted p = 9.4 x 10−5), heat-shock protein (Adj. p = 2.3 x 10−2) and chaperon(in) (Adj. p = 2.1 x 10−2), 2 related to metabolism: glutamate (Adj. p = 1.9 x 10−2) and long-chain-fatty-acid-CoA ligase (Adj. p = 5.4 x 10−3), and two others: mitochondrial (Adj. p = 3.0 x 10−2) and hypothetical protein (Adj. p = 3.4 x 10−7). At RNA-level none of the keywords were significant. This demonstrates that differential trans-proteins associated with aneuploidy are not a random occurrence, but related to specific functional classes and processes. Particularly relevant in this context are the enrichment of heat-shock proteins and chaperones, as aneuploidy is known to burden protein folding pathways [51].

Aneuploidy globally affects transcriptome and proteome regardless of life stage

So far, this study discussed results from the logarithmic growth stage (LOG), which essentially consists of Leishmania procyclic promastigotes (replicating, non-infectious). However, despite the absence of transcriptional regulation of individual protein-coding genes, transcripts and proteins can undergo post-transcriptional regulation during parasite differentiation. This raised the question whether the correlation between chromosome somy, average transcript abundance, and average protein abundance is also present in other life stages. Hence, for three strains, we compared the LOG growth with the STAT growth phase. STAT is highly enriched for metacyclic promastigotes, which are non-replicating, have a distinct morphology, are metabolically different, and infectious to the human host [52]. This was validated with metacyclogenesis markers META1 and HASPB, which were significantly upregulated in STAT as they are expected to be (Adj. p < 10−13, S1H Table). Transcriptome and proteome measurements and analyses were carried out exactly as for the LOG phase.

For both LOG and STAT, a chromosome’s somy was proportional to its average transcript expression level, and protein expression level (Fig 5, p < 2 * 10−16 for transcript and protein in both LOG and STAT). A reduced effect of somy on protein abundance was apparent at the protein level, in both LOG and STAT growth phases. This suggests that the gene dosage effects associated with aneuploidy remain present throughout the parasite’s life cycle.

Fig 5. Impact of chromosome aneuploidy on average chromosome transcript (red) and protein (blue) output levels in LOG phase (A) and STAT phase (B).

Each dot represents 1 chromosome in one of the 3 studied aneuploid L. donovani strains (BPK275, BPK178, and BPK173).

Aneuploidy results in metabolic differences between L. donovani strains

We ultimately aimed to understand if aneuploidy in L. donovani is reflected in the parasite’s metabolome. Therefore, we carried out LC-MS metabolomics and confidently identified 78 metabolites across the six aneuploid study strains (S5 Table). In contrast to transcripts and proteins, there is no straightforward relation between a metabolite and a specific gene. Genes might affect metabolites directly (e.g. an enzyme), indirectly, affect multiple metabolites, or affect no metabolite at all. Therefore, we turned to an indirect strategy to integrate the metabolomic data with our previous observations.

We determined if differences in aneuploidy between strains in LOG correlate with their metabolic differences. As such, we calculated the pairwise genomic (somy), transcriptomic, proteomic, and metabolic Euclidean distance (based on all measured chromosomes, transcripts, proteins and metabolites) between each possible strain pair, and correlated these distances (Fig 6). The larger the distance between these 2 strains, the more different their respective genomes, transcriptomes, proteomes, or metabolomes are from one another. Consistent with our other analyses, we found strong correlations between the genomic and transcriptomic distance, between the transcriptomic and proteomic distance, and between the genomic and proteomic distance (p ≤ 0.001 for all comparisons). However, the metabolome initially did not show any correlation with genome, transcriptome, and proteome. We suspected this might be caused by the high intrinsic levels of random variation of a subset of metabolites (by either technical or biological causes). Therefore, in the second phase of the analysis, we filtered on metabolites that had at least once a significant fold change (adjusted p < 0.05) between any pair of two strains. This strategy selects only metabolites that are stable enough to reflect true differences in metabolite levels between strains. The resulting ‘differentiating metabolome’ consisting of 34 metabolites correlated strongly with all omic layers (p < 0.05, Fig 6). This demonstrates that the degree of karyotype difference between two strains correlates with their degree of metabolic differences. Specifically, the differentiating metabolome included 9 amino acids (alanine, aspartate, cysteine, leucine/isoleucine (not distinguishable), methionine, phenylalanine, tryptophan, tyrosine, and valine), suggesting the amino acid metabolism is an important molecular difference between these aneuploid parasites.

Fig 6. How differences between two L. donovani strains on 1 ‘omic layer correlate with their differences on another omic layer.

The plots show the pairwise correlations between the genome distance (aneuploidy), transcriptome distance, proteome distance, metabolome distance (all 78 metabolites), and differentiating metabolome distance (34 metabolites), each time calculated between 2 L. donovani strains. Each dot represents 1 comparison between two strains. With 6 strains in our experiment, there were 15 possible pairwise comparisons. The distance metric that was used is the Euclidean distance. Somy values were used directly, but transcript abundance, protein abundance or metabolite abundance were first Z-scored before calculating the distance. The top-right panels show the Pearson correlation coefficients with * = Pval <0.05, ** Pval <0.01, *** = Pval <0.001, while the bottom-left panels show the linear regressions.


Leishmania displays a remarkably high tolerance for aneuploidy, with any chromosome having the capacity to become polysomic and produce a viable parasite. The high karyotype dynamicity and the drastic karyotype changes between experimental conditions stress the central importance of aneuploidy for the parasite to adapt to novel environments [11]. However, it remained unclear how this large genomic unstability affects the parasite as a system. Moreover, Leishmania’s distinct phylogenetic status and unique gene expression system disallows extrapolation from model systems. Therefore, we investigated the systemic impact of aneuploidy in L. donovani by integrating, for the first time, genome, transcriptome, proteome, and metabolome profiling data of highly aneuploid strains. Our study offers a comprehensive picture of the molecular changes associated with aneuploidy in this unique Eukaryotic model that lacks transcriptional regulation of individual protein-coding genes.

We confirmed that transcript abundance is affected across the entirety of aneuploid chromosomes. This observation is in line with previous work that has also found a strong link between aneuploidy-induced gene dosage changes and transcript abundance in Leishmania [12,17,53]. We further showed that this relation is almost equivalent, with transcription levels closely mirroring chromosome copy number changes (transcript abundance = 0.94x the somy). It is important to note here that to establish this 0.94x equivalence, we compared the same chromosome to its own disomic state as the reference transcript expression level at disomy. This does not exclude that chromosomes (or, on a smaller scale, cistrons) can have inherently different transcription levels. Indeed, it has previously been shown that chromosome 31 has the expression of a disomic chromosome in its default tetratomic state17. The impact of aneuploidy on the L. donovani proteome was also linear and chromosome-wide, but not equivalent. For every chromosome copy, the protein abundance levels of that chromosome increased (on average) by a factor 0.76x the somy.

This reduced impact of gene dosage on the protein level was due to the attenuation of a subset of proteins. While many proteins did follow the aneuploidy-induced gene dosage changes, a group of proteins showed a reduced dosage response. This reduction ranged from partial to complete attenuation to disomic levels. We made two observations concerning the identity of attenuated proteins that might explain this attenuation. Firstly, attenuated proteins were enriched for subunits of macromolecular protein complexes (15.6% versus 1.8% in the background set). This observation matches several observations in aneuploid yeast and human (including cancer) cell lines [5459] that also detected protein complex subunits to be highly prevalent amongst attenuated proteins. It can be explained by the fact that individual subunits of macromolecular complexes are typically unstable unless assembled into a stable complex [57]. As subunits of a single complex are often encoded on different chromosomes and produced in balanced, stoichiometric amounts, an increase in an individual subunit (here: by aneuploidy) will often not lead to more complex formation [55,57,60]. Instead, the protein will not have an available binding partner(s), remains unstable, and rapidly degrades. It is possible that this stochiometric imbalance caused by aneuploidy is to some extent detrimental to the parasite. One argument in this direction is that we and previous studies found smaller Leishmania chromosomes (with smaller numbers of genes) to be more frequently aneuploid, which matches observation in yeast and humans (chromosome 21 has the smallest number of genes in humans and the least severe autosomal aneuploid) [11,24,61]. Additionally, it has been shown in yeast that genes that are deleterious when overexpressed are enriched for protein complex subunits [62]. From this perspective, the situation in Leishmania could be similar to other single-celled organisms such as yeast, where the fitness benefits of aneuploidy must outweigh the fitness costs in a given environment [62].

Secondly, we observed a correlation between the degree of attenuation and the cellular destination of the attenuated protein. Two secreted proteins associated with the Golgi apparatus were the most attenuated (i.e. their expression was close to disomic levels), which could be explained by the fact that secreted proteins do not accumulate in the cell. Correspondingly, attenuation of secreted proteins was also observed in a recent study of human fibroblasts with trisomy 21 [57]. Proteins from mitochondria, the nucleus, and the peroxisome were also significantly more attenuated than cytoplasmatic proteins, which had the lowest level of attenuation (i.e., cytoplasmic proteins followed gene dosage the best). A potential explanation might be that mitochondrial, nucleic, and peroxisome proteins require active transport into their destined organelle. This transport requires extensive interaction with other proteins, potentially explaining why their abundance scales less with increased protein production. In aneuploid (disomic II) strains of S. cerevisiae, increased translation was found of genes related to protein trafficking pathways of the endomembrane system and nucleus, also suggesting dependency on this system [63].

Partial chromosomal amplification can occur in Leishmania. We have previously observed this phenomenon for chromosome Ld23 [45,46] and in the present study for Ld35. Interestingly, gene dosage proportionally affected transcript and protein abundance for these subchromosomal amplification as well. This suggests that our findings can be extrapolated to segmental aneuploidy as well.

Further, we demonstrated that these gene dosage changes remain important throughout at least several life stages of the parasite. Indeed, we observed an similar impact of gene dosage on transcriptome and proteome in either LOG (enriched for procyclic promastigotes, proliferative stage) or STAT (enriched for metacyclic promastigotes, infectious stage) growth phases. This is striking, as the parasite featuring extensive transcriptome and proteome changes during its life cycle. For example, transcriptomics studies have shown that up to 30–50% of transcripts are differentially expressed between procyclic and metacyclic promastigotes [16,64]. Similarly, 10–40% of proteins are differentially expressed between the parasite’s promastigote and amastigote life stages [12,65].

A limited proportion of differential transcripts between aneuploid strains (23.6%) and the majority of differential proteins (56.9%) originated from chromosomes without a change in somy, so-called trans-transcripts, and trans-proteins. As we excluded genes with CNVS between strains from our analysis and SNPs and INDELs were very rare, we hypothesize that the observed trans-effects could result from aneuploidy, although our study does not allow to test for a causal relationship. High abundances of trans-transcripts have been reported in many other aneuploidy studies, including on Drosophila and Arabidopsis, as well as in humans with sex chromosome aneuploidies, Turner or Klinefelter syndrome [4750]. These trans-transcriptomes were typically enriched for transcription factors, genes related to metabolic processes, protein metabolism, or cellular stress responses. It supports the idea that the cellular system mitigates the primary impact of aneuploidy by inducing transcriptional changes to trans-genes. Here, in Leishmania, we found relatively low proportions of differential trans-transcripts and no significant enrichment of specific functional classes amongst these. While Leishmania lacks transcriptional regulation of individual protein-coding genes, a more extensive regulation would have been a theoretical possibility, as post-transcriptional regulation of individual transcripts (eg. by alba-domain proteins) was observed in the context of parasite development [66]. Instead, we did find a large proportion of trans-proteins. It is possible that the abundances of these trans-proteins are modulated at the translation or protein-stability level directly, as the majority did not have underlying transcript-level changes. Interestingly, these trans-proteins were significantly enriched for specific protein groups, including protein metabolism-related chaperones and chaperonins, peptidases, and heat-shock proteins. This is relevant as aneuploidy is known to put an extra burden on the protein-folding system [51].

Finally, we showed that the degree of aneuploidy differences between strains matched the degree of transcriptome, proteome, and differential metabolome changes. This is an important observation as it suggests that aneuploidy in Leishmania has the capacity to drive metabolic variation, which is closely linked to the phenotype. It also matches our previous observations where we linked L. donovani gene copy number variants directly to changes in metabolic pathways [32]. Ultimately, it supports the view that aneuploidy can be adaptive and drive the metabolic changes that the parasite needs to survive in a novel environment.


In summary, our study shows that aneuploidy in Leishmania globally and proportionally impacts the transcriptome and proteome of affected chromosomes throughout the parasite’s life cycle in vitro. In Leishmania strains with a closely related genetic background, the degree of these aneuploidy-induced dose changes ultimately correlates with the degree of metabolomic difference. This further supports the view that aneuploidy in Leishmania can be adaptive. As in other Eukaryotes, we observed reduced dosage effects at protein level for protein-complex subunits and secreted proteins. Additionally, we found non-cytoplasmic proteins to respond less to dosage changes than cytoplasmic ones. Differentially expressed transcripts and proteins between aneuploid Leishmania strains also originated from non-aneuploid chromosomes. At protein level, these were enriched for proteins involved in protein-metabolism, including chaperones and chaperonins, peptidases, and heat-shock proteins. We believe that the high karyotype diversity in vitro and absence of classical transcriptional regulation make Leishmania an attractive model to study processes of protein homeostasis in the context of aneuploidy and beyond.

Supporting information

S1 Table.

S1A Table. Detailed description of the L. donovani isolates used in this study. S1B Table. Number of total SNPs, non-synonmous SNPs and homozygous non-synonymous SNPs between any pair of L. donovani strains in this study. S1C Table. Sequencing quality and mapping statistics of all 6 sequenced L. donovani genomes in this study. S1D Table. Sequencing quality and mapping statistics of all sequenced L. donovani transcriptomes in this study. S1E Table. Protein attenuation model data, coefficients and p-values. S1F Table. Summary of all cis- and trans- transcripts and proteins of all possible pairwise comparisons between the 6 L. donovani strains in this study. S1G Table. Pooled trans-transcripts from all possible pairwise comparisons between the 6 L. donovani strains in this study. S1H Table. Pooled trans-proteins from all possible pairwise comparisons between the 6 L. donovani strains in this study. S1I Table. Upregulated metacyclogenis markers in STAT growth phase. Output generated by DESeq2.


S2 Table. Structure (Somy and local CNVs) of all sequenced L. donovani genomes in this study.


S3 Table. Differential RNA abundance analyses of the different biological comparisons made in this study.


S4 Table. Detailed statistics about all proteins detected in this study.

Output generated by MaxQuant.


S5 Table. Detailed statistics about all metabolites detected in this study.


S1 Fig. S1A Fig and S1B Fig.

Comparative Circos plot between BPK173 and BPK288 (S1A) and between BHU575 and BPK282 (S1B), showing their relative (expressed in fold change) gene dosage (blue), transcript abundance (orange), and protein abundance (purple) across the 36 L. donovani chromosomes.



We thank the Center of Medical Genetics at the University of Antwerp for hosting the NGS facility on which our RNA-Seq experiments were performed. The computational resources used for this work were provided by the VSC (Flemish Supercomputer Center) at the University of Antwerp.


  1. 1. Zhu J, Tsai H-J, Gordon MR, Li R. Cellular Stress Associated with Aneuploidy. Developmental cell. 2018;44(4):420–31. pmid:29486194.
  2. 2. Birchler JA, Veitia RA. One Hundred Years of Gene Balance: How Stoichiometric Issues Affect Gene Expression, Genome Evolution, and Quantitative Traits. Cytogenet Genome Res. 2021;161(10–11):529–50. Epub 20211123. pmid:34814143.
  3. 3. Siegel JJ, Amon A. New Insights into the Troubles of Aneuploidy. Annual review of cell and developmental biology. 2012;28:189–214. pmid:22804579
  4. 4. Chen G, Bradford WD, Seidel CW, Li R. Hsp90 stress potentiates rapid cellular adaptation through induction of aneuploidy. Nature. 2012;482:246. pmid:22286062
  5. 5. Selmecki A, Forche A, Berman J. Genomic plasticity of the human fungal pathogen Candida albicans. Eukaryotic cell. 2010;9(7):991–1008. Epub 2010/05/25. pmid:20495058
  6. 6. Gresham D, Desai MM, Tucker CM, Jenq HT, Pai DA, Ward A, et al. The repertoire and dynamics of evolutionary adaptations to controlled nutrient-limited environments in yeast. PLoS genetics. 2008;4(12):e1000303. Epub 2008/12/17. pmid:19079573
  7. 7. Gerstein AC, Fu MS, Mukaremera L, Li Z, Ormerod KL, Fraser JA, et al. Polyploid titan cells produce haploid and aneuploid progeny to promote stress adaptation. MBio. 2015;6(5):e01340–15. Epub 2015/10/16. pmid:26463162
  8. 8. Seton-Rogers S. Fitness penalties of aneuploidy. Nature Reviews Cancer. 2017;17(3):142–3. pmid:28184042
  9. 9. Rancati G, Pavelka N, Fleharty B, Noll A, Trimble R, Walton K, et al. Aneuploidy underlies rapid adaptive evolution of yeast cells deprived of a conserved cytokinesis motor. Cell. 2008;135(5):879–93. Epub 2008/12/02. pmid:19041751
  10. 10. Sunshine AB, Payen C, Ong GT, Liachko I, Tan KM, Dunham MJ. The Fitness Consequences of Aneuploidy Are Driven by Condition-Dependent Gene Effects. PLOS Biology. 2015;13(5):e1002155. pmid:26011532
  11. 11. Imamura H, Downing T, Van den Broeck F, Sanders MJ, Rijal S, Sundar S, et al. Evolutionary genomics of epidemic visceral leishmaniasis in the Indian subcontinent. Elife. 2016;5:e12613. pmid:27003289
  12. 12. Dumetz F, Imamura H, Sanders M, Seblova V, Myskova J, Pescher P, et al. Modulation of Aneuploidy in Leishmania donovani during Adaptation to Different In Vitro and In Vivo Environments and Its Impact on Gene Expression. mBio. 2017;8(3):e00599–17. pmid:28536289
  13. 13. Franssen SU, Durrant C, Stark O, Moser B, Downing T, Imamura H, et al. Global genome diversity of the Leishmania donovani complex. eLife. 2020;9:e51243. pmid:32209228
  14. 14. Sterkers Y, Lachaud L, Crobu L, Bastien P, Pagès M. FISH analysis reveals aneuploidy and continual generation of chromosomal mosaicism in Leishmania major. Cell Microbiol. 2011;13(2):274–83. Epub 20101021. pmid:20964798.
  15. 15. Ivens AC, Peacock CS, Worthey EA, Murphy L, Aggarwal G, Berriman M, et al. The genome of the kinetoplastid parasite, Leishmania major. Science (New York, NY). 2005;309(5733):436–42. Epub 2005/07/16. pmid:16020728
  16. 16. Cuypers B, Domagalska MA, Meysman P, Muylder Gd, Vanaerschot M, Imamura H, et al. Multiplexed Spliced-Leader Sequencing: A high-throughput, selective method for RNA-seq in Trypanosomatids. Scientific Reports. 2017;7(1):3725. pmid:28623350
  17. 17. Prieto Barja P, Pescher P, Bussotti G, Dumetz F, Imamura H, Kedra D, et al. Haplotype selection as an adaptive mechanism in the protozoan pathogen Leishmania donovani. Nature ecology & evolution. 2017. Epub 2017/11/08. pmid:29109466.
  18. 18. WHO. Proposals for the nomenclature of salivarian trypanosomes and for the maintenance of reference collections*. Bulletin of the World Health Organization. 1978;56(3):467–80. pmid:308413
  19. 19. Andrews S. FastQC: a quality control tool for high throughput sequence data. 2010.
  20. 20. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics (Oxford, England). 2014;30(15):2114–20. Epub 2014/04/04. pmid:24695404
  21. 21. Dumetz F, Imamura H, Sanders M, Seblova V, Myskova J, Pescher P, et al. Modulation of Aneuploidy in Leishmania donovani during Adaptation to Different In Vitro and In Vivo Environments and Its Impact on Gene Expression. MBio. 2017;8(3). Epub 2017/05/26. pmid:28536289.
  22. 22. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint arXiv:13033997. 2013.
  23. 23. Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics (Oxford, England). 2011;27(21):2987–93. Epub 2011/09/10. pmid:21903627
  24. 24. Downing T, Imamura H, Decuypere S, Clark TG, Coombs GH, Cotton JA, et al. Whole genome sequencing of multiple Leishmania donovani clinical isolates provides insights into population structure and mechanisms of drug resistance. Genome research. 2011;21(12):2143–56. Epub 2011/11/01. pmid:22038251
  25. 25. Anders S, Pyl PT, Huber W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics (Oxford, England). 2015;31(2):166–9. Epub 2014/09/28. pmid:25260700
  26. 26. Quinn TP, Erb I, Richardson MF, Crowley TM. Understanding sequencing data as compositions: an outlook and review. Bioinformatics (Oxford, England). 2018;34(16):2870–8. pmid:29608657
  27. 27. Tyanova S, Temu T, Cox J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nature Protocols. 2016;11(12):2301–19. pmid:27809316
  28. 28. Berg M, García-Hernández R, Cuypers B, Vanaerschot M, Manzano JI, Poveda JA, et al. Experimental resistance to drug combinations in Leishmania donovani: metabolic and phenotypic adaptations. Antimicrobial agents and chemotherapy. 2015;59(4):2242–55. pmid:25645828
  29. 29. Berg M, Vanaerschot M, Jankevics A, Cuypers B, Maes I, Mukherjee S, et al. Metabolic adaptations of Leishmania donovani in relation to differentiation, drug resistance, and drug pressure. Mol Microbiol. 2013;90(2):428–42. Epub 2013/09/12. pmid:24020363.
  30. 30. Benton HP, Want EJ, Ebbels TM. Correction of mass calibration gaps in liquid chromatography-mass spectrometry metabolomics data. Bioinformatics (Oxford, England). 2010;26(19):2488–9. Epub 2010/07/31. pmid:20671148.
  31. 31. Scheltema RA, Jankevics A, Jansen RC, Swertz MA, Breitling R. PeakML/mzMatch: a file format, Java library, R library, and tool-chain for mass spectrometry data analysis. Analytical chemistry. 2011;83(7):2786–93. Epub 2011/03/16. pmid:21401061.
  32. 32. Cuypers B, Berg M, Imamura H, Dumetz F, De Muylder G, Domagalska MA, et al. Integrated genomic and metabolomic profiling of ISC1, an emerging Leishmania donovani population in the Indian subcontinent. Infection, Genetics and Evolution. 2018;62:170–8. pmid:29679745
  33. 33. Prince JT, Marcotte EM. Chromatographic alignment of ESI-LC-MS proteomics data sets by ordered bijective interpolated warping. Anal Chem. 2006;78(17):6140–52. Epub 2006/09/02. pmid:16944896.
  34. 34. Tautenhahn R, Böttcher C, Neumann S. Highly sensitive feature detection for high resolution LC/MS. BMC Bioinformatics. 2008;9(1):504. pmid:19040729
  35. 35. Doyle MA, MacRae JI, De Souza DP, Saunders EC, McConville MJ, Likić VA. LeishCyc: a biochemical pathways database for Leishmania major. BMC Syst Biol. 2009;3:57-. pmid:19497128.
  36. 36. Fahy E, Subramaniam S, Murphy RC, Nishijima M, Raetz CRH, Shimizu T, et al. Update of the LIPID MAPS comprehensive classification system for lipids. J Lipid Res. 2009;50 Suppl(Suppl):S9–S14. Epub 2008/12/19. pmid:19098281.
  37. 37. Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic acids research. 2017;45(D1):D353–d61. Epub 2016/12/03. pmid:27899662
  38. 38. Wishart DS, Feunang YD, Marcu A, Guo AC, Liang K, Vázquez-Fresno R, et al. HMDB 4.0: the human metabolome database for 2018. Nucleic acids research. 2018;46(D1):D608–d17. Epub 2017/11/16. pmid:29140435
  39. 39. Tihon E, Imamura H, Van den Broeck F, Vermeiren L, Dujardin J-C, Van Den Abbeele J. Genomic analysis of Isometamidium Chloride resistance in Trypanosoma congolense. International Journal for Parasitology: Drugs and Drug Resistance. 2017;7(3):350–61.
  40. 40. Wickham H. ggplot2: Elegant Graphics for Data Analysis: Springer-Verlag New York; 2016.
  41. 41. Krzywinski MI, Schein JE, Birol I, Connors J, Gascoyne R, Horsman D, et al. Circos: An information aesthetic for comparative genomics. Genome research. 2009. pmid:19541911
  42. 42. Hunter JD. Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering. 2007;9(3):90–5.
  43. 43. Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature Methods. 2020;17(3):261–72. pmid:32015543
  44. 44. Imamura H, Downing T, Van den Broeck F, Sanders MJ, Rijal S, Sundar S, et al. Evolutionary genomics of epidemic visceral leishmaniasis in the Indian subcontinent. Elife. 2016;5. Epub 2016/03/24. pmid:27003289
  45. 45. Dumetz F, Cuypers B, Imamura H, Zander D, D’Haenens E, Maes I, et al. Molecular Preadaptation to Antimony Resistance in Leishmania donovani on the Indian Subcontinent. mSphere. 2018;3(2):e00548–17. pmid:29669889
  46. 46. Laffitte M-CN, Genois M-M, Mukherjee A, Légaré D, Masson J-Y, Ouellette M. Formation of Linear Amplicons with Inverted Duplications in Leishmania Requires the MRE11 Nuclease. PLoS genetics. 2014;10(12):e1004805. pmid:25474106
  47. 47. Hou J, Shi X, Chen C, Islam MS, Johnson AF, Kanno T, et al. Global impacts of chromosomal imbalance on gene expression in Arabidopsis and other taxa. Proc Natl Acad Sci U S A. 2018;115(48):E11321–e30. Epub 20181114. pmid:30429332
  48. 48. Raznahan A, Parikshak NN, Chandran V, Blumenthal JD, Clasen LS, Alexander-Bloch AF, et al. Sex-chromosome dosage effects on gene expression in humans. Proceedings of the National Academy of Sciences. 2018;115(28):7398–403. pmid:29946024
  49. 49. Sun L, Johnson AF, Donohue RC, Li J, Cheng J, Birchler JA. Dosage compensation and inverse effects in triple X metafemales of Drosophila. Proceedings of the National Academy of Sciences. 2013;110(18):7383–8. pmid:23589863
  50. 50. Zhang X, Hong D, Ma S, Ward T, Ho M, Pattni R, et al. Integrated functional genomic analyses of Klinefelter and Turner syndromes reveal global network effects of altered X chromosome dosage. Proceedings of the National Academy of Sciences. 2020;117(9):4864–73. pmid:32071206
  51. 51. Sheltzer JM, Torres EM, Dunham MJ, Amon A. Transcriptional consequences of aneuploidy. Proc Natl Acad Sci U S A. 2012;109(31):12644–9. Epub 2012/07/16. pmid:22802626.
  52. 52. Berg M, Vanaerschot M, Jankevics A, Cuypers B, Maes I, Mukherjee S, et al. Metabolic adaptations of Leishmania donovani in relation to differentiation, drug resistance, and drug pressure. Molecular Microbiology. 2013;90(2):428–42. pmid:24020363
  53. 53. Iantorno SA, Durrant C, Khan A, Sanders MJ, Beverley SM, Warren WC, et al. Gene Expression in Leishmania Is Regulated Predominantly by Gene Dosage. mBio. 2017;8(5). pmid:28900023
  54. 54. Dephoure N, Hwang S, O’Sullivan C, Dodgson SE, Gygi SP, Amon A, et al. Quantitative proteomic analysis reveals posttranslational responses to aneuploidy in yeast. eLife. 2014;3:e03023. pmid:25073701
  55. 55. Torres EM, Sokolsky T, Tucker CM, Chan LY, Boselli M, Dunham MJ, et al. Effects of Aneuploidy on Cellular Physiology and Cell Division in Haploid Yeast. Science. 2007;317(5840):916–24. pmid:17702937
  56. 56. Stingele S, Stoehr G, Peplowska K, Cox J, Mann M, Storchova Z. Global analysis of genome, transcriptome and proteome reveals the response to aneuploidy in human cells. Mol Syst Biol. 2012;8:608. Epub 2012/09/13. pmid:22968442
  57. 57. Hwang S, Cavaliere P, Li R, Zhu LJ, Dephoure N, Torres EM. Consequences of aneuploidy in human fibroblasts with trisomy 21. Proceedings of the National Academy of Sciences. 2021;118(6):e2014723118. pmid:33526671
  58. 58. McShane E, Sin C, Zauber H, Wells JN, Donnelly N, Wang X, et al. Kinetic Analysis of Protein Stability Reveals Age-Dependent Degradation. Cell. 2016;167(3):803–15.e21. Epub 20161006. pmid:27720452.
  59. 59. Schukken KM, Sheltzer JM. Extensive protein dosage compensation in aneuploid human cancers. bioRxiv. 2021:2021.06.18.449005.
  60. 60. Veitia RA, Bottani S, Birchler JA. Cellular reactions to gene dosage imbalance: genomic, transcriptomic and proteomic effects. Trends in Genetics. 2008;24(8):390–7. pmid:18585818
  61. 61. Gilchrist C, Stelkens R. Aneuploidy in yeast: Segregation error or adaptation mechanism? Yeast. 2019;36(9):525–39. pmid:31199875.
  62. 62. Robinson D, Place M, Hose J, Jochem A, Gasch AP. Natural variation in the consequences of gene overexpression and its implications for evolutionary trajectories. eLife. 2021;10:e70564. pmid:34338637
  63. 63. Larrimore KE, Barattin-Voynova NS, Reid DW, Ng DTW. Aneuploidy-induced proteotoxic stress can be effectively tolerated without dosage compensation, genetic mutations, or stress responses. BMC Biology. 2020;18(1):117. pmid:32900371
  64. 64. Dillon LAL, Okrah K, Hughitt VK, Suresh R, Li Y, Fernandes MC, et al. Transcriptomic profiling of gene expression and RNA processing during Leishmania major differentiation. Nucleic acids research. 2015;43(14):6799–813. Epub 2015/07/06. pmid:26150419.
  65. 65. Lynn MA, Marr AK, McMaster WR. Differential quantitative proteomic profiling of Leishmania infantum and Leishmania mexicana density gradient separated membranous fractions. J Proteomics. 2013;82:179–92. Epub 2013/03/08. pmid:23466312.
  66. 66. Dupé A, Dumas C, Papadopoulou B. An Alba-domain protein contributes to the stage-regulated stability of amastin transcripts in Leishmania. Mol Microbiol. 2014;91(3):548–61. Epub 20131219. pmid:24620725.
  67. 67. Haug K, Cochrane K, Nainala VC, Williams M, Chang J, Jayaseelan KV, et al. MetaboLights: a resource evolving in response to the needs of its scientific community. Nucleic acids research. 2020;48(D1):D440–d4. pmid:31691833