Understanding the Pathogenicity of Burkholderia contaminans, an Emerging Pathogen in Cystic Fibrosis

Several bacterial species from the Burkholderia cepacia complex (Bcc) are feared opportunistic pathogens that lead to debilitating lung infections with a high risk of developing fatal septicemia in cystic fibrosis (CF) patients. However, the pathogenic potential of other Bcc species is yet unknown. To elucidate clinical relevance of Burkholderia contaminans, a species frequently isolated from CF respiratory samples in Ibero-American countries, we aimed to identify its key virulence factors possibly linked with an unfavorable clinical outcome. We performed a genome-wide comparative analysis of two isolates of B. contaminans ST872 from sputum and blood culture of a female CF patient in Argentina. RNA-seq data showed significant changes in expression for quorum sensing-regulated virulence factors and motility and chemotaxis. Furthermore, we detected expression changes in a recently described low-oxygen-activated (lxa) locus which encodes stress-related proteins, and for two clusters responsible for the biosynthesis of antifungal and hemolytic compounds pyrrolnitrin and occidiofungin. Based on phenotypic assays that confirmed changes in motility and in proteolytic, hemolytic and antifungal activities, we were able to distinguish two phenotypes of B. contaminans that coexisted in the host and entered her bloodstream. Whole genome sequencing revealed that the sputum and bloodstream isolates (each representing a distinct phenotype) differed by over 1,400 mutations as a result of a mismatch repair-deficient hypermutable state of the sputum isolate. The inferred lack of purifying selection against nonsynonymous mutations and the high rate of pseudogenization in the derived isolate indicated limited evolutionary pressure during evolution in the nutrient-rich, stable CF sputum environment. The present study is the first to examine the genomic and transcriptomic differences between longitudinal isolates of B. contaminans. Detected activity of a number of putative virulence factors implies a genuine pathogenic nature of this novel Bcc species.


Introduction
Bacteria from the Burkholderia cepacia complex (Bcc) cause serious respiratory infections in cystic fibrosis (CF) patients and thus significantly contribute to the reduction of life expectancy in CF [1]. Although the Bcc currently comprises 20 different bacterial species, only three of them have been under the spotlight of CF researchers until recently: Burkholderia cenocepacia (former genomovar III), which was recognized as a dominant Bcc species with arguably the highest potential for inter-patient transmission (evident from past large epidemic outbreaks caused by B. cenocepacia strains ET12, PHDC or ST32 [2]); Burkholderia multivorans (former genomovar II), which within the last 15 years replaced B. cenocepacia as the primary species of Bcc infection in many countries [3]; and Burkholderia dolosa (former genomovar VI), which is a far less common Bcc species than the former two but is equally clinically important [4]. The most worrisome feature of the three Bcc species in the CF care is the possibility of developing a fatal septic condition called cepacia syndrome [4][5][6].
Burkholderia contaminans was described taxonomically in 2009 as one of the latest Bcc members [7]. Its history related to the CF lung disease started in the early 2000s, when the species was recovered from respiratory samples from Portuguese and Argentinean patients [8,9]. At that time, the bacteria were incorrectly or indefinitely classified, sometimes as taxon K, based on the restriction fragment length polymorphism (RFLP) pattern of the recA gene. Later, a study aiming to unravel the controversial identification of the B. contaminans metagenome in sea water (designated as Burkholderia SAR-1 at that time) indicated that B. contaminans was the species epidemiologically linked to the contamination of medical devices and products [10]. Today, B. contaminans is reported to be the most frequent Bcc species in Spanish [11] and Argentinean CF patients [12].
B. contaminans infections in CF are often of transient nature [13] and thus can be perceived as respiratory tract colonization rather than true infection. However, some patients develop chronic infection with long-term culture positivity [12,13]. Furthermore, some of the cases can flare up and result in fatalities. Based on this unfavorable clinical manifestation, B. contaminans is likely an emerging CF pathogen. The circumstances leading to the establishment of true B. contaminans infections are unknown.
Recently, we performed a thorough transcriptomic analysis on B. cenocepacia isolates that were recovered from the bloodstream of CF patients with end-stage disease to identify the bacterial factors possibly associated with cepacia syndrome [14]. In this study, we applied an analogous approach. To gain insights into B. contaminans evolution during chronic infection, we compared transcriptomes and genomes of B. contaminans isolates recovered from different stages of CF infection from a single patient.

Results/Discussion
We analyzed seven B. contaminans isolates that had been recovered on four separate occasions over the course of 6-year long chronic infection ( Table 1). The patient died at the age of 10 upon the developing septic state with B. contaminans detected in her blood culture. All isolates were assigned to the multilocus sequence type (ST) 872. ST872 shares the recA gene-ST-71 allele with the predominating B. contaminans population in Argentina [12].

Transcriptomic differences between ST872 isolates
Initially, we performed comparative transcriptomics between isolate 467_S from sputum and isolate MF16_B from blood. Due to their different sites of origin and dates of collection, we expected to identify changes in gene expression that would be linked to the increasing severity of the infection. The experimental layout included three growth media (CF sputum, CF serum and control medium) and three replicates per isolate and condition. Among the 7,109 proteincoding genes, we observed a greater than 3-fold change in expression in more than 18% of the genes (1,308) under at least one of the three cultivation conditions assessed (for a complete set of genes, see S1, S2, S3 and S4 Tables, Table 2). Altered expression was detected for genes with a well established function in Burkholderia virulence as well as for more recently discovered genetic clusters with only a putative role in the Bcc pathogenicity.
Quorum sensing (QS) systems. QS is known to regulate a multitude of functions that are involved in Bcc virulence in various laboratory models (reviewed in [16]). The genes encoding signal molecule synthases as well as virtually all QS-regulated virulence-related functions were expressed at higher levels in MF16_B compared with 467_S. They included the extracellular metalloproteases ZmpA and ZmpB [17,18], siderophore ornibactin [19], lectins BclA and BclC [20][21][22], the nematocidal protein AidA [23], Flp-type pili and type III (T3SS) and type VI secretion systems [24] (S2 Table). However, the extent of expression differences varied considerably among particular genes and among cultivation conditions, with no obvious trend between the expression levels of QS signaling molecule synthases and the transcription of QSregulated genes.
Bcc QS systems represent an intricate network of pleiotropic, mutually agonistic or antagonistic and often redundant regulators [16,25], which together leads to a complex output response. A study that examined changes in AHL production by a large Bcc collection during the course of the CF infection found QS signaling to be functionally preserved [26]. Comparison of the gene expression between two B. cenocepacia isolates from different stages of the CF pulmonary infection revealed both upregulation (ornibactin) and downregulation (secretion systems, extracellular proteases) of QS-dependent functions [27]. Our recent study, similar in design, but involving a bloodstream B. cenocepacia isolate, demonstrated a significant increase in expression of the entire T3SS cluster [14], a finding consistent with T3SS expression in MF16_B.
Motility and chemotaxis. Over 70 genes encoding proteins with a motility-related function and spanning all previously described motility clusters [24] exhibited increased expression in the blood isolate MF16_B (S3 Table). This was in marked contrast with our previous observation on B. cenocepacia strain ST32 where all tested blood isolates (obtained at the time of cepacia syndrome) showed a uniform loss of motility or rapid decrease in motility as measured by swimming activity of the isolates [14]. Another study on a wide range of clinical isolates that belonged to various Bcc species found no apparent association between changes in motility and the clinical outcome of the Bcc infection [28].
Biosynthesis of antifungal compounds. Two gene clusters responsible for the biosynthesis of antifungal compounds showed significantly higher expression in MF16_B than in 467_S (S4 Table). One of them, pyrrolnitrin, which is produced by some Bcc and several other Gramnegative bacteria, inhibits growth of a wide range of fungi and of some Gram-positive bacteria   [29]. Interestingly, a study by Schmidt et al. [30] noted the absence of the pyrrolnitrin biosynthetic operon in all of their examined strains of B. contaminans. However, in our studied B. contaminans strain ST872, all four genes of the pyrrolnitrin biosynthetic pathway were present, and these genes were also found in every other B. contaminans sequenced up to now. Pyrrolnitrin production depends entirely on QS [30]; the detected expression changes are thus in accordance with those observed for other QS-regulated functions whose expression was upregulated in the bloodstream isolate MF16_B. Another antifungal compound, occidiofungin, was first described in B. contaminans MS14 [31], displaying its activity against a wide range of fungi [32]. The biosynthetic cluster, consisting of 18 genes [33], is cis-regulated by two regulators [34]. This cluster is absent in the genomes of some Bcc species, including B. cenocepacia and B. multivorans [35]. The occidiofungin biosynthetic cluster, including genes encoding both regulators (AmbR1 and AmbR2), showed increased expression in MF16_B, most markedly in sputum medium (S4 Table). Similar to pyrrolnitrin, occidiofungin production was found to be positively regulated by QS [36].
lxa locus. In 2013, Sass et al. described a gene cluster in B. cenocepacia strain J2315 (BCAM0275A -BCAM0323) whose expression was induced specifically under low oxygen concentration (low-oxygen-activated or lxa locus) [37]. The locus exhibits an irregular pattern of presence in Bcc bacteria, e.g., being absent in some strains of B. cenocepacia or B. multivorans while being present in the others [37]. Notably, the lxa locus (as delineated in B. cenocepacia J2315) can be found in genomes of all sequenced strains of B. contaminans, either as a complete set of genes (i.e., B. contaminans FFH2055 or B. contaminans MS14; see Table 2) or truncated (missing approximately one quarter of genes from the 3' end of the locus, i.e., B. contaminans LMG 23361), whereas it is absent in Burkholderia lata strain 383, the closest relative of B. contaminans [37]. We detected a rapid increase in expression of the entire cluster in MF16_B vs. 467_S. The expression was most pronounced when the bacteria were cultivated in serum and a control medium ( Table 2). Importantly, all three genes that were predicted to regulate lxa expression (BCAM0049, BCAM0287 and BCAM1483) [37] also displayed increased expression in MF16_B. Moreover, several genes neighboring the lxa cluster (as delineated in B. cenocepacia J2315 annotation) exhibited a similar pattern of differential expression. These genes were homologous to lxa-adjacent genes in the annotated genome of B. multivorans ATCC17616 ( Table 2), suggesting that additional genes form extended, strain-specific lxa clusters. Although low-oxygen level was the only condition capable of inducing lxa transcription in the study of Sass et al. [37], we detected expression of the entire locus under normal aerobic conditions in B. contaminans. We believe that there is another eminent functional significance of this enigmatic part of multiple Bcc genomes.

Phenotypic differences between B. contaminans ST872 isolates
The transcriptomic differences between B. contaminans isolates 467_S and MF16_B allowed predicting changes in their phenotypic features. To verify the transcriptomic results, we assessed four phenotypes: swimming motility (a proxy of flagellar gene activity), proteolysis (zmpA and zmpB genes), antifungal activity (pyrrolnitrin and occidiofungin biosynthetic genes) and hemolysis. We assumed that occidiofungin had hemolytic properties because molecules synthesized by pathways highly homologous to that of occidiofungin were tested hemolytic in B. vietnamiensis [35] and B. ambifaria [38].
The phenotypic differences that had been revealed in agar plate assays were in agreement with changes observed at the transcriptional level (Fig 1): isolate 467_S was nonmotile and did not show any proteolytic or hemolytic activity after 48 hours of incubation; isolate MF16_B exhibited high motility, proteolysis and hemolysis. Antagonism against the fungus Geotrichum candidum was very pronounced in MF16_B, not in 467_S. Although 467_S exhibited slower growth on all agar plates, the differences were equally apparent after prolonged incubation, which gave rise to colonies of 467_S of similar size to that of faster-growing MF16_B (data not shown).
These phenotypic tests were used to screen additional five B. contaminans isolates that had been collected from the patient at various time points. Strikingly, this simple screening indicated coexistence of two distinct phenotypic groups in the subject's lung as well as in blood, termed group A (represented by MF16_B) and group B (represented by 467_S) ( Table 3). It is noteworthy that the phenotype of the blood isolate MF16_B matched the sputum isolate FFH2055 (with the exception of swimming motility), which was retrieved six years earlier. The sputum isolate 467_S had its phenotypic counterpart, isolate MF17, which was obtained from the same blood culture as the MF16_B.
Genomic differences between B. contaminans isolates Genome sequencing of 467_S and MF16_B was performed (Illumina, 2 x 150 bp paired-end sequencing) in order to identify the extent of genetic differences between the phenotypic groups A and B and to complement information from their transcriptomes. Genomic sequences were compared with the reference genome of B. contaminans FFH2055, the initial isolate obtained from the studied patient and sequenced earlier in another study (8 contigs, PacBio sequencing) [15]. Of note, the reference genome was 8.21 Mb in size, encoded 7,109 proteins, and had a GC content of 66.40%.
When the sequencing reads covering MF16_B genome were mapped to the reference assembly, 44 variants were detected between both genomes. These were exclusively single-nucleotide insertions at tandem poly-GC repeats in MF16_B, indicating possible sequencing bias. To resolve this, we remapped Illumina sequencing reads for FFH2055 on the PacBio-assembled reference of the same isolate (S5 Table). This approach revealed that the called differences between FFH2055 and MF16_B genomes come from different sequencing technologies. The omission of nucleotides on poly-GC runs is in line with other observations for PacBio assemblies [39]. After this correction, the genomic sequences of isolates MF16_B and FFH2055 proved to be identical.
Contrary to genome uniformity observed within the phenotypic group A, a substantial number of genetic differences were detected when groups A and B were compared between each other. In total, 1,433 mutations separated the isolates 467_S and MF16_B, representing 0.017% of the total genome size and affecting 1,002 genes (14%) ( Table 4, S6 Table).
Over 200 indel mutations were detected among the two ST872 genomes representing the two phenotypic groups. Most indels occurred as single-nucleotide insertions or deletions at tandem repeats with a strong bias towards GC repeats. The indel mutations caused 94 frameshifts in coding sequences (S6 Table). All affected genes coded for full-length proteins in the reference genome FFH2055 (and in MF16_B), whereas frameshift mutations resulted in their pseudogenization in 467_S. This suggests that isolates from phenotypic group A represent a state ancestral to the derived isolate 467_S, regardless the opposite chronology of collection of respective clinical samples. The notion about the evolutionary order of 467_S vs. MF16_B and FFH2055 was further supported by the basal position of MF16_B (and its genotypically identical early isolate FFH2055) in the phylogenetic tree of B. contaminans ST872 (Fig 2), where isolate 467_S appeared as their genomic derivative. Therefore, the longitudinal position of MF16_B, which was isolated from blood culture, indicates that it might have been re-introduced into the patient from an unknown source (such as some disinfectants, cosmetic products, ultrasound gels or stopcocks for i.v. administration that were all found contaminated with B. contaminans in different Argentinian hospitals in the past; JD unpublished data). It is unlikely for a bacterium to remain genetically unchanged for 6 years of chronic infection; a study on B. dolosa reported average genomic substitution rate of approximately 2 SNPs per year of infection [40]. Nevertheless, we cannot definitely rule out an alternative that some members of the clonal phenotypic group A have persisted genetically unchanged, probably in a dormant state intracellularly. The coexistence of diverse lineages as a result of independent transfer of multiple clones into the patient's bloodstream has been documented in B. dolosa chronic infection [41]. 80% of the detected mutations in 467_S were present in protein-coding genomic regions that occupied 85% of the genome. Among the intragenic single-base substitution mutations, only 230 (22%) were synonymous. This proportion was even lower than a value of 27.8%, which was calculated by Dillon et al. as the theoretical probability of synonymous coding substitutions in B. cenocepacia HI2424 [42]. Taken together with the high number of frameshift mutations, this proportion of synonymous/nonsynonymous mutations indicated absence of strong selection against mutations that would affect a protein function. We can speculate that most of the gene-inactivating mutations (such as frameshifts, nonsense and missense mutations) are tolerable by bacterial cells because they do not disturb vital processes needed for survival of bacteria in CF sputum. A recent study demonstrated that less than 500 genes were essential for P. aeruginosa growth in CF sputum [43], which is a complex environment very rich in organic nutrients that can be utilized by bacteria. Furthermore, it is known that mutations that negatively affect nutrient uptake or their utilization in a CF sputum are not eliminated by natural selection as rapidly as in more selective environments, as manifested by the frequently observed auxotrophy of Bcc and P. aeruginosa CF isolates [44,45]. Regarding the differences in transcriptomes of the two lineages, we found no mutations in genes encoding signal molecule synthases or in their upstream sequences, implicating physiological rather than genetically-based causes of QS expression differences between MF16_B and 467_S. A considerable number of missense and frameshift mutations were identified within motility and chemotaxis genes (7 and 4, respectively-S3 Table). In combination with the very low expression of motility genes and a lack of swimming motility in 467_S and other isolates from phenotypic group B, it indicates that these genes are dispensable during long-term B. contaminans persistence in CF sputum. Decreased production of occidiofungin in isolate 467_S can be, in addition to its diminished transcription, explained by frameshift mutations in two biosynthetic genes (S4 Table).

Hypermutability in ST872
The spectrum of mutations that distinguished 467_S from MF16_B demonstrated a highly uneven distribution in particular mutation types ( Table 4). Notably, a large excess (approximately 30-fold) of transitions compared with transversion nucleotide substitutions was observed. AT!GC mutations prevailed among transitions and accounted for 70% of all nucleotide substitutions. This pattern of mutations was in perfect accordance with the values obtained in a mutation-accumulation experiment with mismatch repair (MMR)-deficient Escherichia coli [46], which is believed to provide the best estimation of the spontaneous mutation rate in bacteria [47]. Therefore, we looked closely at mutations in the genes encoding MMR enzymes, mutS and mutL. Indeed, one amino acid change was identified in each of the translated proteins in 467_S isolate: 628N!D (1882A!G) in MutS and 317S!P (949A!G) in MutL (S6 Table). Both mutations were localized in conserved domains of the MMR proteins and may have therefore abolished their proper function in DNA repair. Notably, both mutations were unique for 467_S and have not been detected in the study that examined many hypermutable Argentinean B. contaminans isolates [48]. Additional analysis of homologous protein sequences of other Bcc clearly showed that corresponding amino acids in isolates FFH2055 and MF16_B represented wild-type state (not shown). These findings suggest that both mutations in MutS and MutL detected in isolate 467_S have arisen during the evolution of ST872 within the patient´s lungs.
Calculation of the mutation frequency confirmed that isolate 467_S, which carried both MMR mutations, was a hypermutator. The mutation frequency of this isolate was over two orders of magnitude higher than that of MF16_B. FFH2055, the first isolate from the patient, exhibited a normal mutation frequency. All other ST872 isolates, collected after FFH2055 and belonging to the phenotypic group B, were hypermutable (Table 3). Thus, the mutator phenotype likely arose early after patient´s colonization and has persisted since then. We assume that during the chronic infection, mutations that escaped correction by MMR have accumulated. The emergence of hypermutators during evolution in CF lungs is a widespread phenomenon that was reported in P. aeruginosa [49], Staphylococcus aureus and Haemophilus influenzae [50,51]. Recently, Martina et al. reported a high prevalence of hypermutable B. contaminans among Argentinean CF patients [48]. Our findings are the first to demonstrate the effect of MMR deficiency on the genomic evolution of B. contaminans during infection in CF hosts.

Conclusions
The presented work is the first study that focuses on the genomic and transcriptomic changes that underlie the progression of chronic B. contaminans infection in CF. Unlike other Bcc species, only little attention has been paid to the mechanisms of pathogenicity of B. contaminans. This lack of information is worrisome in light of increasing proportion and even predominance of B. contaminans infections in Argentina [12], Spain [11] and other Ibero-American countries [8], indicating a large and characteristic geographic distribution of considerable concern.
Our results demonstrate that B. contaminans sequential isolates can hold high genomic diversity despite belonging to the same ST type. This diversity results in differential gene expression patterns, which in turn modify bacterial virulence. We demonstrate the profound effect of DNA repair deficiency on the evolution of the B. contaminans genome in the stable environment of CF sputum. Observed mutational and transcriptional inactivation of Bcc virulence factors might help to unravel the mechanisms that drive the adaptation of this primarily environmental bacterium to the CF lung during chronic infection.

Bacterial strains
Seven isolates of B. contaminans ST872 were obtained from the research Burkholderia strain collection of the University of Buenos Aires, Argentina. All isolates originated from a female CF patient (homozygous for the F508del mutation; born in January 2001) who attended Hospital de Niños Ricardo Gutierrez (Buenos Aires) and provided clinical specimens for microbiological investigation as part of her standard CF care. B. contaminans was cultivated from the patient´s sputum for the first time in July 2005; the lung remained infected since that time. S. aureus was isolated for the first time in January 2002 (episodes of pulmonary exacerbations were treated with amoxicilin-clavulanic acid); chronic infection with P. aeruginosa was first observed in March 2007 (inhaled tobramycin was administered in on/off cycles). Between 2008 and 2009, pulmonary function began to decrease (forced expiratory volume in 1 second [FEV1] 45-48% predicted). The patient started to be on frequent courses of outpatient antiinfective therapy (oral minocycline or trimethoprim-sulfamethoxazole), and was also admitted at the hospital every three months for i.v. antibiotic treatment, combining two or more of the following drugs: piperacillin-tazobactam, ceftazidime, amikacin, colistin and trimethoprimsulfamethoxazole. The patient died in the intensive care unit in October 2011. The death was attributed to B. contaminans septicemia.

Bacterial cultivation and RNA extraction
All bacterial cultures were incubated at 37°C with shaking at 230 rpm in 50 ml Falcon conical centrifugation tubes. The bacteria were cultivated in synthetic basal salts medium (BSM) containing 14.3 mM glucose and 0.05% casamino acids [52], CF serum (pooled serum samples from 5 CF patients in various stages of infection, heat-inactivated at 56°C for 30 minutes) and CF sputum (pooled sputum samples from 5 CF patients of various stages of infection, diluted with BSM to a final concentration of 10% w/vol). We used CF serum and sputum samples that were collected and archived for the purpose of this study at the Motol University Hospital, Prague.
Starter cultures were grown overnight in Luria-Bertani broth and diluted to an OD 600 of 0.5. The bacteria were harvested by centrifugation and resuspended in the same volume of BSM, serum or sputum. In total, 4 ml of BSM and serum and 6.25 ml of sputum were used. The cultures were incubated for 270 minutes (37°C, 230 rpm) into the mid-log growth phase. Then, the cultures were immediately snap-cooled in liquid nitrogen and centrifuged (5 minutes, 160 x g, 4°C). The pellets were resuspended in 1 ml of Trizol (Ambion), vortexed thoroughly and incubated for 5 minutes at room temperature. Chloroform (0.2 ml) was added; the samples were shaken for 30 seconds and incubated for 3 minutes at room temperature. The samples were centrifuged (15 minutes, 160 x g, 4°C), and the upper water phase was transferred to 0.5 ml of ice-cold 70% ethanol. The RNA samples were further processed using the RiboPure-Bacteria kit (Ambion), and the MICROBExpress kit (Invitrogen) was subsequently used to remove rRNA. The quality of the mRNA was assessed using Total RNA Nano chips on a Bioanalyzer 2100 (Agilent).
The complete procedure (bacterial cultivation, RNA extraction) was repeated three times to obtain three biological replicates per condition.

Whole genome sequencing and comparative genomics
To extract the genomic DNA, overnight cultures were grown in LB broth and DNA was extracted from collected cells using the DNAeasy kit (Qiagen). The genomic DNA was converted into a sequencing-ready library using the Nextera XT DNA sample Preparation Kit (Illumina). Illumina sequencing was performed in a MiSeq personal sequencing system with TrueSeq TM sequencing by synthesis (SBS) reversible terminator chemistry at the Next Generation Sequencing Core Platform at the Manitoba Institute for Children's Health (MICH). Data output was assembled by Velvet [53]. Sequencing reads were deposited at the ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress/) under accession number E-MTAB-4649.
Sequencing reads were subsequently mapped onto B. contaminans FFH2055, a reference genome of the multilocus sequence type ST872 [15]. Read mapping was performed with Geneious software 7.1.9 [54] using the Geneious mapper, with a maximum allowed mismatches set to 5%. Variant calling was performed (minimum coverage: 5 reads; minimum variant frequency: 0.75), and the effects of the mutations on the translated proteins were determined.
Complete and draft bacterial genomes for comparison were downloaded from GenBank [55]. A phylogenetic tree was inferred from single nucleotide polymorphisms (SNPs) among the compared genomes using CSI Phylogeny 1.1 [56]. The phylogram was visualized from the tree file using MEGA6 [57].
Overlaps of the mapped reads with annotated genes were counted within the Bioconductor environment [63] using the packages GenomicFeatures and GenomicAlignments [64]. Generally, 60% of the raw reads mapped to the reference genome, which is typically 5.5 million mapped reads per sample (range 3.1-8.9, median 5.5 million reads). Quality control for the count data was performed utilizing principal component analysis (PCA), hierarchical clustering, and correlation analysis. All samples were considered adequate for further analyses. The count data were fitted with a generalized linear model (~source + medium), and differences in the expression intensity between B. contaminans isolates 467_S and MF16_B in different cultivation media (serum, sputum, and BSM) were evaluated using DESeq2 [65]. Differentially expressed genes were selected with a 3-fold change and a p-value of less than 0.05 as the default cutoff. This choice was made due to the high fluctuation of levels of expression among replicates and to select the genes with most pronounced expression changes between the lung and bloodstream isolates.
The nucleotide sequences of differentially expressed genes were subjected to BLAST search [66] against complete Burkholderia genomes. Homologous genes from the first sequenced and best-studied B. cenocepacia strain J2315 [24] were assigned when possible. In cases where no homologs were identified in B. cenocepacia J2315, homologs from other completely sequenced Burkholderia strains with the highest similarity were chosen. Predicted functions were assigned to genes according to the Burkholderia Genome Database [67].

Phenotypic assays
Hemolysis was determined on Columbia sheep blood agar plates (Oxoid) after 48 hours of incubation at 37°C. Protease production was determined on D-BHI agar plates [68] after 48 hours of incubation at 37°C. Production of antifungal substances was determined by the modified method of Chen et al. [69]. Five microliters of B. contaminans suspension (OD 600 = 1) were spot-inoculated on a malt extract agar plate (Oxoid) and incubated at 37°C for 48 hours. The indicator fungus Geotrichum candidum was then replica-plated from a completely covered agar plate (spreadinoculated and incubated for 3 days at room temperature) onto the B. contaminans plate. The inhibition zones were recorded after 24 hours of incubation at 30°C. Swimming motility was determined on Luria-Bertani (LB) plates containing 0.3% (w/v) Bacto agar (Sigma-Aldrich). The plates were inoculated by injecting 2 μl of bacteria in LB broth (OD 600 = 1) under the surface of the center of the soft agar plate. The swimming zones were recorded after 48 hours of incubation at 37°C. The mutation frequency was determined as described in [48] with modifications. One microliter of an overnight culture grown from a single colony was diluted in 1 ml of LB broth. One microliter of the diluted culture was then inoculated into 5 ml of LB broth. The cultures were incubated at 37°C with shaking for 24 hours. The cell concentration was then determined by serial dilution of the cultures, followed by plating 100 μl aliquots in parallel onto LB agar and LB agar + rifampicin (100 μg/ml). Colonies were counted after incubation at 37°C for 48 hours, and the mutation frequency was calculated as the fraction of rifampicin-resistant cells in the population. The procedure was repeated three times per isolate. Supporting Information S1 Table. Genes exhibiting differential expression between B. contaminans isolates 467_S and MF16_B (> 3-fold change, p < 0.05), not included in Table 2, S2, S3 and S4 Tables.  (DOCX) S2 Table. QS regulators and QS-regulated virulence genes with differential expression between B. contaminans isolates 467_S and MF16_B (> 3-fold change, p < 0.05). (DOCX) S3 Table. Motility and chemotaxis genes exhibiting differential expression between B. contaminans isolates 467_S and MF16_B (> 3-fold change, p < 0.05). (DOCX) S4 Table. Antifungal compound biosynthetic clusters with differential expression between B. contaminans isolates 467_S and MF16_B (> 3-fold change, p < 0.05).