Genomics of an endemic cystic fibrosis Burkholderia multivorans strain reveals low within-patient evolution but high between-patient diversity

Burkholderia multivorans is a member of the Burkholderia cepacia complex (Bcc), notorious for its pathogenicity in persons with cystic fibrosis. Epidemiological surveillance suggests that patients predominantly acquire B. multivorans from environmental sources, with rare cases of patient-to-patient transmission. Here we report on the genomic analysis of thirteen isolates from an endemic B. multivorans strain infecting four cystic fibrosis patients treated in different pediatric cystic fibrosis centers in Belgium, with no evidence of cross-infection. All isolates share an identical sequence type (ST-742) but whole genome analysis shows that they exhibit peculiar patterns of genomic diversity between patients. By combining short and long reads sequencing technologies, we highlight key differences in terms of small nucleotide polymorphisms indicative of low rates of adaptive evolution within patient, and well-defined, hundred kbps-long segments of high enrichment in mutations between patients. In addition, we observed large structural genomic variations amongst the isolates which revealed different plasmid contents, active roles for transposase IS3 and IS5 in the deactivation of genes, and mobile prophage elements. Our study shows limited within-patient B. multivorans evolution and high between-patient strain diversity, indicating that an environmental microdiverse reservoir must be present for this endemic strain, in which active diversification is taking place. Furthermore, our analysis also reveals a set of 30 parallel adaptations across multiple patients, indicating that the specific genomic background of a given strain may dictate the route of adaptation within the cystic fibrosis lung.

Introduction Burkholderia cepacia complex (Bcc) bacteria are relatively rare but notorious opportunistic pathogens in cystic fibrosis (CF) patients. They are associated with higher rates of morbidity and mortality, as well as lower post-lung transplant survival [1][2][3][4]. Bcc infections in CF are characterized by highly variable clinical outcomes, but commonly result in a progressive decline of lung function. In extreme cases, Bcc infection can result in "cepacia syndrome", a necrotizing pneumonia and septicemia that engages a lethal prognosis for the patient [5]. Bcc infections are difficult to eradicate because the infecting strains have an innate resistance to multiple antibiotics [3,4,6].
The ability to differentiate Bcc strains has been key in understanding their epidemiology and improving infection control guidelines for the CF community. Multilocus sequence typing (MLST) is a well-established molecular technique to study the epidemiology and population structure of Bcc organisms [18][19][20]. The Bcc MLST scheme is based on the allelic variations of seven housekeeping genes (atpD, gltB, gyrB, recA, lepA, phaC and trpB) and each strain is defined by its unique allelic profile and sequence type (ST) [18,21]. While several B. multivorans STs were shown to be globally distributed [22,23], only a limited number of genetically distinct B. multivorans outbreak strains have been described [22]. The small number of B. multivorans outbreaks and the fact that B. multivorans isolates from CF patients mostly represent unique strains strongly suggest that person-to-person transmissions are limited and that B. multivorans strains are usually acquired from environmental sources [10,12,22,24,25].
B. multivorans has been the most prevalent Bcc species in Belgian CF patients for more than three decades [1,26,27]. B. multivorans isolates from Belgian CF patients typically represent unique strains, but rather unexpectedly, several B. multivorans isolates with the same sequence type (ST-742) were found in multiple Belgian CF patients treated in different CF centers [26,28]. A study was performed to investigate the epidemiology and risk factors that play a role in the spread of these clones, but no evidence to support cross-infection was detected [26], suggesting these patients acquired this B. multivorans strain from the natural environment.
The aim of the current study was to investigate the genomic epidemiology and evolution of B. multivorans ST-742 that were isolated from respiratory samples of four Belgian CF patients over a period of at least ten years. We combined sequencing technologies to enable an optimal resolution in the de novo reconstruction of the genomes and elucidation of structural variations [29,30], with clinical and antibiotic resistance data to discriminate between ST-742 isolates. We show that ST-742 isolates of different patients have distinct genomic profiles and propose that each patient has been colonized independently from a natural reservoir of ST-742 where recombination events diversify the population of this microdiverse endemic B. multivorans strain.

Isolates from four different patients share the same sequence type
Between 2009 and 2019, thirteen B. multivorans isolates with ST-742 were obtained from respiratory samples of four unrelated patients attending three different pediatric CF centers in Belgium (Fig 1, Table 1, Table A in S1 Tables) [26].

Variability of antibiotic resistance profiles in the ST-742 isolates
We tested the susceptibility of our isolates to 20 different antibiotics and interpreted as resistant, low level of resistance, or susceptible based on EUCAST interpretation of the Minimum Inhibitory Concentrations (MIC) value when available, or the species independent Pharmacokinetics/Pharmacodynamics (PK/PD) interpretation otherwise (Fig 2). The complete table with the MIC values is given in Table B in S1 Tables.
Isolates of the first positive cultures (i.e., P1Bm2009 and P3Bm2015) were susceptible to multiple antibiotics (Fig 2). Interestingly, P2Bm2011b, and P4Bm2019 also display similar profiles despite not being the first B. multivorans isolate for the respective patients. For patient 1, pairs of isolates from the same sample with distinct colony morphologies (P1Bm2011a-b, P1Bm2012a-b) showed highly similar or identical profiles; the last isolate (P1Bm2015) was resistant to all antibiotics tested, except trimethoprim/sulfamethoxazole. For patient 2, the two isolates from the same sample with distinct colony morphologies (P2Bm2011a-b) showed different profiles: one isolate (P2Bm2011b) had a resistance profile similar to that of other initial isolates (i.e. P1Bm2009 and P3Bm2015), while the other isolate (P2Bm2011a) had a resistance profile that resembled that of later isolates such as P1Bm2012b. The resistance profile of the isolate P2Bm2018 was most similar to that of P2Bm2011a and P1Bm2012b (Fig 2).

The replicons are collinear and broadly share nucleotide sequence identity across isolates
The hybrid sequencing data, combining Nanopore and Illumina sequencing, enabled highly contiguous de novo reconstructions of the genomes for all thirteen isolates. The genomes ranged between 6.5 and 6.7 Mb in size with a G+C content of approximately 67%. The functional annotation revealed 5,837 to 6,045 coding DNA sequences (CDS) per genome ( Table 1). The annotation of antibiotic resistance genes revealed a single class A beta-lactamase encoding gene that was present in all isolates and was a homolog to penA, previously identified in B. cepacia ATCC 25416 (NG_048030.1). Each of the replicons contained multiple genomic islands that included prophage elements and secondary metabolites (Table C in S1 Tables). No clustered regularly interspaced short palindromic repeats (CRISPRs) were found.
The assembly graphs (S1 Fig) confirmed the typical highly conserved genome structure of B. multivorans with three large replicons [6,23] (from here on referred to as C1, C2 and C3). The genomic sequences of each of the three chromosomes across all isolates were collinear (S2 Fig) and broadly shared nucleotide sequence identity (>99.8%).

The B. multivorans population structure reveals a unique, clonal ST-742 cluster
There are no other reports of ST-742 isolates in the Bcc PubMLST database [31]. Furthermore, MLST analysis of all publicly available B. multivorans genomes (Table D in S1 Tables) did not  Table 1). The 2011 and 2012 isolates of patient 1 and 2012 isolates of patient 2 each originated from the same sputum sample and were studied in parallel as they displayed distinct colony morphologies. The line connecting the isolates indicates the infection timeline, with the infections of patients 2 and 4 still currently ongoing.
https://doi.org/10.1371/journal.ppat.1009418.g001 Table 1. List of isolates included in our study. Isolates marked with an asterisk were obtained from the same sputum sample but displayed distinct colony morphologies.  To further investigate the evolutionary relationships within the ST-742 cluster, a SNP analysis was performed to detect small insertions or deletions (indels) and single nucleotide polymorphisms (SNPs) in the ST-742 genomes and the FDAARGOS_719 genome. The resulting core SNP alignment was used for a second, more detailed phylogenetic analysis (Fig 4). While all patient 1 isolates clustered closely together, this was not the case for the isolates from patient 2. Indeed, the core SNP analysis revealed that the two initial patient 2 isolates (P2Bm2011a-b) were more similar to each other and to the single patient 4 isolate (P4Bm2019), than to the later isolate of patient 2 (P2Bm2018) (Fig 4).

Distinct plasmid content and large structural variations between patients
Plasmids could be detected in at least one isolate of each of the four patients but appears to be lost during chronic infection (patient 1 and 2). Pairwise comparisons of these four plasmid sequences using BLASTn revealed that the plasmid of patient 1 isolate P1Bm2009 had only 4% coverage with high identity (97%) to the group of plasmids found in patient 2, 3, and 4 (P2Bm2011a-b, P3Bm2015, and P4Bm2019). When compared against the NCBI nr/nt database [33], we found a best BLASTn hit (83% identity over 20% sequence coverage) to the pINT23 plasmid of Burkholderia pseudomultivorans SUB-INT23-BP2 (CP013379.1). On the other hand, the group of plasmids from the first isolates of patients 2 to 4 shared a pairwise sequence identity and coverage of over 99% and had a best database hit (99% identity and 66% sequence coverage) to the pTGL1 plasmid of Burkholderia multivorans ATCC 17616 (AP009388.1). The functional content of the plasmids found in P1Bm2009 and P2Bm2011a (taken as representative of the group of plasmids found in P2Bm2011a-b, P3Bm2015, and P4Bm2019) is described in Tables E and F of S1 Tables. The plasmids differ extensively in size (72 kbp for P1Bm2009 and 162 kbp for P2Bm2011a) and in functional content. Indeed, over 24% of the large plasmid genes (P2Bm2011a) are related to DNA maintenance and repair compared to 9% in the small plasmid (P1Bm2009) (Table F of S1 Tables). Both plasmids contain proteins linked to Type IV secretion systems, with the smaller plasmid (P1Bm2009) specifically carrying a VirB4 component. Importantly, we did not identify any antibiotics resistance associated genes on those plasmids.
The long-read sequencing data were further used to survey larger structural variations between isolates ( Table 2, Tables G and H of S1 Tables). This analysis revealed an active role for transposons from families IS3 and IS5 in the deactivation of genes (Table 2), particularly in the isolates from patient 2. Additionally, three complete prophages were present in all isolates except those of patient 2, in which they appear to be actively mobile. Indeed, the prophage located on C3 was absent from all isolates of patient 2, but the prophage on C2 was absent in P2Bm2011b, while the prophage on C1 was absent in P2Bm2011a and P2Bm2018.

Table 2. Large structural variations in the isolates of Patient 2-4.
By setting the isolate P1Bm2009 as a reference, we mapped the long Nanopore reads of the isolates from patients 2 to 4 to identify large structural variations that escape SNPs analysis. We omitted from this list indels found in intergenic regions (11) and hypothetical proteins (8). The full list of structural variations can be found in Tables G and H in S1 Tables.

ST-742 isolates have regions with unusually high SNP densities in C1 and C2
To assess in which regions of the chromosomes SNPs occurred, all identical SNPs found on the different genomes of isolates from patient 1 were collapsed into a single collection of SNPs (85 in total). The same analysis was performed on the isolates P2bm2011a and P2Bm2011b from patient 2 (4,694 in total). However, the later isolate of patient 2 (P2Bm2018) was kept separated as it displayed a large number of unique variants (1,990 SNPs when compared to P2Bm2011a-b, S3 Fig). In general, SNPs were randomly distributed across the chromosomes, with 53, 20 and 12 SNPs on C1, C2 and C3 of isolates from patient 1, respectively. However, an enrichment of SNPs was detected in three regions for the isolates from the other patients: a first centered around position 900 kbp for C1 (Fig 5), and a second and third in C2 around 170 kbp and 310 kbp, respectively (Fig 6). No such pattern was observed in the smaller C3 (Fig 7). On C1, this region of P2Bm2011a-b, P2Bm2018, P3Bm2015 and P4Bm2019 all ended at the exact same position (991,509) but started at different positions (702,401-794,204) (Table J in S1 Tables). On C2, the enrichment of SNPs was observed in all isolates except in isolate P3Bm2015 (Fig 6). These regions all start at the same position (150,841 and 318,125) but ended at different positions (193,566-204,452 and 331,399-331,929). The analysis of the functional content found in these regions did not reveal specific enrichments for certain COG categories when compared to the whole genomic background of the strain (Table M in S1 Tables).

Genome evolution of patient 1 isolates over the course of the infection and parallel evolution in different patients
To reveal within-patient genomic variations at the level of SNPs and larger structural variations, the genomes of all sequential isolates of patient 1 were compared using isolate P1Bm2009 as a reference. When collapsing all SNPs as described above, a total of 85 unique SNPs were detected on the three chromosomes. Many of these mutations were fixed over time once they appeared (Fig 8, Table I in S1 Tables). Some genes had mutations at multiple independent positions (Fig 8, Table 3). Both pairs of isolates P1Bm2011a-b and P1Bm2012a-b, which originated from the same sputum samples but displayed a different colony morphology, also showed a distinct SNP profile. This suggests an evolution of different population lineages within this patient. These 85 SNPs occurred over six years, yielding an average mutation rate of 15 SNPs/year (S4 Fig). Additionally, we searched for evidence of parallel evolution of loci across multiple patients. In total, 30 CDS (and ten non-coding regions) were detected with mutations in at least three patients, and only two loci (129 and 6144) had mutations in all patients ( Table I in S1 Tables  and Table 4).
Larger structural variations that were not detected using the SNP analysis included the loss of a plasmid that was only detected in the first isolate P1Bm2009 (see above), as well as several deletions and insertions, some of which could again be linked to transposase activity ( Table 5, Table H in S1 Tables).

Discussion
B. multivorans has been the most prevalent Bcc species in Belgian CF patients for more than three decades [1,26,27] and patients generally carry distinct sequence types, suggesting the presence of many different B. multivorans STs in environmental reservoirs [22]. In contrast, between 2009 and 2019, ST-742 isolates were found in respiratory samples of four patients attending three different CF centers [26,28]. To investigate the epidemiology and evolution of these ST-742 isolates, we examined their genome sequences and their susceptibility to antimicrobial agents commonly used in CF therapy.
The high level of sequence identity, the collinearity of the replicons (S2 Fig), and the comparison with other publicly available B. multivorans genomes (Fig 4) demonstrated that all ST-742 isolates belonged to a single genomic lineage and that the shared ST was not simply a result of homoplasy [34,35]. This also corroborated an earlier study that showed that the ST predicted both phylogeny and gene content of B. multivorans isolates, supporting the use of MLST for epidemiological surveillance of Bcc bacteria [23].
The lack of evidence to support cross-infection [26] suggested that all patients acquired B. multivorans ST-742 from an environmental reservoir and that this strain is endemic in Belgium. A recent genomic study of B. multivorans reported that the same genomic lineages were isolated from CF and environmental samples and on different continents many years apart, In concentric circles, from the center and outward are depicted: 1) the GC skew, 2) the GC content, 3) the genomic islands detected by the software IslandViewer 4 (in purple), the prophages detected with Phaster (green) and the secondary metabolites identified by antiSMASH (in blue), 4) a total of five SNP densities, we separated the isolates from 2011 and the isolate from 2018 in patient 2 as we regard them as two separate acquisition events, 5) and 6) the annotated features found on the negative and positive strand respectively.
https://doi.org/10.1371/journal.ppat.1009418.g005 demonstrating the evolutionary persistence and ubiquity of these strains in different niches and on different continents [23]. Intriguingly, although several studies point towards the acquisition of B. multivorans from non-human sources such as the natural environment, its preferred natural habitat remains elusive [4,36].
Different elements suggest that this environmental reservoir holds a microdiverse B. multivorans ST-742 population [37] that is susceptible to several antimicrobial agents and from which different genetic variants colonized different patients. First, there is a considerable SNP diversity between genomes of isolates of different patients compared to the more limited diversity observed among the genomes of patient 1 isolates (Fig 4). Second, we observed different start positions of the high-density SNP regions in isolates from different patients (Figs 5, 6 and 7, Table 2). Third, we observed an apparent mobility of prophage elements which are differentially present in some isolates. Lastly, there was a shared plasmid present in the isolates of patient 2, 3 and 4 and a unique plasmid in the first isolate of patient 1 ( Table A in S1 Tables). Indeed, this microdiverse B. multivorans ST-742 population appears to comprise different plasmids which can be dispensed once in a human host. De Boeck et al. [1] reported on the re-appearance of Bcc strains up to ten years after first colonization and hypothesized that reacquisition from the environment occurred. In patient 2, we hypothesize that isolate P2Bm2018 is derived from a re-infection from the same environmental reservoir and is not an immediate evolutionary descendant of the previously reported isolates of this patient (P2Bm2011a-b). This hypothesis is supported by the position of isolate P2Bm2018 in the core SNP tree (Fig 4), its different SNP profile (Figs 5, 6 and 7) with 1,990 unique SNPs when compared to P2Bm2011a-b, the presence of different regions with high SNP content in C1 and C2 (Table J in S1 Tables) and a differing pattern of prophage elements between early isolates (P2Bm2011a-b) and the late isolate (P2Bm2018). However, no plasmid was detected in P2Bm2018 and the antibiotic resistance profile of P2Bm2018 resembled that of P2Bm2011a (Fig 2), potentially indicating that the lineage of that isolate was already established in the lungs of the patient before 2018.
Comparative genomic analyses of Bcc isolates sampled from single patients in the course of chronic infections have shown that diversifying lineages can co-exist in the CF lungs for many years, with limited rates of evolution (as proxied by single nucleotide polymorphisms), i.e. around 2.1 to 2.4 SNPs/year [38][39][40][41][42][43]. This evolution within the lungs of CF patients can result In the present study, the presence of co-existing clades was supported in samples of patients 1 and 2, in which the same sample yielded isolates with distinct colony morphologies that showed distinct antibiotic resistance profiles (Fig 2, P2Bm2011a-b) and SNP profiles (Fig 8,  P1Bm2011a-b and P1Bm2012a-b). Among the eight isolates of patient 1, no gradual increase in antimicrobial resistance was observed, as the isolates collected in 2011 were already fully resistant to virtually all compounds tested (Fig 2). We found an average mutation rate of 15 SNPs/year (S4 Fig) which  As expected from the evolution within the lungs of CF patients, we found a subset of genes with mutations at multiple independent positions (Table 3) and 30 loci that had a mutation in at least three patients ( Table 4). As this latter set of genes underwent parallel adaptation in different patients, these genes may be involved in the adaptation of the ST-742 strain to the CF lung environment. Importantly, we sought to correlate the loci we discovered to those found in the B. multivorans literature. However, we found little commonalities despite almost all loci being present on the ST-742 strain ( Table 3, Table L in S1 Tables). Indeed, only three mutations (i.e., ABC transporter permease, DNA-binding protein, and an acyltransferase) were found in common with the fixed mutations found in the study by Silva et al.  When available, we additionally indicate the annotated function that is associated with the locus. The mutations that occur at multiple independent positions in the same locus, which are indicative of selective pressure, are marked with number between parentheses (Table 4).
https://doi.org/10.1371/journal.ppat.1009418.g008 Table 3. Loci with multiple independent mutations in patients 1 and 2. Genes with multiple mutations in the corresponding CDSs of the longitudinal isolates from patient 1 (P1bm2009 -P1Bm2015) and patient 2 (P2Bm2011a, b), see Tables I and K in S1 Tables respectively. These adaptive mutations have not been reported in the B. multivorans literature (Table L in S1 Tables).

Gene Product (locus) Patient Function Previously identified
PurB ( Table I in S1 Tables), the gene is found correctly annotated de novo in those isolates. Additionally, these regions were not identified as genomic islands, nor were they flanked by mobile elements. Functional analysis of their coding content did not reveal enrichments for specific COG categories when compared to the complete genome, which may indicate that the recombination of this region is tolerated by ST-742 (Table M in S1 Tables). The potential presence of these regions in other B. multivorans genomes (including both 97 public and 59 non-public sequences from the National Reference Centre database) was analyzed using BLASTn but no evidence was found that they originated from recombination between any of those strains. This suggested that they may represent events of recombination with donor strains for which no genome sequences are currently available. Additionally, as the start positions of the high-density SNP regions differ in isolates from different patients and share significant numbers of SNPs (Figs 5, 6 and 7 and S3, Table J in S1 Tables), these recombination events likely took place prior to infection, not during co-infection of the patients airways, and contribute to the microdiversity of the B. multivorans ST-742 population in its environmental reservoir [37,46].
Finally, from a technology perspective, we observed that while short read sequencing technologies such as Illumina were well suited to examine SNP diversity, long-read sequencing technologies readily allowed us to observe differences in plasmid and prophage content. In patient 2, we observed that prophages were mobile (Table 2), with an uncharacterized prophage element that was present on C1 of P2Bm2011b but absent from P2Bm2011a and P2Bm2018, and a second prophage, Burkholderia virus KS5 [47], that was present on C2 of P2Bm2011a and P2Bm2018, yet absent from P2Bm2011b. This could be relevant as previous studies in P. aeruginosa highlighted the importance of mobile prophage elements in driving within-host adaptation of the infecting strains [48]. Interestingly, we also noted two variants (deletions) in phosphoribosyltransferase and glycosyltransferase proteins that only appeared in the large structural variation analysis (Table 5) and corresponded to loci reported by Caballero et al. [38]. These findings illustrate the limitations of Illumina-based SNP analysis in providing a comprehensive list of genomic changes among isolates and emphasize the strength of combining sequencing technologies.

Ethics statement
The strains used in this study were collected by the participating clinical laboratories in the frame of routine diagnostic without additional testing and sent to the National Reference Center for Burkholderia cepacia complex where they were stored in the frame of activities mentioned in the Royal Decree 9 February 2011 setting the conditions for funding for reference centers in human microbiology and precised on the website of Sciensano: https://nrchm.wiv-isp.be/nl/ oproep2019/legaal/default.aspx. In the frame of the specific terms of reference for the NRC for Burkholderia cepacia complex, a collection of representative strains was managed as outline in https://nrchm.wiv-isp.be/nl/oproep2019/lastenboek/Rapporten/STR06-Burkholderia% 20cepacia_2019.pdf. Epidemiological data were collected retrospectively from patient charts and anonymously stored in a database. As no additional sampling or information was asked from patients, no formal approval from an ethical committee or informed consents are needed.

Bacterial isolates and patient data
The thirteen clinical isolates presented in this study (Table 1) were isolated from sputum samples of four patients diagnosed with cystic fibrosis (Fig 1) and were collected at the Belgian National Reference Centre for Burkholderia cepacia complex (NRC Bcc). Isolates were identified as Burkholderia multivorans using MALDI-TOF mass spectrometry and recA gene sequence analysis [19,49]. Isolates were further molecularly characterized by multi-locus sequence typing (MLST) using the Bcc MLST scheme and the pubMLST database [19,21]. Isolates were grown aerobically in lysogeny broth (LB) medium at 37˚C and cultures were preserved in 25% glycerol at -80˚C.
For patient 1, B. multivorans was first detected in a sputum sample at age nine (P1Bm2009, January 2009). The initial treatment was intravenous (IV) ceftazidime, piperacillin-tazobactam and ciprofloxacin for 14 days, followed by inhaled temocillin and oral ciprofloxacin for 3 months. Cultures were negative until August 2009, date after which they remained positive until 2015, when the lung function declined drastically, and the patient received a double lung transplant. In 2011, two isolates from the same sputum sample were examined as they displayed a different colony morphology (P1Bm2011a and P1Bm2011b), and similarly in 2012 with isolates P1Bm2012a and P1Bm2012b (Fig 1).
For patient 2, B. multivorans was first detected in a sputum sample, at age five (December 2009, isolate not available for the NRC Bcc) and the infection is still ongoing at the time of the present study. Treatment has included IV administration of ceftazidime-amikacin, tobramycin, piperacillin, tazobactam, ciprofloxacin, and meropenem. Two isolates from the same sputum sample were sent to the NRC in 2012 as they displayed distinct colony morphologies (P2Bm2012a and P2Bm2012b), and an additional sample was received in 2018 (P2Bm2018) (Fig 1). Intermediate isolates (period 2012-2018) and later isolates (period 2019-2020) are not available for the NRC Bcc.
For patient 3, B. multivorans was first detected in a sputum sample at age nine (P3Bm2015, November 2015). The infection was successfully eradicated with an IV course of ceftazidime and tobramycin for 14 days, followed by 3 months of inhaled ceftazidime. No further samples were collected since. Table 5. List of large structural variations in the isolates of Patient 1. Using isolate P1Bm2009 as a reference, we mapped the long Nanopore reads of the subsequent isolates from Patient 1 to reveal large structural variations that escape SNP analysis. All structural variations are found in C1 and are fixed after their appearance in agreement with the SNP analysis (Fig 8). Isolates P1Bm2011a and P1Bm2012b, which originated from the same sputum sample as P1Bm2011b and P1Bm2012a, respectively, appear to come from separate lineages.

Strains
Chr Type Function

PLOS PATHOGENS
For patient 4, B. multivorans was first detected in a sputum sample at age 25 (June 2017, not available for the NRC Bcc) and an isolate was sent to the NRC in December 2019 (P4bm2019) (Fig 1). The infection was treated for 14 days using intravenous meropenem, and no further therapy was administered due to other health related complications.

Antibiotic sensitivity testing
We tested the in vitro susceptibility of all isolates against twenty different antibiotics (Table B in S1 Tables) as described previously [50]. The MIC values were determined by microdilutions in microtiter plates which were read on a Sensititre Vizion System device (ThermoScientific). To determine the in vitro susceptibility, EUCAST pharmacokinetic/pharmacodynamics (PK/ PD) breakpoints were used, as no EUCAST species-specific breakpoints were available (tables available at https://www.eucast.org/clinical_breakpoints/ PK PD breakpoints).

Genomic DNA isolation and hybrid sequencing
The isolates were inoculated in LB medium and grown overnight at 37˚C to stationary phase (OD 600 over 1.0). The genomic DNA was extracted using the QIAGEN DNeasy ultraclean microbial kit. The quality and quantity of the extracted DNA was assessed using a Qubit 4.0 fluorometer, ThermoFisher Scientific Nanodrop spectrophotometer (OD280/260 and OD230/ 260), and agarose gel electrophoresis (1% w/v).
The genomic DNA was sequenced combining both short-read and long-read technologies. The first set of reads were obtained on an Illumina HiSeq4000 or NovaSeq6000 platform using a paired-end 2 � 150 bp approach at the Oxford Genomics Centre. The second set of reads were obtained on a MinION nanopore sequencer (Oxford Nanopore Technology) equipped with a flowcell of type R9.4.1 and a library prepared either with the 1D ligation approach or the Rapid library preparation kit (Table A in S1 Tables).

Population structure and MLST
To examine the position of the clinical isolates within the general population structure of B. multivorans, all B. multivorans genomes were retrieved from the NCBI RefSeq database (release 98) as GenBank files (Table D in S1 Tables), together with B. cenocepacia J2315 [66] as an outgroup. Genomes with GC contents below 66% (one isolate), and more than 1,000 contigs (seven isolates) were excluded. Remaining GenBank files were converted into gff3 files using the perl script "bp_genbank2gff3" available in the BioPerl software library [67] and given as input to roary v3. 1 3 [68] to delineate the pangenome and perform a core gene multiple sequence alignment. The core gene alignment was processed using FastTree v2.0 [69] to reconstruct the population structure. The phylogenetic tree was visualized and annotated using Fig-Tree v1.4.4 [70]. MLST analysis on all downloaded genomes was performed using mlst (https://github.com/tseemann/mlst) and the Bcc pubMLST database [21].

SNP analysis
SNP calling was done using snippy v.4.4.5 [71] with the quality-controlled Illumina reads of each isolate and the annotation of isolate P1Bm2009 or P2Bm2011a as reference. The SNP density was calculated by tabulating the number of unique SNPs per 1000 nucleotides windows along the genome (Figs 5, 6 and 7 and Table I in S1 Tables). The visualization of shared SNPs (S3 Fig) was created with the software UpsetR [72] and the Venn web application (http:// bioinformatics.psb.ugent.be/webtools/Venn/). The core SNP alignment from snippy was used to construct a focused phylogenetic tree of our ST-742 isolates and the strain FDAARGOS_719 using FastTree v2.0 [69] and visualized using FigTree v1.4.4 [70]. Comparisons with previously published mutation data was conducted using BLASTp between the proteome of P1Bm2009 and the proteins reported in the studies listed in the Table L in S1 Tables.

Large genomic structural variations
Overall sequence identity as well as collinearity of the genomes was examined by creating dot plot figures (S2 Fig) using Gepard v1.40 [73] with default options and word length adjusted to 100. The large structural variations were called using a combination of the long-read mapper ngmlr v0.2.7 [74] and the structural variation caller sniffles v1.0.11 [74] with the quality-controlled Nanopore reads and P1Bm2009 was as reference. The results were visually inspected using Ribbon v1.1 [75] and all the VCF files were consolidated into a spreadsheet, with the following filters applied: 1) variations related to the linearization of the replicons or smaller than 100 nt were removed, 2) imprecise variations were removed except when they matched a precise variations in another isolate, and 3) remaining variations in P1Bm2009 (three in total) were used to filter out false positives.  Table A: isolates description. The thirteen ST-742 isolates are listed with a summary description of their sequencing and the related genomics data available the NCBI Gen-Bank database.  Table C: location of genomic islands and genetic clusters. Results from the three different software tools to predict the genomic locations of mobile elements and secondary metabolites clusters. Table D: list of isolates from NCBI used for the population structure. The genome sequences of the isolates were downloaded from the RefSeq database (release 98). Table E: features of plasmid pP1Bm2009. List of annotated features of plasmid 1 in patient 1 annotated using hmmer and eggNOG with a summary count of genes per COG categories. Table F: features of plasmid pP2Bm2011a. List of annotated features of plasmid 1 in patient 2 annotated using hmmer and eggNOG with a summary count of genes per COG categories and a comparison of COG categories between pP1Bm2009 and pP2Bm2011a. Table G: master list of large structural variations. List of all structural variations identified using the long reads (Nanopore data).