Phylogeography and transmission of M. tuberculosis in Moldova: A prospective genomic analysis

Background The incidence of multidrug-resistant tuberculosis (MDR-TB) remains critically high in countries of the former Soviet Union, where >20% of new cases and >50% of previously treated cases have resistance to rifampin and isoniazid. Transmission of resistant strains, as opposed to resistance selected through inadequate treatment of drug-susceptible tuberculosis (TB), is the main driver of incident MDR-TB in these countries. Methods and findings We conducted a prospective, genomic analysis of all culture-positive TB cases diagnosed in 2018 and 2019 in the Republic of Moldova. We used phylogenetic methods to identify putative transmission clusters; spatial and demographic data were analyzed to further describe local transmission of Mycobacterium tuberculosis. Of 2,236 participants, 779 (36%) had MDR-TB, of whom 386 (50%) had never been treated previously for TB. Moreover, 92% of multidrug-resistant M. tuberculosis strains belonged to putative transmission clusters. Phylogenetic reconstruction identified 3 large clades that were comprised nearly uniformly of MDR-TB: 2 of these clades were of Beijing lineage, and 1 of Ural lineage, and each had additional distinct clade-specific second-line drug resistance mutations and geographic distributions. Spatial and temporal proximity between pairs of cases within a cluster was associated with greater genomic similarity. Our study lasted for only 2 years, a relatively short duration compared with the natural history of TB, and, thus, the ability to infer the full extent of transmission is limited. Conclusions The MDR-TB epidemic in Moldova is associated with the local transmission of multiple M. tuberculosis strains, including distinct clades of highly drug-resistant M. tuberculosis with varying geographic distributions and drug resistance profiles. This study demonstrates the role of comprehensive genomic surveillance for understanding the transmission of M. tuberculosis and highlights the urgency of interventions to interrupt transmission of highly drug-resistant M. tuberculosis.


Methods and findings
We conducted a prospective, genomic analysis of all culture-positive TB cases diagnosed in 2018 and 2019 in the Republic of Moldova. We used phylogenetic methods to identify putative transmission clusters; spatial and demographic data were analyzed to further describe local transmission of Mycobacterium tuberculosis. Of 2,236 participants, 779 (36%) had MDR-TB, of whom 386 (50%) had never been treated previously for TB. Moreover, 92% of multidrug-resistant M. tuberculosis strains belonged to putative transmission clusters. Phylogenetic reconstruction identified 3 large clades that were comprised nearly uniformly of MDR-TB: 2 of these clades were of Beijing lineage, and 1 of Ural lineage, and each had additional distinct clade-specific second-line drug resistance mutations and geographic distributions. Spatial and temporal proximity between pairs of cases within a cluster was associated with greater genomic similarity. Our study lasted for only 2 years, a relatively short duration compared with the natural history of TB, and, thus, the ability to infer the full extent of transmission is limited.

Conclusions
The MDR-TB epidemic in Moldova is associated with the local transmission of multiple M. tuberculosis strains, including distinct clades of highly drug-resistant M. tuberculosis with varying geographic distributions and drug resistance profiles. This study demonstrates the role of comprehensive genomic surveillance for understanding the transmission of M. tuberculosis and highlights the urgency of interventions to interrupt transmission of highly drugresistant M. tuberculosis.

Author summary
Why was this study done?
• The transmission of multidrug-resistant tuberculosis (MDR-TB) poses a major challenge for tuberculosis (TB) control in several countries, but a detailed understanding of the local dynamics of TB and MDR-TB transmission in these high MDR-TB burden settings has been elusive.
• The increasing availability of whole genome sequencing, and the development of new statistical approaches for combining spatial, epidemiological, and genomic data to infer transmission, offers new opportunities to identify TB transmission with high resolution.
What did the researchers do and find?
• We prospectively enrolled all individuals with incident culture-positive TB from the Republic of Moldova, a high MDR-TB burden setting, between January 2018 and December 2019 and sequenced a diagnostic Mycobacterium tuberculosis isolate from each individual.
• We found that that nearly all extant MDR-TB in Moldova is likely the result of recent transmission and that multidrug resistance (MDR) is highly concentrated within 2 M. tuberculosis lineages (Beijing and Ural).
• Phylogeographic analyses revealed geographically distinct patterns of transmission for the Beijing MDR strains, which were predominantly localized within the Transnistrian region to the east of the country, while Ural MDR strains were less geographically restricted.
• Each putative MDR-TB transmission cluster had distinct second-line drugs resistanceconferring mutations. Population genetic analyses revealed both long periods of local population expansion as well as more recent introduction of specific MDR-TB strains into the country.

Introduction
Multidrug-resistant tuberculosis (MDR-TB) (i.e., resistance to at least rifampin and isoniazid) poses serious threats to effective tuberculosis (TB) control in many countries. Globally, approximately 4% to 5% of incident TB cases are multidrug resistance (MDR), but this is substantially higher in countries of the former Soviet Union where MDR-TB represents >20% of new TB cases and >50% of previously treated TB [1]. MDR-TB in this region has been attributed to breakdowns in public health infrastructure, transmission of TB in hospitals and prisons, and a deterioration of living conditions coinciding with the dissolution of the Soviet Union in the early 1990s [2]. While the contributions of these factors remain uncertain, there is consensus that the transmission of MDR-TB, as opposed to resistance acquired through inadequate treatment of drug-susceptible TB, is now the predominant cause of incident MDR-TB [3]. This consensus is supported by routine surveillance data that document that the majority of incident MDR-TB episodes are diagnosed among individuals with no prior anti-TB treatment [1]. However, these data alone do not address critical questions about where and between whom MDR-TB is transmitted or reveal the extent to which specific M. tuberculosis variants are responsible for MDR-TB transmission.
The increasing availability of next generation sequencing (NGS), coupled with the development of analytic approaches for integrating high-resolution genomic, spatial, and epidemiological data, has transformed our ability to describe transmission of pathogens in populations [4][5][6]. Previous genomic analyses of TB from the former Soviet Union have described the emergence and evolution of specific M. tuberculosis lineages responsible for an outsized proportion of MDR-TB in the region. In general, these studies have been conducted on isolates enriched for drug resistance phenotypes or on samples from larger cohorts [7][8][9], and this can challenge transmission inference.
We systematically collected and sequenced initial diagnostic isolates from all culture-positive TB cases occurring over 2 years in the Republic of Moldova, a former Soviet country experiencing a severe MDR-TB epidemic. In addition to capturing M. tuberculosis isolates from all culture-positive cases, we also collected data on home location and other demographic and epidemiological data, allowing us to study the distribution and dynamics of TB with high resolution across the entire country.

Study setting
Moldova is a small country (approximately 4 million population), which gained independence when the Soviet Union dissolved in 1991. In 2019, the World Health Organization estimated an incidence rate of 80 TB cases (68 to 92) per 100,000 persons. A total of 33% (30% to 35%) of new TB cases and 60% (56% to 64%) of previously treated TB cases were estimated to have MDR-TB [1].

Study enrollment
TB diagnosis occurs at 46 diagnostic centers located throughout the country. Between January 1, 2018 and December 31, 2019, all nonincarcerated individuals evaluated for pulmonary TB were invited to participate in this study (Fig A in S1 Appendix); written consent was provided. This consent allowed us to access routinely collected basic demographic, residential, and epidemiological data and to perform sequencing on their mycobacterial isolates should they have culture-positive TB. This study was approved by the Ethics Committee of Research of the Phthisiopneumology Institute in Moldova and the Yale University Human Investigation Committee (No. 2000023071).

Data and specimen collection and processing
Demographic data (sex, age, employment, history of incarceration, and education level), residential status (rural or urban residence and home village/locality), and epidemiological data (household contacts and date of diagnosis) were collected from each participant.
Sputum specimens were tested at diagnostic centers by microscopy and Xpert and then sent to 4 in-country laboratories for solid and liquid culture. Positive cultures were sent to the National TB Reference Laboratory in Chisinau for mycobacterial DNA extraction by the cetyl trimethyl ammonium bromide (CTAB) method [10].

Whole genome sequencing
Genomic DNA was prepared for NGS using the Illumina DNA Prep library preparation kit (S1 Appendix). Raw sequencing files were checked with FastQC [11] and mapped to the H37Rv reference strain (NC_000962.3) using BWA "mem" [12] and sorted with SAMtools v.1.10 [13] (S1 Data). Variant calling was conducted with GATK [14] to identify single nucleotide polymorphisms (SNPs), with low-quality SNPs (Phred score Q <20 and read depth <5) and sites with missing calls in >10% of isolates removed.
Samples with possible polyclonal infections were identified through a previously described method [15] and were not included in the transmission analysis, although we do provide additional details about these polyclonal infections in the Supporting information appendix (S1 Appendix). Heterogenous sites were called as the consensus allele if present in �80% of mapped reads; otherwise, they were labeled as ambiguous. SNPs in repetitive regions, PE/PPE genes, and in known resistance-conferring genes were excluded from phylogenetic tree reconstruction. In silico drug resistance prediction was carried out using TB-Profiler v2.8.14 (Tables A-C in S1 Appendix) [16].

Phylogenetic analysis and transmission cluster identification
A multiple sequence alignment of concatenated SNPs was used to construct a maximum-likelihood (M-L) phylogenetic tree with RAxML [17], using the "GTR-GAMMA" nucleotide substitution model and a Lewis ascertainment bias correction from 500 bootstrap samples.
Putative transmission clusters were identified in the resulting M-L tree using TreeCluster [18], testing 2 distance thresholds of 0.001 and 0.0005 substitutions/site, corresponding to approximate SNP thresholds of 40 and 20, respectively. These thresholds reflect the maximum distance within a cluster; we also estimate the median pairwise distance within a cluster. Timed phylogenetic trees for each large cluster (�10 cases) identified using the distance threshold of 0.001 substitutions/site were built with BEAST2 v2.6.3. (S1 Appendix) [19]. Briefly, phylogenetic trees were built using a strict molecular clock with a fixed rate of 1.0 × 10 −7 per site per year and constant population model with a log normal [0,200] prior distribution [20]. Markov chain Monte Carlo chains were run for 250 million iterations, with 10% burn-in to produce maximum clade credibility trees. Finally, past population events in 3 large clades identified in the study population were inferred using the Bayesian Skyline model in BEAST2.

Inference of person-to-person transmission events
We identified person-to-person transmission events between sampled hosts in large transmission clusters (�10 cases, TreeCluster distance threshold 0.001 substitutions/site) by reconstructing transmission networks using TransPhylo [21]. This R package uses a Bayesian approach to reconstruct transmission networks from timed phylogenies, including sampled and unsampled hosts, and allows for within-host diversity. We used a "multitree" method that simultaneously infers transmission trees from a selection of input phylogenetic trees while estimating a single value for shared model parameters. This accounts for uncertainty in the phylogenetic tree reconstruction [21]. The procedures for transmission inference within large clusters are detailed in the Supporting information appendix (S1 Appendix).

Spatial/genetic distance analysis
For each large transmission cluster (�10 cases), we used a recently developed hierarchical Bayesian regression model to quantify the association between the genetic and spatial distances for unique pairs of cases, adjusting for other pair-and individual-level features and multiple sources of correlation in the data [22]. We then used a Bayesian meta-analysis framework to better understand shared trends and variability in the estimated associations across genetic clusters.
In our main analysis, we modeled the log-scaled patristic distance between each pair of cases within cluster k as a function of geographic distance and other covariates: where Y kij is the patristic distance between cases i and j within cluster k and � kij � Nð0; s 2 k� Þ are the independent, Gaussian distributed errors. We defined the expected value as a function of pair-and individual-level information, where x kij includes covariates based on differences between the pair (i.e., Euclidean distance in kilometers, an indicator for whether the pair is in the same home village/locality, absolute difference between the dates of diagnosis in days, absolute difference between the ages in years) and z ki includes individual-level covariates (i.e., age in years, number of household contacts, sex (male and female), education status (<secondary and �secondary), working status (employed and unemployed), residence location type (urban and not urban), and housing status (homeless and not homeless)). The θ ki are spatially correlated random effect parameters that account for correlation between paired outcomes due to (i) the same individual being represented across multiple paired responses; and (ii) spatial correlation between individuals. Complete details on the statistical model, including prior distributions for the model parameters, are provided in [22].
We fit the regression model separately for each of the transmission clusters with at least 10 cases, using the "Patristic" function in the R package "GenePair" (https://github.com/ warrenjl/GenePair). For each individual cluster analysis, we included a predictor if <10% of the values across the pairs were missing and if there were >4 pairs in each of the categorical variable levels, to ensure stable model fitting results. Inference was based on 10,000 samples from the joint posterior distribution after removing the first 10,000 iterations prior to convergence and thinning the remaining 100,000 by a factor of 10 to reduce correlation in the posterior samples.
To better understand shared trends and variability in the estimated associations across genetic clusters, we then used the estimates and uncertainty measures obtained from the first stage analyses within a Bayesian meta-analysis framework. The model for a single association l is given asb whereb kl is the posterior mean obtained from the regression model fit to cluster k for covariate l, β kl represents the corresponding true but unobserved value,ŝ 2 kl is the posterior standard deviation, and m l is the number of main analyses (out of 35 in total) where covariate l was included. We note that γ kl effects are included in this same meta-analysis framework as well but describe the model in terms of β kl without loss of generality. We assumed that the true cluster-specific effects arise from a common Gaussian distribution with mean m b l and variance s 2 b l , and estimate these parameters by giving them weakly informative prior distributions such that m b l � N 0; 100 2 À � and s b l � Uniformð0; 100Þ. By making inference on m b l we determined if covariate l had a consistent impact when data were pooled across all clusters and uncertainty in the parameter estimates was correctly quantified. When reporting results from the second stage analysis, we present posterior means and 95% quantile-based credible intervals for expfm b lj g (i.e., the pooled effect on the reported as the ratio of expected patristic distances per specified change in covariate value).
As a sensitivity analysis, we repeated these analyses modeling SNP distance (instead of patristic) using a similar negative binomial regression framework (details in S1 Appendix).

Study population
We invited all culture-positive TB patients (N = 2770) over the study period to participate; 2,405 consented, and, among them, 2,236 (93%) had available isolates for NGS analysis. These patients lived in 709 named localities within 50 regions (Fig 1). Among enrolled participants with treatment history information (N = 2182, Table 1), 31% had been previously treated for TB, 22% were female, and the median age was 43 years (interquartile range (IQR) 23 to 71). A total of 60% lived in rural regions, and 10% were previously imprisoned.
A total of 779 participants (36%) were infected with genetic variants conferring MDR; 50% (386) of these MDR cases were treatment naive (Table 1). There was substantial geographic variation in distribution of MDR-TB. Transnistria, a small region east of the Dniester River, had localities with the highest proportions of TB cases that were MDR and among the highest incidence rates of MDR-TB in the country (Fig 1, Fig B in S1 Appendix).

Genomic analysis and phylogeny reconstruction
We obtained sequence data from pretreatment specimens of 2,220 participants. Polyclonal infections were identified in 386 participants (17.4%) (Fig A and C in S1 Appendix) and removed, resulting in a final dataset of 1,834 M. tuberculosis isolates. Among these isolates, 672 (36.6%) were genotypic MDR-TB, including 319 pre-extensive drug resistance (XDR) (17.4%) and 118 XDR (6.4%) TB. Aligning reads against the reference strain revealed 43,284 SNPs that were used to reconstruct a maximum likelihood phylogeny (Fig 2).
All Beijing/lineage 2.2.1 strains (802 consensus SNP call, 2 heterogenous SNP call) had a specific nonsynonymous mutation in esxW (Thr2Ser), a gene in which mutations were found to be associated with transmission success of Beijing lineages in Vietnam [23]. In contrast, just 3% of non-Beijing strains (32/1,030) harbored this mutation (Table D in S1 Appendix). Additionally, 2 nonsynonymous variants in esxW were found in low frequencies in non-Beijing strains, 6 samples with a nonsense mutation at codon 172, and 17 samples with a Thr173Ser mutation.

Prevalence of drug resistance genotypes
The 3 large clades were comprised almost entirely of MDR isolates (96%, 449 of 466) ( Fig 2B); resistance-conferring mutations for isoniazid and rifampin were similar and found in the katG 315 codon and in the 81-bp rifampicin resistance determination region (RRDR). However, each of these 3 clades had additional distinctive drug resistance mutations: the isolates in Ural strain/lineage 4 clade 1 harbored an eis promoter (−12 C>T) mutation conferring kanamycin resistance, one Beijing strain/lineage 2 clade had an ethA (110-110 del), associated with ethionamide resistance, while the other had thyX (−16 C>T) and thyA (Arg222Gly) mutations, associated with resistance to p-aminosalicylic acid. We also identified clusters of isolates harboring additional drug-resistant mutations associated with drugs in newly recommended MDR treatment regimens including lineozid (n = 14), bedaquiline (n = 1), and delamanid (n = 9). We also reported DR mutations among the 386 mixed samples (Table A and B in S1 Appendix).

Transmission of drug-resistant M. tuberculosis
Of the 1,834 M. tuberculosis isolates, 1,551 (85.6%) formed clusters ranging in size from 2 to 105, and 1,000 (54.5%) belonged to 35 large clusters with at least 10 participants at the clustering threshold of 0.001 substitutions/site. The median SNP distance across all transmission clusters was 14 SNPs (IQR 10 to 18 SNPs), with the median within-cluster SNP distance ranging from 0 to 26 SNPs (Fig D-a in S1 Appendix). Meanwhile, the median SNP distance in a cluster defined using the threshold of 0.0005 substitutions/site was 9 SNPs (IQR 7 to 12 SNPs) (Fig D-b in S1 Appendix).
Of 672 MDR-TB isolates included in the final analysis, 619 (92.1%) were part of a cluster, and 454 (67.6%) belonged to one of the 35 large transmission clusters. Individuals with  (Fig 3A and 3D). In contrast, the next largest cluster (Cluster 2) included 102 participants with the sublineage 2.2.1/Beijing Clade 2 stain living predominately in Transnistria (Fig 3B and 3E). A total of 16 of the 35 large clusters were comprised almost entirely of MDR-TB (Fig 3, Fig E in S1 Appendix). Notably, there were cluster-specific demographic differences observed across transmission clusters, with the largest 2 groups comprising a high proportion of previous prisoners and reporting unsatisfactory living conditions (Fig F in S1 Appendix). Table E in S1 Appendix details the association of

PLOS MEDICINE
Dynamics of M. tuberculosis transmission in Moldova covariates and membership in large clusters, along with a sensitivity analysis defining clusters using a stricter threshold of 0.0005 substitutions/site that showed broadly the same significant associations.
Reconstructing transmission networks in the 35 broad clusters using the multitree Trans-Phylo approach, we inferred 194 person-person transmission events. The relatively short study period allows for limited opportunities to capture transmission chains and pairs, and, accordingly, a minority of clustered isolates were predicted to be involved in transmission events in at least half the posterior transmission trees (338/1,000, 33.8%). Nonetheless, the identification of these transmission events support evidence of recent, local transmission between sampled individuals in the region. We found no significant factors that were associated with inclusion in these person-to-person transmission events compared to other clustered person-to-person individuals, although there was some evidence for an increased likelihood of transmission linkage between hosts in the Transnistria region compared to the rest of Moldova (OR 1.42, P = 0.02).

Bayesian Skyline analysis of large MDR-TB clades
To gain further insight into the population dynamics of MDR-TB in Moldova, we reconstructed the scaled effective population size for the 3 large MDR-TB clades (Fig 2) and estimated the time to the most recent common ancestor (TMRCA) (Table F in S1 Appendix).
We estimated the TMRCA of the Ural/4.  Table G in S1 Appendix), implying more recent introduction of these clades to the region. A sensitivity analysis using alternative clock models and rate estimates (Table G in S1 Appendix) showed similar estimates for the TMRCA and effective population sizes for each clade (Fig G in S1 Appendix).  Table 2 shows the pooled risk ratio (RR) inference for pair-and individual-covariates from the Bayesian meta-analysis of genetic and spatial distances. Two cases in the same locality had a 47% lower expected patristic distance compared to cases in different localities (RR: 0.53; 95% CI: 0.40, 0.68). For cases in different localities, as the distance between the localities increases by 50 kilometers, the patristic distance between the pair increased by 6% (RR: 1.06 (1.03, 1.08)). For every half-year increase in the separation between dates of diagnosis for a pair, the patristic distance increased by 3% (RR: 1.03; (1.01, 1.07)). A sensitivity analyses using SNP distances yielded similar results (Table H in S1 Appendix).

Discussion
We describe the recent circulation of 3 distinct clades of M. tuberculosis (1 of Ural lineage and 2 of Beijing lineage) responsible for the vast majority of MDR-TB in Moldova. While these clades share similar isoniazid-and rifampin-conferring mutations, there are additional cladespecific mutations conferring resistance to important second-line TB antibiotics critical for MDR treatment success.
Broad transmission networks based on genomic similarity showed that >85% of all culture-positive TB cases in Moldova could be mapped to putative transmission clusters and that the majority (>54%) of these cases were found in 35 large transmission clusters. The role of recent transmission was even more pronounced for MDR-TB cases, among which >92% were found within putative transmission clusters (and >67% found within the 35 large transmission

PLOS MEDICINE
Dynamics of M. tuberculosis transmission in Moldova clusters). Individuals with MDR-TB had over 3-fold higher odds of being in a large transmission cluster compared with individuals with pan-susceptible TB. Other notable covariates associated with increased odds of being in a large transmission cluster included urban residence, previous incarceration, and a history of previous treatment for TB. We found that pairs with closer times of diagnosis and living within the same locality had the greatest genomic similarity and that for pairs in different localities, closer spatial proximity was associated with greater genomic similarity. Previous analyses of surveillance data have revealed striking spatial heterogeneity of MDR-TB in Moldova with MDR-TB incidence differing by more than an order of magnitude for different localities [24], but the mechanisms driving this variation have not been described. Our analysis reveals that this heterogeneity is associated with the multiple overlapping epidemics of transmitted MDR-TB, some of which are due to clades that have extended across the entire country, while others are thus far confined to specific subregions. Most notably, the 2 largest transmission clusters of the Beijing lineage are found almost exclusively in Transnistria, where, in some localities, MDR-TB incidence rates exceed 200 cases per 100,000 persons/year. Our finding that nearly all Beijing lineage strains in Moldova have esxW mutations corroborates recent work that suggests that these variants may be under positive selection [23].
A recently reported genomic study conducted among patients diagnosed in 2013 and 2014 at a single municipal hospital in Chisinau described the local concentration of MDR-enriched lineage 4.2.1 (Ural) isolates [25]. In the current study, conducted approximately 6 years later and inclusive of the entire country, we found that MDR isolates within this lineage are present throughout Moldova and are commonly within transmission clusters, although this has thus far only been reported sporadically outside Moldova [26]. Prior work had found this lineage to be responsible for MDR-TB due to reinfection in nosocomial settings [27]; it is now apparent that these MDR strains are transmitted frequently in community settings. Regional reviews have suggested an important role of Beijing and Ural lineages in current TB epidemics [28]; our current work confirms and builds upon these insights, revealing in high resolution the overlapping dynamics of these 2 lineages in Moldova.
A major strength of our study was that we were able to include all culture-positive isolates across the country, minimizing challenges to transmission inference due to sampling biases. However, because we only could collect samples for 2 years-a short duration compared with the natural history of TB-our ability to track chains of transmission and to predict who infected whom was limited. We cannot rule out bias caused by individuals with TB that were never diagnosed or because some TB cases were not culture positive [29]. Additionally, polyclonal samples were removed from this analysis due to difficulties in producing well-resolved phylogenies. We do note that we found evidence for homogeneous and heterogenous drug resistance mutations in these sequences at a similar proportion to the remaining study population ( Table B in S1 Appendix). Further methods development and analysis are required to understand the potential role of polyclonal TB infection in transmission within Moldova.
There are urgent clinical and public health implications of these findings. While the crisis of transmitted MDR-TB was already apparent in this region, these data reveal that there are several cocirculating highly drug-resistant TB clades that differ in terms of drug resistance profiles, geographic distribution, and epidemic trajectory. These results suggest the urgency of interrupting MDR-TB transmission in Moldova, especially within specific geographic foci in the capital city of Chisinau and in the region of Transnistria. While the role of genomic surveillance for informing TB interventions in high-burden settings remains incompletely explored, this study provides an important example of how such information may be used to understand the complex epidemiology of MDR-TB in a high incidence country. We must next investigate whether this improved understanding of local transmission can inform the design of more effective and efficient interventions, a question which remains unanswered at this time.
Supporting information S1 STROBE Checklist. STROBE Statement-Checklist of items that should be included in reports of observational studies. STROBE, STrengthening the Reporting of OBservational studies in Epidemiology. (DOCX) S1 Data. Additional demographic and epidemiological data used in the analysis. (CSV) S1 Appendix. Demographic associations in cases belonging to large transmission clusters (10 cases), identified with patristic distance thresholds of 0.001 and 0.0005. Cases in small clusters (2 to 9 cases) are not included. ORs are calculated using logistic regression and P values by Wald chisquared test, adjusted for age and sex. Table F: Results of the Coalescent Bayesian Skyline analyses of the 3 large clades with specific resistant mutations using an uncorrelated log normal relaxed clock model. Table G: Complete Coalescent Bayesian Skyline results of the sensitivity analysis using 3 different clock model settings (strict, log normal relaxed, and exponential relaxed) and 3 clock rate estimates of the 3 large clades with specific resistant mutations. The clock rate used log normal distribution. Table H: Pooled Bayesian meta-analysis inference for each exponentiated effect (i.e., ratio of expected SNP distances per specified change in covariate value). Posterior means and 95% quantile-based credible intervals are presented. Fig A: The study flow diagram. Fig B: Distribution of the proportion of MDR-TB by the regions where they were diagnosed. (a) Regions sorted by the proportion of MDR-TB and (b) the total numbers of MDR-TB isolates from high to low. Fig C: (a) A scatterplot showing the pairwise SNP distance (max. 50 SNP differences) plotted against the patristic distance on an M-L phylogeny produced with RAxML between all 2,236 Moldovan isolates with whole genome sequence data. (b) A scatterplot showing the pairwise SNP distance (max. 50 SNP differences) plotted against the patristic distance on an M-L phylogeny produced with RAxML between 1,834 nonmixed Moldovan isolates with whole genome sequence data. Fig D: (a) The pairwise SNP distance in 35 large transmission clusters with at least 10 participants involved with the threshold of 0.001. The box plot shows the IQR and median SNP distance of each cluster. (b) The pairwise SNP distance in 26 large transmission clusters with at least 10 participants involved with the threshold of 0.0005. The box plot shows the IQR and median SNP distance of each cluster. Cohen.