Transition bias influences the evolution of antibiotic resistance in Mycobacterium tuberculosis

Transition bias, an overabundance of transitions relative to transversions, has been widely reported among studies of the rates and spectra of spontaneous mutations. However, demonstrating the role of transition bias in adaptive evolution remains challenging. In particular, it is unclear whether such biases direct the evolution of bacterial pathogens adapting to treatment. We addressed this challenge by analyzing adaptive antibiotic-resistance mutations in the major human pathogen Mycobacterium tuberculosis (MTB). We found strong evidence for transition bias in two independently curated data sets comprising 152 and 208 antibiotic-resistance mutations. This was true at the level of mutational paths (distinct adaptive DNA sequence changes) and events (individual instances of the adaptive DNA sequence changes) and across different genes and gene promoters conferring resistance to a diversity of antibiotics. It was also true for mutations that do not code for amino acid changes (in gene promoters and the 16S ribosomal RNA gene rrs) and for mutations that are synonymous to each other and are therefore likely to have similar fitness effects, suggesting that transition bias can be caused by a bias in mutation supply. These results point to a central role for transition bias in determining which mutations drive adaptive antibiotic resistance evolution in a key pathogen.


Whether and how transition bias influences adaptive evolution remain open questions. We studied 296 DNA mutations that confer antibiotic resistance to the human pathogen Mycobacterium tuberculosis. We uncovered strong transition bias among these mutations and also among the number of times each mutation has evolved in different strains or geographic locations, demonstrating that transition bias can influence adaptive evolution. For a subset of mutations, we were able to rule out an alternative selection-based hypothesis for this bias, indicating that transition bias can be caused by a biased mutation supply. By revealing this bias among M. tuberculosis resistance mutations, our findings improve our ability to predict the

Introduction

Mutation creates genetic variation and therefore influences evolution. Mutation is not an entirely random process, but rather exhibits biases toward particular DNA sequence changes. For example, a bias toward transitions (purine-to-purine or pyrimidine-to-pyrimidine changes), relative to transversions (purine-to-pyrimidine or pyrimidine-to-purine changes), has been widely reported among studies of mutations spreading under relaxed selection (1-5).

Demonstrating the role of such transition bias in adaptive evolution remains challenging, with most existing evidence derived from individual case studies (6-10). Stoltzfus and McCandlish recently reported the first systematic study of transition bias in putatively adaptive evolution, using the repeated occurrence of amino acid replacements in laboratory or natural evolution as evidence that the replacements are adaptive (11). Their meta-analysis provides compelling evidence that transition bias influences adaptive evolution, with transitions observed in at least two-fold excess of the null expectation that they occur once for every two transversions. Yet such analyses have two limitations. First, while the repeated occurrence of an amino acid replacement is highly suggestive of adaptation (12), it is not direct evidence of adaptation. Second, an overabundance of transitions could result from a bias in mutation supply (i.e., mutation-based transition bias), from a greater selective advantage conferred by the amino acid replacements caused by transitions relative to those caused by transversions (i.e., selection-based transition bias), or from both. For example, the spontaneous deamination of methylated cytosines to thymines may contribute to mutation-based transition bias (13), whereas the propensity of non-synonymous transitions to conserve the biochemical properties of amino acids better than non-synonymous transversions may contribute to selection-based transition bias (14). Discriminating between these two forms of transition bias has proven challenging to date (15,16).
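The null expectation mentioned above follows from simple counting: each nucleotide can change by one transition and two transversions. The following minimal Python sketch illustrates this counting and the resulting fold excess. The function names and the example mutation list are illustrative assumptions, not part of the study's analysis code.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Null transition:transversion ratio: 1 transition per 2 transversions.
NULL_RATIO = 0.5


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a point mutation: {ref}>{alt}")
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def ts_tv_ratio(mutations) -> float:
    """Transition:transversion ratio for an iterable of (ref, alt) pairs."""
    mutations = list(mutations)
    n_ts = sum(is_transition(ref, alt) for ref, alt in mutations)
    n_tv = len(mutations) - n_ts
    if n_tv == 0:
        raise ValueError("ratio undefined without transversions")
    return n_ts / n_tv


def fold_excess(ratio: float) -> float:
    """How many times a ratio exceeds the null expectation of 0.5."""
    return ratio / NULL_RATIO


# Enumerating all 12 possible point mutations recovers the null ratio.
ALL_CHANGES = [(r, a) for r in "ACGT" for a in "ACGT" if r != a]
null_check = ts_tv_ratio(ALL_CHANGES)

# Hypothetical example: three transitions and one transversion.
example = [("A", "G"), ("C", "T"), ("G", "A"), ("G", "C")]
example_fold = fold_excess(ts_tv_ratio(example))
```

Under this counting, a transition:transversion ratio of 1.0 already corresponds to a two-fold excess of transitions over the null.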

Despite the emerging evidence that transitions are overrepresented among adaptive mutations as well as those spreading under relaxed selection, it remains unclear whether this plays a role in real-world scenarios where rapid adaptive evolution has important implications for treatment of infectious disease, such as the emergence of antibiotic resistance in pathogenic bacteria. For example, if pathogen populations fix the first resistance mutation that appears ('first-come-first-served') (17), then a mutation supply biased toward particular types of nucleotide substitutions will influence which genetic changes drive adaptation. Alternatively, if many beneficial mutations are available to selection, and pathogens fix those with the highest selective advantage ('pick-the-winner'), a bias in mutation supply would have a weaker impact on which genetic changes drive adaptation. Therefore, identifying transition bias in such scenarios would improve our basic understanding of how resistance evolves, and our ability to predict the relative likelihoods of alternative mutational pathways to resistance.

Here, we study transition bias in the evolution of antibiotic resistance in Mycobacterium tuberculosis (MTB), a major human pathogen for which antibiotic resistance evolution is a key obstacle to effective treatment (18). We do so using two independently curated datasets of mutations that are known to confer antibiotic resistance and are therefore definitively adaptive. Additionally, we test specifically for mutation-based transition bias by considering two subsets of adaptive mutations: 1) mutations located in gene promoters and in the ribosomal gene rrs, which is not translated to protein and therefore should not be influenced by selection-based bias caused by transitions encoding different amino acid changes than transversions; and 2) mutations that are synonymous to each other and are therefore likely to have similar fitness effects.

Our results reveal strong transition bias in the mutational paths to antibiotic resistance and in the number of times each mutational path is used in the evolution of antibiotic resistance, across 22 genes or gene promoters that confer resistance to 11 antibiotics. We also observe transition bias among adaptive mutations that do not code for amino acid changes, and among adaptive mutations that are synonymous to each other, consistent with the hypothesis that transition bias is at least partly mutation-based. We therefore demonstrate that transition bias influences adaptive evolution, specifically the evolution of antibiotic resistance in a key global pathogen.

We curated a dataset of 152 unique point mutations that confer resistance to at least one of 11 different antibiotics, and that appeared in at least one of 9,351 publicly-available MTB genomes (Materials and Methods). We refer to this dataset as the Basel dataset. We also analyzed an independently curated dataset of 208 unique point mutations that confer resistance to at least one of 8 antibiotics and appeared in at least one of 5,310 MTB genomes (19)

To understand why, assume there was no mutation-based or selection-based transition bias and the high transition:transversion ratio at the level of mutational paths was instead due to chance, such as a sampling artifact. In this scenario we would expect at most the same transition bias at the level of events, rather than the roughly two-fold increase that we observed.

To determine which mutations explained the observed transition bias, we calculated the relative rates of all six possible nucleotide pair mutations, accounting for GC content (1), which is

Because the influence of transition bias might depend on the mechanism of antibiotic resistance, we next tested for transition bias separately for different antibiotics. This reduced the number of mutational paths and events that could be analyzed, so we first determined the antibiotics for which we had sufficient statistical power to ensure that an observed lack of transition bias was not due to reduced sample size (Materials and Methods). This analysis revealed that, at the level of

We observed transition bias in the number of mutational events associated with the evolution of resistance to almost all antibiotics, with transition bias indicated by transition:transversion ratios ranging from 1.59 for pyrazinamide to 41.6 for kanamycin (i.e., from more than three-fold to more than eighty-fold excess of the null expectation; Table 1). The glaring exception was

The above results suggest transition bias influences the evolution of antibiotic resistance in MTB. However, it remains unclear whether this bias is mutation-based or selection-based. To

Ideally, we would have sufficient mutational paths or events to perform the above analysis on all of the amino acid changes in Table 2. To this end, we analyzed an older dataset, TBDReaMDB

First, quantifying the overabundance of transitions improves our ability to predict mutational pathways to resistance. MTB often acquires multiple resistance mutations sequentially, and some mutational trajectories are more common than others (23). Our results suggest the probability of following a given trajectory will be higher when it contains a greater fraction of transitions than alternative trajectories encoding similar resistance phenotypes.

Second, for at least part of our data, we excluded an important potential explanation for transition bias, specifically that transitions encode amino acid changes with different average fitness effects than transversions. We did this by showing that the bias extended to mutations in gene promoters and the ribosomal gene rrs, and to mutations that are synonymous to each other. We still cannot rule out variable fitness effects that are not linked to amino acid changes, as observed for streptomycin-resistance mutations in ribosomal genes (21)

Third, if we then accept that a biased mutation supply explains at least some of the observed bias among mutational paths to resistance, this is consistent with a role for mutation-limited 'first-

events. In this case, the transition:transversion ratio among events will tend toward that of the adaptive mutation rate, and transitions will be overrepresented if there is mutation-based transition bias. However, the transition:transversion ratio of observed paths will tend toward the ratio for all possible unique adaptive mutations, which would approximate 0.5 even when there is mutation-based transition bias (11). Note that the observed ratio at the level of paths, while weaker than at the level of events, still exceeded 0.5. This could be because our datasets are incomplete samples of the possible paths to resistance in MTB, and the likelihood of a given path featuring in our datasets is higher if it occurs more frequently (i.e., mutation-based bias and intermediate sample size). Alternatively, our datasets may capture the vast majority of paths to resistance, but due to selection-based bias a greater fraction of them are transitions than transversions.
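The contrast between paths and events under purely mutation-based bias can be illustrated with a small calculation. The sketch below assumes a deliberately simple model that is not taken from the study: every site offers one adaptive transition and two adaptive transversions, each path recurs as a Poisson process, and transitions arise kappa times faster than transversions. The names `kappa` and `exposure` (sampling effort in units of the transversion rate) are hypothetical.

```python
import math


def expected_ratios(kappa: float, exposure: float) -> tuple:
    """Expected transition:transversion ratios among paths and events.

    Assumes one transition and two transversion paths per site, with
    per-path Poisson rates of kappa (transition) and 1 (transversion).
    """
    # Events accumulate in proportion to rate, so exposure cancels out.
    events_ratio = kappa / 2.0

    # A path is observed if it has occurred at least once.
    p_transition_seen = 1.0 - math.exp(-kappa * exposure)
    p_transversion_seen = 1.0 - math.exp(-exposure)
    paths_ratio = p_transition_seen / (2.0 * p_transversion_seen)
    return paths_ratio, events_ratio


# Near-exhaustive sampling: every path has been seen, so the path ratio
# approaches the 1:2 ratio of possible mutations.
paths_exhaustive, events_exhaustive = expected_ratios(kappa=4.0, exposure=50.0)

# Intermediate sampling: frequently arising paths are more likely to have
# been seen, so the path ratio lies between 0.5 and the event ratio.
paths_partial, events_partial = expected_ratios(kappa=4.0, exposure=0.1)
```

In this toy model the event ratio stays at kappa/2 regardless of sampling effort, whereas the path ratio declines toward 0.5 as sampling becomes exhaustive, which matches the qualitative argument above.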

The bias we observed was also prevalent at the level of individual nucleotide changes, with A/T > G/C transitions being particularly common. This is in contrast to earlier evidence that G/C > A/T

In conclusion, our data support the hypothesis that a bias toward transitions plays a key role in determining the genetic changes driving antibiotic resistance evolution.

The Basel dataset

We curated a list of mutations known to confer resistance to one or more of the following drugs or drug classes: isoniazid, ethionamide, rifampicin, ethambutol, pyrazinamide, fluoroquinolones,

excluded mutations in rpsA and ahpC, because these genes were unlikely to confer resistance to pyrazinamide and isoniazid respectively (36, 37). We then added gyrB to the list of pertinent genes, because some mutations in this gene have been shown to lead to fluoroquinolone resistance (38).

We included additional mutations if they met one or more of the following criteria: 1) they have

(Table S1), of which we found 152 in at least one of 9,351 publicly-available MTB genomes (Table S2) (39). These are the mutational paths in the Basel dataset (Table S3).

We determined the mutational events by first calling single nucleotide polymorphisms in the 9,351 genomes and then using the polymorphisms to reconstruct the genomes' phylogeny and to infer mutational gains and losses throughout the phylogeny, as follows. We clipped Illumina

Genomes were excluded if 1) they had an average coverage < 20x, 2) more than 50% of their single nucleotide polymorphisms were excluded due to the strand bias filter, 3) more than 50% of their single nucleotide polymorphisms had a percentage of reads supporting the call between 10% and 90%, or 4) they contained single nucleotide polymorphisms that belong to different MTB

Table 4 in ref (19)), and searched for these polymorphisms in 5,310 Table 5 in ref (19)). We filtered these polymorphisms to only include point mutations, resulting in a dataset of 208 mutational paths. Manson et al. (19) calculated the number of events per mutational path by reconstructing the phylogeny of the 5,310 MTB genomes and using a parsimony-based analysis to determine mutational gains and losses throughout the phylogeny. We used their estimates of the number of events per mutational path, as they were reported in Supplementary

TBDReaMDB is a dataset of 1,178 mutational paths associated with resistance to at least one of 9 antibiotics. We filtered this dataset to only include mutational paths that (1)

We used this dataset, which includes a greater number of mutations than the Basel and Manson datasets, but with less strict inclusion criteria, to study transition bias amongst amino acid changes that can be caused by mutational paths that are transitions or transversions, and that arise from the same codon (Table 2)

Table 2). Some are observed multiple times (at different loci), giving a total of 28 such observations (i.e., the sum of the values in the rightmost column of Table 2). The expected probability of a transition under the null model is (1/ ! ) where q_i is the number of times each mutation (path) is observed at different loci (events). For the m = 6 observed mutational paths, this gives a null transition probability of

MTB has a high GC content, 65.6% genome-wide (48) and 64.4% in the 17 genes associated with resistance in the Basel and Manson datasets. Thus, we expected to see more mutations from G/C or C/G than from A/T or T/A, simply because there are more Gs and Cs in the genes associated with resistance in our datasets. To control for this effect in our calculation of the relative rates of the six possible nucleotide pair mutations, we followed the method of Hershberg and Petrov (1).
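The idea behind this correction can be sketched as follows. The code is a minimal illustration in the spirit of the approach described above, not a reproduction of the published procedure: it assumes that the expected number of mutations from each source base pair is proportional to that base pair's share of sites, and it divides observed counts by this share before normalizing. The function name and the input counts are hypothetical.

```python
# GC content of the resistance-associated genes, as reported in the text.
GC_FRACTION = 0.644

# The six nucleotide pair mutations, keyed to their source base pair.
MUTATION_SOURCES = {
    "A/T>G/C": "AT",  # transition
    "G/C>A/T": "GC",  # transition
    "A/T>T/A": "AT",  # transversion
    "A/T>C/G": "AT",  # transversion
    "G/C>T/A": "GC",  # transversion
    "G/C>C/G": "GC",  # transversion
}


def relative_rates(counts: dict, gc_fraction: float = GC_FRACTION) -> dict:
    """Relative rates of the six mutation types, corrected for GC content.

    Each observed count is divided by the fraction of sites that could
    give rise to it, and the results are normalized to sum to 1.
    """
    site_share = {"AT": 1.0 - gc_fraction, "GC": gc_fraction}
    scaled = {
        mutation: counts.get(mutation, 0) / site_share[source]
        for mutation, source in MUTATION_SOURCES.items()
    }
    total = sum(scaled.values())
    if total == 0:
        raise ValueError("no mutations observed")
    return {mutation: value / total for mutation, value in scaled.items()}


# Hypothetical counts in which every mutation type is observed equally often.
equal_counts = {mutation: 10 for mutation in MUTATION_SOURCES}
rates = relative_rates(equal_counts)
```

With equal raw counts, the mutations that start from A/T pairs receive higher relative rates, because fewer A/T sites were available to produce them.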

Specifically, we first determined the number of mutations of each type we would expect under
