Variation among strains of Borrelia burgdorferi in host tissue abundance and lifetime transmission determine the population strain structure in nature

Pathogen life history theory assumes a positive relationship between pathogen load in host tissues and pathogen transmission. Empirical evidence for this relationship is surprisingly rare due to the difficulty of measuring transmission for many pathogens. The comparative method, where a common host is experimentally infected with a set of pathogen strains, is a powerful approach for investigating the relationships between pathogen load and transmission. The validity of such experimental estimates of strain-specific transmission is greatly enhanced if they can predict the pathogen population strain structure in nature. Borrelia burgdorferi is a multi-strain, tick-borne spirochete that causes Lyme disease in North America. This study used 11 field-collected strains of B. burgdorferi, a rodent host (Mus musculus, C3H/HeJ) and its tick vector (Ixodes scapularis) to determine the relationship between pathogen load in host tissues and lifetime host-to-tick transmission (HTT). Mice were experimentally infected via tick bite with 1 of 11 strains. Lifetime HTT was measured by infesting mice with I. scapularis larval ticks on 3 separate occasions. The prevalence and abundance of the strains in the mouse tissues and the ticks were determined by qPCR. We used published databases to obtain estimates of the frequencies of these strains in wild I. scapularis tick populations. Spirochete loads in ticks and lifetime HTT varied significantly among the 11 strains of B. burgdorferi. Strains with higher spirochete loads in the host tissues were more likely to infect feeding larval ticks, which molted into nymphal ticks that had a higher probability of B. burgdorferi infection (i.e., higher HTT). Our laboratory-based estimates of lifetime HTT were predictive of the frequencies of these strains in wild I. scapularis populations. For B. burgdorferi, the strains that establish high abundance in host tissues and that have high lifetime transmission are the strains that are most common in nature.


Section 1 -Additional Methods
Strains of Borrelia burgdorferi: The strains of Borrelia burgdorferi were obtained from the National Microbiology Laboratory (NML) in Winnipeg, which is part of the Public Health Agency of Canada (PHAC).These strains were originally isolated from questing I. scapularis ticks that were collected at various field sites across Canada [1].To obtain single-strain colonies, tick homogenates were cultured in semi-solid agar (passage 1).For each tick, two colonies were cut out of the agar and grown in vials containing liquid BSK-H (passage 2).The methods for sampling the strains and creating the single-strain colonies are described in [1].Once we had selected the set of B. burgdorferi strains that we wanted to work with, the strains were grown up at the NML (passage 3) before being shipped to the University of Saskatchewan (USask).At USask, we grew each strain in BSK-H medium (passage 4) before needle inoculation into C3H/HeJ mice (see below).Thus, each isolate was passaged 4 times from the original isolation from field-collected I. scapularis ticks to needle-inoculation into mice.
The single strain colonies were subsequently whole genome sequenced by PHAC and the strains were classified according to their multi-locus sequence type (MLST) and ospC type [1].MLST is determined by the allelic identity at eight housekeeping loci: clpA, clpX, nifS, pepX, pyrG, recG, rplB, and uvrA [2,3].The identity of alleles at these eight housekeeping loci and the resultant MLST can be determined using PubMLST [4].The MLST and ospC types for each of the 11 B. burgdorferi strains were published in [1].
Corrections to genetic identity of B. burgdorferi strains: The NML in Winnipeg sent us a manifest that indicated the MLST and the ospC groups for the 11 B. burgdorferi strains, which we reported in [5].In comparing this manifest to the supplementary material of [1], we realized that the manifest contained some errors with respect to the ospC types for two strains, Bb16-167 and Bb16-174.In the present study, we report that the ospC types for strains Bb16-167 and Bb16-174 are M and T (Table 1), respectively, and not C and D, as was reported in [5].
We checked the identity of the B. burgdorferi strains by downloading the original sequence data from [1] and found the same MLSTs as reported by [1].We compared the original ospC sequence data from [1] to the reference sequences listed in [6].For strain Bb16-57-1, we discovered that the ospC gene that had been annotated as type Y in [1], is annotated as type A3 in [6].These two ospC types are easily confused because the reference sequences are >98% similar (Y: HM047875.1;A3: EF592541.1),which is much less than the > 8% criterion for distinguishing among different ospC types [6,7].For simplicity, in the present study, we report that the ospC type for strain Bb16-57-1 is A3 (Table 1) and not Y, as was reported in [1] and [5].
In-house Sanger sequencing of MLST and ospC: We tested the MLST identity for 3 of our 11 strains, Bb16-10-2, Bb16-167, and Bb16-174.We used DNA from the B. burgdorferi strain cultures that were inoculated into the mice to create the I. scapularis nymphs used to experimentally infect the mice in the present study.We PCR-amplified the eight housekeeping genes using an established protocol [2,3] and sent the amplicons for Sanger sequencing.This inhouse sequencing confirmed that the MLSTs of the these 3 cultures were identical to the MLSTs identified by [1].
To further confirm the ospC type of our strains, we used a nested PCR to amplify the ospC gene from various sources including cultures of B. burgdorferi, tissues of experimentally infected mice, engorged larval ticks, and engorged nymphal ticks.The nested PCR protocol for the ospC gene was described in [8], and the purified amplicons were sent for Sanger sequencing.We obtained high quality Sanger sequences for 43 samples including 10 cultures of B. burgdorferi, 3 mouse tissues, 13 engorged larval ticks, and 17 engorged nymphal ticks.Of the 43 ospC sequences, 39 were identical to the ospC type in [1].For strain Bb16-10-2, there was a mismatch in ospC type between [1] and our in-house Sanger sequencing of the ospC amplicons for 4 infected ticks.The whole genome sequence in [1] was annotated as ospC type H, but the forward and reverse Sanger sequences of our 4 infected ticks clustered with ospC type C3.
In an earlier version of this manuscript, we had reported a mismatch in ospC type between [1] and our in-house sequencing for strains Bb16-10-2 and Bb16-57-1.A reviewer suggested that strains Bb16-10-2 and Bb16-57-1 might be mixed strain infections.As mentioned above, Bb16-57-1 had an unambiguous ospC sequence that could be called A3 or Y, so this is not a mixture of strains.Nevertheless, to address the concern of possible mixed strain infections, we amplified the ospC gene for each strain from 24 infected mouse tissues (8 mice x 3 tissues per mouse) and 24 infected unfed nymphs (8 mice x 3 unfed nymphs per mouse; nymphs had taken their larval blood meal from the mice) and sent the amplicons for Sanger sequencing.We obtained 141 high quality Sanger sequences of the ospC gene (69 forward and 72 reverse; 51 from mouse tissues and 90 from unfed nymphs), of which 87 and 54 were from strains Bb16-10-2 and Bb16-57-1, respectively.The ospC sequences were evenly distributed over the 16 mice with a median of 10 sequences per mouse (range = 5 to 10).The ospC sequences were matched to the reference sequences A3_EF592541 and C3_EF592543, and the mean % identical sites was 99.9% (range = 98.8 -100.0%).All 87 ospC sequences from the mice infected with strain Bb16-10-2 clustered with ospC type C3 with a mean % similarity of 99.93% (range = 98.8 -100.0%).All 54 ospC sequences from the mice infected with strain Bb16-57-1 clustered with ospC type A3 with a mean % similarity of 99.95% (range = 99.2-100.0%).In summary, we are very confident that strains Bb16-10-2 and Bb16-57-1 are not mixed infections and that the ospC types of the strains that we used in our study are C3 and A3, respectively and not H and Y, as originally reported in Zinck et al (2022).
Switch of strains Bb16-178-2 and Bb16-178-1: At the start of our study, we had requested strain Bb16-178-2, which according to [1] has MLST 12, ospC type O, and ID 2421 in the PubMLST database.The in-house genetic testing of the strain that we had received found MLST 3 and ospC type K.According to [1], strain Bb16-178-1 has MLST 3 and ospC type K. Strain Bb16-178-1 and strain Bb16-178-2 were isolated from the same adult male I. scapularis tick collected from Lunenberg, Nova Scotia.This suggests that we received strain Bb16-178-1 instead of Bb16-178-2.This error was detected after we started our experimental infection and was pointed out in an earlier article published on this study [5].It is for this reason, that our study contains two strains, Bb16-178-1 and Bb16-198, that are in fact different, but are both MLST 3 and ospC type K.
Larval infestation of mice: Specific pathogen-free Ixodes scapularis larval ticks were purchased from the National Tick Research and Education Resource at Oklahoma State University.At days 30, 60, and 90 post-infection (PI), mice were infested with 50-100 I. scapularis larvae.All mice were anaesthetized for ~20 min with ketamine and xylazine for each infestation, where larvae were brushed onto their fur.During each infestation, mice were housed in their mouse cage with the plastic top removed and placed on risers within a rat cage containing a shallow moat of water.As a precaution, a 1-inch-tall strip of petroleum jelly was put around the inner wall of each rat cage.Engorged larvae climbed out of the mouse cage and fell into the water moat of the rat cage where they were easily recovered.At 5 days post-infestation, most larvae were in the water moat, and no larvae were found in the bedding by 7 days postinfestation.At 7-8 days post-infestation, the mice were housed under normal conditions.For each mouse and infestation, 5 engorged larvae were frozen immediately at -80 ºC following recovery (this was not done for the first infestation in block 1; it was done for infestations 2 and 3 in block 1 and for infestations 1, 2, and 3 in block 2).The remaining engorged larvae were placed into 50 mL Falcon tubes where they were allowed to moult into nymphs.These tubes had perforated lids and a double layer of nylon stocking to allow air exchange.The centrifuge tubes were placed in sealed clear plastic containers that had a layer of standing ultrapure water (Milli-Q) to ensure a high relative humidity (~100%).Four weeks after the engorged larvae moulted into nymphs, a sample of 10 nymphs per mouse per infestation were frozen at -80 ºC for later DNA extraction.
DNA extraction of whole Ixodes scapularis ticks: For each mouse, there were 45 ticks (3 infestations * (5 larvae + 10 nymphs) per infestation) for a total of 5,040 tick DNA extractions (112 mice x 45 ticks per mouse = 5,040 ticks).Ticks were homogenized using bead beating with the Qiagen TissueLyser II.First, ticks were placed into 5 µL of nuclease free water at the bottom of a 2.0 mL MP BIO disruption tube.These were then frozen for 1-2 hrs at -80 ºC before adding a 1.6 mm stainless steel bead, and refreezing inverted at -80 ºC for a minimum of 24 hours.This allows the tick to freeze in place at the center of the tube bottom but keeps the bead from freezing to the tick.The ticks were then disrupted with the TissueLyser II with two runs at 30 Hz for 45 seconds, flipping the two racks of tubes in between runs.DNA was extracted from the homogenized ticks using the Qiagen DNEasy 96-well plate extraction kit and following the manufacturer's instructions.The disrupted ticks were extracted following the kit guidelines with some modifications.The plate extraction was done in batches of 192 samples per run.Each run included 8 no sample controls (i.e., with no DNA substrate added), and experimental controls (ticks from uninfected control mice).The proteinase K digestion step was done overnight in the MP BIO tubes without removing the disruption bead in a shaking incubator (56 ºC, 400 rpm).Samples were then moved into the Qiagen kit plates for the remainder of the extraction.All extractions were eluted into a final volume of 65 µL and stored at -80 ºC prior to use in qPCR.
Statistical software: We used R version 4.0.4 with RStudio version 1.4.1103 for all statistical analyses [9,10].We used the lmer() function in the lme4 package to run the LMMs, and glmer() to run the GLMMs [11].The Anova() function in the car package was used to run the Wald tests of statistical significance [12].Post-hoc analysis and calculation of estimated marginal means were done using emmeans(), pairs(), pwpm() functions in the emmeans package [13] and ggemmeans() and ggpredict() in the ggeffects package [14].Correlations were calculated using the cor.test() function in the base package.Correlation matrices and correlation plots were calculated using the rcorr() function in the Hmisc package [15].

Section 2 -Synthetic standards used in qPCR
Synthetic 23S rRNA gene standard: The presence and abundance of B. burgdorferi in the I. scapularis ticks was estimated using real-time PCR that targeted the 23S rRNA gene of B. burgdorferi and was previously described [16].To translate the quantification cycle (Cq) values produced by the 23S rRNA qPCR to estimates of the abundance of B. burgdorferi requires a standard curve.We created this standard curve by repeatedly performing the 23S rRNA qPCR on a dilution series of a synthetic 23S rRNA gene of known concentrations.The synthetic standard was purchased as a gblock from IDTDNA.The standard was based on the B. burgdorferi B31 reference sequence (Genbank Accession number: AE000783.1)to encompass the region targeted by the 23S rRNA primers (Table S1).A total sequence length of 181 nucleotides (nt) was used to ensure that the forward and reverse primer sites were not at the ends of the gblock where degradation could impact binding.To be able to check for contamination, the 23S rRNA gene sequence was modified from the reference genome to allow for two methods of differentiation between the 23S rRNA gene of B. burgdorferi and the synthetic standard.A total of 30 nt were inserted in the synthetic fragment between the forward and reverse priming sites to allow for product discrimination on an agarose gel; the amplicon lengths of the synthetic standard (105 nt) and the 23S rRNA gene of B. burgdorferi strain B31 (75 nt) differ by 30 nucleotides.A BamHI restriction enzyme cut site was included in the 30-nucleotide insert so that a restriction digest would cleave the synthetic standard, but not a copy of the real 23S rRNA gene.The gblock was initially suspended in Buffer AE (Qiagen DNEasy kit) with 0.1 mg/mL tRNA (Sigma-Aldrich: 10109517001).This was done so that the standard was in the same suspension buffer as all the experimental samples.The tRNA was added to reduce the impact of DNA adsorption to the walls of the qPCR tubes in dilutions with a low concentration of gblock 23S rRNA.
Table S1.The sequences used in creating the 23S rRNA synthetic standard as well as the primer and probe sequences for the 23S rRNA qPCR.All sequences are reported in the 5'-3' direction.The 23S rRNA forward primer and the 23S rRNA reverse primer are highlighted in yellow and green, respectively.The 30-nucleotide insert sequence and the BamHI cut site are shown in red and purple, respectively.

Synthetic tick calreticulin gene standard:
The abundance of the tick genome in the tick DNA extractions was estimated using a real-time PCR that targeted the calreticulin gene of I. scapularis.A synthetic gblock standard was created for the calreticulin gene of I. scapularis ticks following the same rationale as above.This standard was based on the reference sequence (Genbank Accession number: AY690335.1)to encompass the region targeted by the calreticulin primers (Table S2).The total sequence length of the synthetic tick calreticulin gene is 177 nt.Like the 23S rRNA standard, the internal sequence of the calreticulin standard was modified by adding a 33-nucleotide insert that contained a BamHI cut site.This 33-nt insert will allow us to differentiate the standard from a real copy of the calreticulin gene.The calreticulin standard was suspended in the same manner as the 23S rRNA standard.Table S2.The sequences used in creating the synthetic tick calreticulin standard as well as the primer and probe sequences for the tick calreticulin qPCR.All sequences are reported in the 5'-3' direction.The calreticulin forward primer and the calreticulin reverse primer are highlighted in yellow and green, respectively.The 33-nucleotide insert sequence and the BamHI cut site are shown in red and purple, respectively.

Infection prevalence in the mouse tissues:
The tissue infection prevalence and the tissue spirochete loads of the 84 infected C3H/HeJ mice in this study have been published elsewhere [5].To facilitate interpretation, we briefly present here the infection prevalence and the spirochete loads of the 3 ear biopsies and the 7 necropsies.The ear biopsies were sampled from the right ear on days 29, 59, and 89 post-infection (PI), whereas the 7 necropsies were sampled on day 97 PI.The 7 necropsies were the ventral skin, left ear, right ear, heart, bladder, joint, and kidney.The infection status and spirochete load of each mouse tissue were determined using qPCR targeting the 23S rRNA gene.The spirochete loads were standardized relative to a million mouse Beta-actin gene copies.These standardized spirochete loads were transformed as follows: log10[(23S rRNA copies/10^6 Beta-actin copies) + 1].qPCR is an exponential process and log transformation normalizes the residuals and gives better estimates of the mean.One problem with log transformation is that it cannot handle uninfected samples that have zero 23S rRNA copies.For this reason, we used the log[X + 1] transformation, where all uninfected values have a value of zero (i.e., log[0 + 1] = log [1] = 0).For each mouse tissue, we calculated the percentage of infected tissues and the mean spirochete load with a 95% confidence interval on the log10 scale and the original scale.In Zinck et al (2022), the mean spirochete loads were calculated over the subset of infected tissues (i.e., uninfected tissues were excluded).For this reason, the mean tissue spirochete loads in the present study are different from Zinck et al (2022).
The infection prevalence in the biopsies from the right ear on days 29, 59, and 89 PI were 98.8%, 100.0%, and 91.7%, respectively (Table S3).For the necropsies on day 97 PI, the infection prevalence varied among tissue types.Skin-related tissues, such as the ventral skin, left ear, and right ear had low infection prevalence with 52.4%, 38.1%, and 14.3%, respectively.In contrast, the heart, bladder, and joint had high infection prevalence with 92.9%, 97.6%, and 71.4%, respectively.An unexpected result is that the infection prevalence in the right ear changed from 91.7% in the biopsy on day 89 PI to 14.3% in the necropsy on day 97 PI.In other words, over the course of 8 days, the infection prevalence in the right ear decreased 6.4-fold.
Spirochete load in the mouse tissues: The mean spirochete loads for the mouse tissues are shown in Table S3.The mean spirochete load was expressed as log10[(23S rRNA copies/10^6 Beta-actin copies) + 1] for each mouse tissue (Mean1 in Table S3).The advantage of the log[X + 1] transformation is that it allows uninfected samples to be defined on the log scale; these samples have a value of 0 (i.e., log[X + 1] = log[0 + 1] = log[1] = 0).Thus, for each mouse tissue, the mean is based on all 84 mice and includes both infected and uninfected tissues.The mean log[X + 1] values were back-transformed to indicate the number of 23S rRNA copies/10^6 Beta-actin copies (Mean 2 in Table S3).
For the biopsies of the right ear, the mean spirochete load decreased over the course of infection.The mean spirochete load of biopsy 1 on day 29 PI (2878.5 23S rRNA copies/10^6 Beta-actin copies) was 3.5x higher compared to the mean spirochete load of biopsy 3 on day 89 PI (812 23S rRNA copies/10^6 Beta-actin copies).For the necropsies on day 97 PI, there were dramatic differences in mean spirochete load.The right ear had the lowest spirochete load (1.3 23S rRNA copies/10^6 Beta-actin copies) and the heart had the highest spirochete load (1537.8(1.3 23S rRNA copies/10^6 Beta-actin copies).If the right ear is taken as the reference tissue, the mean spirochete load in the kidney, left ear, ventral skin, joint, bladder, and heart is 3.2x, 8.0x, 14.3x, 32.6x, 76.9x, 824.2x, and 1182.9xhigher, respectively.Thus, the mean spirochete load in the skin-related organs (ventral skin, left ear, right ear) was orders of magnitude lower compared to the mean spirochete load in the internal organs (bladder, and heart).Again, we point out the remarkable change in spirochete load in the right ear from ear biopsy 3 on day 89 PI (812 23S rRNA copies/10^6 Beta-actin copies) to the necropsy on day 97 PI (1.3 23S rRNA copies/10^6 Beta-actin copies).In other words, over the course of 8 days, the mean spirochete load in the right ear decreased 624.6-fold.
Explanation for decrease in infection prevalence and spirochete load in right ear: The right ear was sampled with a biopsy on day 89 PI and a necropsy on day 97 PI.Over the course of 8 days, the infection prevalence of the right ear decreased 6.4-fold and the spirochete load of the right ear decreased 624.6-fold.The mice were injected with ketamine and xylazine on day 90 PI before being infested with I. scapularis larvae.A recent study has demonstrated that ketamine kills and inhibits the growth of B. burgdorferi in vitro [17].Thus, one explanation is that the ketamine injection on day 90 PI was responsible for the 6.4-fold decrease in infection prevalence and for the 624.6-fold decrease in spirochete load in the right ear from day 89 PI to day 97 PI.Table S3.Infection prevalence and mean spirochete load in the mouse tissues.For each of the 10 mouse tissues, the following are shown: number of infected tissues and the total number of tissues tested (Inf/Total), % of infected tissues, mean spirochete load on the log[X + 1] scale (Mean1), standard deviation (SD1), and standard error (SE1; based on sample size of 84), and lower limit (LL1) and upper limit (UL1) of the 95% confidence interval.Section 4 -Analyses of the larval infection prevalence (LIP)

GLMM of LIP:
The larval infection prevalence (LIP) is the proportion of engorged larvae infected with B. burgdorferi.The analysis of the LIP was performed on all the engorged larvae (n = 950 engorged larvae) that fed on the subset of infected mice (n = 84 mice).The response variable was the infection status for each engorged larva as measured by 23S rRNA qPCR.This binomial response variable (0 = uninfected, 1 = infected) was modelled using a generalized linear mixed effect model (GLMM) with binomial errors.The fixed factors were strain (11 levels), mouse sex (2 levels: male, female), infestation (3 levels: infestation 1, infestation 2, infestation 3), and temporal block (2 levels: A and B).Tick calreticulin gene copy number was not included in the model.The full model included all the two-way and the threeway interactions between strain, sex, and infestation.Temporal block was included as a main effect only.Mouse identity was included as a random factor.Non-significant interaction terms (at α = 0.05) were removed sequentially (Table S4).The final simplified model included no interactions (Table S4).
Strain (p = 1.346*10 -15 ), infestation (p = 3.359*10 -5 ), and temporal block (p = 8.088*10 - 10 ) but not mouse sex (p = 0.085) had significant effects on the LIP (Table S4).The parameter estimates for the final simplified model are shown in Table S5.The mean LIP for the levels of a given factor were calculated from the raw data by averaging over the levels of the other factors (Table S6).Estimated marginal means (EMMs) of the LIP and post-hoc tests were generated using the ggemmeans() function and the final simplified model (Table S6).The means from the raw data and the EMMs were similar (Table S6).The Pearson correlation between the means and the EMMs of the LIP across the 18 levels was positive and significant (Mean vs EMM in Table S6; r = 0.977, t = 18.36, df = 16, p = 3.563e -12).In what follows, we present the EMMs of the LIP and their post-hoc tests (Table S6).
Among strains of B. burgdorferi and averaged across mouse sex, infestation, and block, the mean LIP ranged from 32.89% for strain 54 to 97.10% for strain 178-1.All strains, except strain 54, had a significantly higher LIP compared to the reference strain 66 (Table S5).Averaged across B. burgdorferi strains, infestation, and block, the EMM LIP for male mice (79.72%) was higher compared to female mice (73.56%), but this difference was not significant (p = 0.085).Averaged across B. burgdorferi strains, mouse sex, and block, the EMM LIP for infestation 1 (88.96%) was significantly higher compared to infestation 2 (65.88%; p < 0.0001) and infestation 3 (69.94%;p = 0.0002).The EMM LIP for block B (86.48%) was significantly higher compared to block A (63.10%; p < 0.0001).
GLMM of LIP that exclude infestation 1 or block A: For infestation 1, engorged larvae were collected for block B, but not block A. We therefore re-analyzed the LIP for the data set that excluded infestation 1.The results were like Table S4; strain (p = 1.302*10 -13 ) and block (3.777*10 -09 ) remained highly significant, but infestation was no longer significant (p = 0.259).This result was expected because the significant effect of infestation was driven by differences between infestation 1 versus the other two infestations.We also re-analyzed the LIP for the data set that excluded block A. The results were like Table S4; strain (p = 1.031*10 -10 ) and infestation (4.123*10 -5 ) remained highly significant.In summary, the significant effects of infestation and temporal block on the LIP were independent of the uneven sampling of engorged larvae from infestation 1 between temporal blocks A and B.

Differences between temporal blocks:
The difference in LIP between the two temporal blocks (63.10% in block A versus 86.48% in block B) was unexpected.Importantly, as the main factors of interest, B. burgdorferi strain and mouse sex, were perfectly orthogonal with respect to block, differences between blocks do not influence differences between strains or the sexes.We also emphasize here that there was no effect of temporal block on the NIP and NSL.Thus, whatever factors caused the differences in LIP and LSL between temporal blocks, their importance disappeared once the larvae moulted into nymphs.

LMM of tick calreticulin:
We used an LMM to compare the tick calreticulin gene copy number between blocks.This analysis found that engorged larvae from block B contained significantly more calreticulin genes compared to engorged larvae in block A (p < 2e-16).One explanation is that there was variation in DNA extraction efficiency between the two blocks.We view this explanation as unlikely because all larvae had their DNA extracted at the same time.A second explanation is variation in quality of I. scapularis larvae that we received from Oklahoma State University.This laboratory colony is periodically outbred with wild I. scapularis ticks to maintain genetic variation.Over the 2-year duration of this study, we performed at least 8 different larval infestations, and we observed substantial variation in the quality and vigour of the larvae.Some batches of larvae were lively and motivated to feed, whereas other batches lacked vigour.In summary, variation among batches of larvae may have caused the observed differences in tick calreticulin gene copy number, LIP, and LSL.Importantly, these differences among blocks had no effect on the NIP and NSL.Table S4.Model simplification of the GLMM of LIP.Non-significant interactions were sequentially removed in order of increasing significance; the model was updated each time a non-significant term was removed.The final simplified model contained the fixed factors only.For each term, the order in which it was removed, the Chi-squared statistic (Chi), degrees of freedom, and p-value (p) are shown.For the non-significant interactions, the p-values are the ones in the model from which they were removed.For the included terms, the p-values are from the final simplified model.

Term
Order b Reference condition refers to engorged larvae from infestation 1 that fed on a male mouse in block A that was infected with B. burgdorferi strain 66.
Table S6.The means and EMMs of the LIP for each of the levels of the factors of interest.The LIP is the percentage of engorged larvae infected with B. burgdorferi.The means are from the raw data, whereas the EMMs were calculated using the final simplified model in Table S4.The mean or EMM of a given level was averaged over the levels of the other factors.Lower and upper (LCL, UCL) 95% confidence limits for the EMM are reported.

LMM of the LSL:
The larval spirochete load (LSL) is the number of 23S rRNA copies per infected engorged larvae.The analysis of the LSL was performed on the subset of engorged larvae that were infected with B. burgdorferi (n = 645 infected engorged larvae) that fed on the subset of infected mice (n = 84 mice).The LSL was log10-transformed to normalize the residuals.The log10 23S rRNA copies per engorged larva was modelled using a linear mixed effect model (LMM).The fixed factors were strain (11 levels), mouse sex (2 levels: male, female), infestation (3 levels: infestation 1, infestation 2, infestation 3), and temporal block (2 levels: A and B).The log10 calreticulin copies per tick was included as a covariate.The full model included all the two-way and three-way interactions between strain, sex, and infestation.Temporal block was included as a main effect only.Mouse identity was included as a random factor.The three-way interaction between strain, sex, and infestation was significant (p = 0.010), but none of the two-way interactions were significant (Table S7).A comparison of the AIC values found that the main effects model had an AIC value (1633.779)that was 36.4 units lower compared to the full interaction model (1670.186).For this reason, the parameter estimates and EMMs are presented for the more parsimonious main effects model (i.e., does not include any interactions; Table S8).
Strain (p = 5.968*10 -12 ), infestation (p = 4.773*10 -11 ), temporal block (p = 6.236*10 -05 ), and log10 calreticulin (p = 2.2*10 -16 ), but not mouse sex (p = 0.317) had significant effects on the LSL (Table S7).The parameter estimates for this final simplified model and their significance are shown in Table S8.Means and EMMs (using the final model) were generated for each factor level (Table S9).The means from the raw data and the EMMs were similar (Table S9).The Pearson correlation between the means and the EMMs of the LSL across the 18 levels was positive and significant (Mean1 vs EMM1 in Table S9; r = 0.944, t = 11.494,df = 16, p = 3.829e-09).The units of the LSL are the number of 23S rRNA gene copies per infected engorged larva.In what follows, we present the EMMs of the LSL and their post-hoc tests (EMM2 in Table S9).
The EMM LSL varied 11.8-fold among strains from 728.5 units for strain 54 to 15635.5 units for strain 178-1.Larvae that fed on female mice had an EMM LSL (4094.7 units) that was 23.3% higher compared to larvae that fed on male mice (3320.7 units), but this difference was not significant (p = 0.322).The EMM LSL of infestation 1 (8502.2units) was significantly higher compared to infestation 2 (2142.2units; p < 0.0001) and infestation 3 (2752.9units; p < 0.0001).Finally, the EMM LSL of larvae in block B (5808.5 units) was 2.5x higher compared to block A (2340.9 units; p = 0.0001).
The LSL had a positive and significant relationship with the number of calreticulin gene copies in the tick (Table S8: slope = 0.735, SE = 0.058, p < 2.2*10 -16 ).Thus, DNA extractions from engorged larvae that had higher levels of calreticulin also had higher levels of 23S rRNA (Figure S1).One methodological explanation for this result is variation in DNA extraction efficiency; high efficiency samples with more DNA will have more calreticulin and more 23S rRNA gene copies.A biological explanation for this result is variation in larval size; larger larvae (as estimated by a higher abundance of calreticulin) may acquire more spirochetes during their blood meal [18].A separate analysis of the ratio of the 23S rRNA gene copy number to the tick calreticulin copy number provided very similar results (see below).
GLMM of LSL that exclude infestation 1 or block A: For infestation 1, engorged larvae were collected for block B, but not block A. We therefore re-analyzed the LSL for the data set that excluded infestation 1.The results were like Table S7; strain (p = 1.015*10 -06 ) and block (1.280*10 -4 ) remained highly significant, but infestation was no longer significant (p = 0.146).This result was expected because the significant effect of infestation was driven by differences between infestation 1 versus the other two infestations.We also re-analyzed the LSL for the data set that excluded block A. The results were like Table S7; strain (p = 1.412*10 -13 ), infestation (1.125*10 -11 ), and log10(calreticulin) (2.2*10 -16 ) remained highly significant.In summary, the significant effects of infestation and temporal block on the LSL were independent of the uneven sampling of engorged larvae from infestation 1 between temporal blocks A and B. Table S7.Model simplification of the LMM of the LSL.Non-significant interactions were sequentially removed in order of increasing significance; the model was updated each time a term was removed.The final simplified model contained the fixed factors only.For each term, the order in which it was removed, the Chi-squared statistic (Chi), degrees of freedom (df), and p-value (p) are shown.For the non-significant interactions, the p-values are the ones in the model from which they were removed.For the included terms, the p-values are from the final simplified model.

Term
Order

Analysis of LSL standardized to calreticulin:
In the previous analysis of LSL, the tick calreticulin gene copy number was included as a covariate.In the present analysis, the response variable was the ratio of 23S rRNA gene copies per million tick calreticulin gene copies per tick (LSLCAL).This ratio was log10 transformed to standardize the residuals.As before, this analysis was performed on the subset of engorged larvae that were infected with B. burgdorferi (n = 645 engorged larvae) that fed on the subset of infected mice (n = 84 mice).The LSLCAL was modelled using an LMM.The fixed factors were strain (11 levels), mouse sex (2 levels: male, female), infestation (3 levels: infestation 1, infestation 2, infestation 3), and temporal block (2 levels: A and B).The full model included all the two-way and three-way interactions between strain, sex, and infestation.Temporal block was included as a main effect only.Mouse identity was included as a random factor.As before, the three-way interaction between strain, sex, and infestation was significant (p = 0.013), but none of the two-way interactions were significant (Table S10).For simplicity, we interpreted the parameter estimates for the main effects model (i.e., all interactions were removed).
Strain (p = 1.462*10 -12 ), infestation (p = 1.229*10 -13 ), and temporal block (p = 6.565*10 - 04 ), but not mouse sex (p = 0.406) had significant effects on the LSLCAL (Table S10).The parameter estimates for this final simplified model and their significance are shown in Table S11.The correlation in the parameter estimates of the 14 contrasts between Table S8 and Table S11 is positive and highly significant (r = 0.996, df = 12, t = 36.703,p = 1.073E-13).Thus, the results from these two analyses are very similar regardless of whether tick calreticulin gene copy number is included as an explanatory covariate in the analysis of the LSL (Table S7, S8) or in the denominator of the LSLCAL response variable (Table S10, Table S11).Table S10.Model simplification of the LMM of the LSLCAL.Non-significant interactions were sequentially removed in order of increasing significance; the model was updated each time a term was removed.The final simplified model contained the fixed factors only.For each term, the order in which it was removed, the Chi-squared statistic (Chi), degrees of freedom (df), and p-value (p) are shown.For the non-significant interactions, the p-values are the ones in the model from which they were removed.For the included terms, the p-values are from the final simplified model.

Term
Order b Reference condition refers to engorged larvae from infestation 1 that fed on a male mouse in block A that was infected with B. burgdorferi strain 66.
Section 6 -Analyses of the nymphal infection prevalence (NIP)

GLMM of NIP:
The nymphal infection prevalence (NIP) is the proportion of unfed nymphs infected with B. burgdorferi.The analysis of the NIP was performed on all the unfed nymphs (n = 2471 unfed nymphs) that fed as larvae on the subset of infected mice (n = 84 mice).The response variable was the infection status for each nymph as measured by 23S rRNA qPCR.This binomial response variable (0 = uninfected, 1 = infected) was modelled using a GLMM with binomial errors.The fixed factors were strain (11 levels), mouse sex (2 levels: male, female), infestation (3 levels: infestation 1, infestation 2, infestation 3), and temporal block (2 levels: A and B).The full model included all the two-way and the three-way interactions between strain, sex, and infestation.Temporal block was included as a main effect only.Mouse identity was included as a random factor.Non-significant, or non-functional interaction terms were removed sequentially (Table S12).Significant interactions were removed if they had standard errors that were orders of magnitude greater than the parameter estimates.Such problem interactions usually have marginally significant p-values (i.e., 0.01 < p < 0.05).Retention of such problem interactions in the model results in biased estimates of the EMMs.
The interaction between strain and sex (p = 0.022) and between strain and infestation (p = 0.017) were examples of problem interactions.Although these interactions were both statistically significant according to the Wald chi-square tests, inspection of the parameter estimates found that the contrasts for some strains had standard errors that were an order of magnitude larger than the parameter estimates.For this reason, these interactions were removed from the model.The final simplified model included the main effects and the interaction between sex and infestation (Table S12).The interaction between sex and infestation had a significant effect on the NIP (1.57*10 -4 ).Strain (p = 1.835*10 -14 ), mouse sex (p = 9.76*10 -4 ), infestation (p = 2.2*10 -16 ), but not temporal block (p = 0.568) had significant effects on the NIP (Table S12).
The parameter estimates for the final simplified model and their significance are shown in Table S13.The mean NIP for the levels of a given factor were calculated from the raw data by averaging over the other levels (Table S14).Estimated marginal means (EMMs) of NIP were generated for each factor level using the ggemmeans() function and the final simplified model (Table S14).The means from the raw data and the EMMs were similar (Table S14).The Pearson correlation between the means and the EMMs of the NIP across the 24 levels was positive and significant (Mean vs EMM in Table S14; r = 0.953, t = 14.806, df = 22, p = 6.379e -13).In what follows, we present the EMMs of the NIP and their post-hoc tests (Table S14).
The interaction between sex and infestation refers to the fact that the difference in NIP between males and females changes over the successive infestations (Figure 2 in the main manuscript).For infestation 1, males have a lower EMM NIP compared to females (98.29% vs 99.23%, p = 0.095) but the difference is not significant.Males have a significantly higher NIP compared to females for infestation 2 (93.22% vs 86.00%, p = 0.008) and infestation 3 (91.33%vs 76.45%, p < 0.0001).
The relationship between NIP and infestation is shown for each of the 11 B. burgdorferi strains in Figure 1 in the main manuscript and in Figure S2.In this study, the NIP is synonymous with host-to-tick transmission (HTT), but the latter term has as stronger connotation of being a pathogen life history trait.For this reason, the NIP is labelled as host-to-tick transmission (HTT) in Figure S2. Figure S2 shows the estimated marginal means and their 95% confidence intervals that were calculated for the 33 combinations of strain and infestation using the final simplified model and the ggemmeans() function.
Table S12.Model simplification of the GLMM of the NIP.Non-significant interactions were sequentially removed in order of increasing significance; the model was updated each time a term was removed.Significant interactions were removed if they had standard errors that were orders of magnitude greater than the parameter estimates.The final simplified model contained the fixed factors and the interaction of sex and infestation.For each term, the order in which it was removed, the Chi-squared statistic (Chi), degrees of freedom (df), and p-value (p) are shown.For the non-significant interactions, the p-values are the ones in the model from which they were removed.For the included terms, the p-values are from the final simplified model.b Reference condition refers to engorged larvae from infestation 1 that fed on a male mouse in block A that was infected with B. burgdorferi strain 66.
Table S14.The means and EMMs of the NIP for each of the levels of the factors of interest.The NIP is the percentage of unfed nymphs infected with B. burgdorferi (these nymphs took their larval blood meal from the experimentally infected mice).The means are from the raw data, whereas the EMMs were calculated using the final simplified model in Table S12 Figure S2.The lifetime host-to-tick transmission (HTT) of B. burgdorferi from infected mice to immature I. scapularis ticks decreased over the 3 successive larval infestations for some strains but remained constant over time for other strains.For this reason, the 2-way interaction between strain and infestation was significant (p = 0.017).Lifetime HTT was measured by infesting mice with I. scapularis larvae on 3 occasions (days 30, 60, and 90 post-infection), allowing the engorged larvae to moult into nymphs, and testing the nymphs for infection with B. burgdorferi.The mean values (green circles) for each infestation are the EMMs from the GLMM for the HTT of each strain over time.Error bars are the 95% confidence intervals for the EMMs.
Section 7 -Analyses of the nymphal spirochete load (NSL) LMM of the NSL: The nymphal spirochete load (NSL) is the number of 23S rRNA copies per infected unfed nymph.The analysis of the NSL was performed on the subset of unfed nymphs that were infected with B. burgdorferi (n = 2,084 infected unfed nymphs) and that fed on the subset of infected mice (n = 84 mice).The NSL was log10-transformed to normalize the residuals.The log10 23S rRNA copies per unfed nymph was modelled using an LMM.The fixed factors were strain (11 levels), mouse sex (2 levels: male, female), infestation (3 levels: infestation 1, infestation 2, infestation 3), and temporal block (2 levels: A and B).The full model included all the two-way and the three-way interactions between strain, sex, and infestation.Temporal block was included as a main effect only.Mouse identity was included as a random factor.Non-significant interactions (p > 0.05) were removed sequentially (Table S15).
The final simplified model included the main effects and the interactions between strain and infestation, and sex and infestation (Table S15).The interaction between strain and infestation (p = 0.0001) and the interaction between sex and infestation (p = 0.026) both had significant effect on the NSL.Strain (p = 2.2*10 -16 ) and infestation (p = 1.01*10 -11 ), but not mouse sex (p = 0.988) and temporal block (p = 0.182), had significant effects on the NSL (Table S15).
The parameter estimates for this final simplified model and their significance are shown in Table S16.Means and EMMs (using the final model and the ggemmeans() function) of the NSL were generated for each factor level (Table S17).As there were significant interactions between strain and infestation, and sex and infestation, EMMs for the combinations of these factors were also estimated (Table S17).The means from the raw data and the EMMs were similar (Table S17).The Pearson correlation between the means and the EMMs of the NSL across the 57 levels was positive and significant (Mean1 vs EMM1 in Table S17; r = 0.997, t = 102.32,df = 55, p < 2.2e-16).The units of the NSL are the number of 23S rRNA gene copies per infected unfed nymph.In what follows, we present the EMMs of the NSL and their post-hoc tests (EMM2 in Table S17).
The EMM NSL varied 8.9-fold among strains from 7665.2 units for strain 66 to 68086.7 units for strain 174.All the strains had a higher NSL compared to reference strain 66 (Table S16).The EMM NSL of infestation 1 (29631.2units) was 59.5% higher compared to infestation 2 (18580.3units; p < 0.0001) and 86.0% higher compared to infestation 3 (15934.6units; p < 0.0001).The interaction between strain and infestation indicated that the NSL changed over the three infestations for some strains, but not for others (Figure S3).
Nymphs that had fed as larvae on male mice had an EMM NSL (20936.3units) that was 3.0% higher compared to nymphs that had fed as larvae on female mice (20317.1 units), but this difference was not significant (p = 0.768).The interaction between sex and infestation refers to the fact that the difference in NSL between males and females changes over the successive infestations.For infestation 1, males have a lower EMM NSL compared to females (26270.5 vs 33421.9),but males have a higher EMM NSL compared to females for infestation 2 (19525.2vs 17681.1)and infestation 3 (17891.0vs 14192.1).Finally, the EMM NSL of larvae in block B (22017.2 units) was similar compared to block A (19319.7 units; p = 0.198).Table S15.Model simplification of the LMM of the NSL.Non-significant interactions were sequentially removed in order of increasing significance; the model was updated each time a term was removed.For each term, the order in which it was removed, the Chi-squared statistic (Chi), degrees of freedom (df), and p-value (p) are shown.For the non-significant interactions, the p-values are the ones in the model from which they were removed.For the included terms, the p-values are from the final simplified model.

Term
Order b Reference condition refers to engorged larvae from infestation 1 that fed on a male mouse in block A that was infected with B. burgdorferi strain 66.
Table S17.The means and EMMs of the NSL for each of the levels of the factors of interest.The NSL is the number of 23S rRNA gene copies per infected unfed nymph (i.e., uninfected nymphs are excluded).The means are from the raw data, whereas the EMMs were calculated using the final simplified model in Table S15.The mean or EMM of a given level was averaged over the levels of the other factors.Mean 1 and EMM1 are on the log10-transformed scale, whereas Mean 2 and EMM2 are on the original scale.For factors with significant interactions, the means and EMMs are reported for the relevant combinations of factors.LCL and UCL represent the lower and upper 95% confidence intervals reported on the same scale.Section 8 -Relationship between the mouse tissue spirochete loads and NIP Statistical Methods: We wanted to determine which mouse tissues were most strongly associated with variation in the NIP.There were 8 mouse tissues: ear biopsies that were sampled the day before the larval infestation (i.e., days 29, 59, and 89 PI) and 7 necropsies that were sampled at the end of the study (day 97 PI).Each mouse tissue had an estimate of infection prevalence (0 = uninfected, 1 = infected) and an estimate of spirochete load, which was calculated as log10[(23S rRNA gene copies/10^6 mouse Beta-actin copies) + 1].The advantage of this log[X + 1] transformation is that uninfected tissues are defined and have a value of 0. The mouse tissue spirochete loads were scaled to z-scores with mean of 0 and a variance of 1 (hereafter referred to as the z-score spirochete loads).We created 4 additional composite variables: (1) mean proportion of infected necropsies, (2) z-score spirochete load averaged over all necropsies, (3) z-score spirochete load averaged over the three skin-related tissues (ventral skin, left ear, right ear), (4) z-score spirochete load averaged over two internal organs (bladder, heart).
The 20 different mouse tissue variables are highly correlated, and a multiple regression approach was therefore not feasible.For this reason, we used AIC-based model selection to compare 20 different models; each model was a GLMM with binomial errors that analysed the NIP as a function of a different mouse tissue variable.Each model included the fixed factors of mouse sex, infestation, and temporal block, and all interactions between the mouse tissue variable, mouse sex, and infestation.As variation in NIP among strains was low in infestation 1, we expected that spirochete load (or infection prevalence) would explain more variation in NIP for infestations 2 and 3 (i.e., we expect interactions between mouse tissue variable and infestation).Mouse identity was included as a random factor.Models were ranked according to their corrected Akaike Information Criterion (AICc) where the best models have the lowest AICc scores.The difference in AICc from the top model was used to calculate the weight or support for each model.If a model has high support, it means that the mouse tissue variable is important for explaining variation in the NIP.
Results: The top 5 GLMMs of the NIP had 99.2% of the support (Table S18).These 5 models included the following 5 mouse tissue variables: heart spirochete load (30.7% support), bladder spirochete load (29.7% support), average of heart and bladder (26.7%), ear biopsy (7.5%), and heart infection prevalence (4.7%).Thus, GLMMs that included the internal organs of the heart and/or the bladder had 91.7% of the support.
GLMM of NIP versus the heart spirochete load: For the GLMM of the NIP as a function of the heart spirochete load, we simplified the model by removing non-significant terms.The simplified model contained the main effects of heart tissue spirochete load, mouse sex, infestation, and their interaction (Table S19).The heart spirochete load had a significant effect on the NIP (p < 0.001).We investigated the parameter estimates for the simplified model (Table S20).The slope of the relationship between the NIP and the heart spirochete load was positive and highly significant (slope = 0.751, SE = 0.164, p < 0.001).This result indicates that B. burgdorferi strains that establish higher spirochete loads in the heart tissues have higher hostto-tick transmission to feeding ticks (Figure S4).
For each strain, we calculated the mean heart spirochete load over the mice (up to 8 mice per strain) and the mean NIP (hereafter the lifetime NIP).The lifetime NIP for each strain was calculated by summing the number of infected nymphs and the number of tested nymphs over all the infestations and all the mice.A GLM with quasibinomial errors found a strong positive relationship between heart spirochete load and lifetime NIP across the 11 B. burgdorferi strains (slope = 0.890, SE = 0.214, t = 4.152, p = 0.002).
GLMM of NIP versus the ear biopsy spirochete load: For the GLMM of the NIP as a function of the ear biopsy spirochete load, we simplified the model by removing non-significant terms.The simplified model contained the main effects of ear biopsy tissue spirochete load, mouse sex, infestation, and the 2-way interactions between ear biopsy spirochete load and infestation and between sex and infestation (Table S21).The significant ear biopsy spirochete load: infestation interaction (p = 0.002) indicates that the slope relating NIP to ear biopsy spirochete load differs between infestations.We investigated the parameter estimates for the simplified model (Table S22).The slope of the relationship between the NIP and the ear biopsy spirochete load was negative for infestation 1 (slope = -0.790,SE = 0.278, p = 0.004), whereas it was positive for infestation 2 (contrast of slopes = 1.261,SE = 0.407, p = 0.002) and for infestation 3 (contrast of slopes = 1.011,SE = 0.300, p = 0.001).This result indicates that B. burgdorferi strains that establish higher spirochete loads in the ear tissues have higher host-totick transmission to feeding ticks for infestation 2 at day 60 PI and for infestation 3 at day 90 PI (Figure S5).
For each strain, we calculated the mean ear biopsy spirochete load over the mice and the 3 infestations (up to 24 ear biopsies per strain; 8 mice x 3 infestations per mouse) and the mean NIP (hereafter the lifetime NIP).The lifetime NIP for each strain was calculated by summing the number of infected nymphs and the number of tested nymphs over all the infestations and all the mice.A GLM with quasibinomial errors found a strong positive relationship between ear biopsy spirochete load and lifetime NIP across the 11 B. burgdorferi strains (slope = 2.357, SE = 0.467, t = 5.052, p < 0.001).

Table S18. AIC-based model selection table is shown for 20
GLMMs with binomial errors that analyzed the NIP.Each model contained a different mouse tissue variable that was either an infection prevalence or a spirochete load.Each model also included the fixed factors of mouse sex, infestation, temporal block, various interactions, and the random factor of mouse ID.For each model, shown are the model rank, mouse tissue, variable type (infection prevalence or spirochete load), model degrees of freedom (df), corrected Akaike information criterion (AICc), change in the AICc value from the top model (delta), weight of model (weight1) and cumulative weight of the models (weight2).

Model Rank Mouse tissue
Variable df AICc delta weight1 weight2 b Reference condition refers to the infection status of unfed nymphs from infestation 1 that fed as larvae on a female mouse for which the heart tissue spirochete load = 0 (which is the mean).
Table S21.Model simplification of the GLMM of the NIP as a function of ear biopsy spirochete load.Non-significant interactions were removed from the model.The final simplified model contained the covariate of ear biopsy spirochete load, the fixed factors of sex, infestation, and the ear biopsy spirochete load:infestation interaction and the sex:infestation interaction.For each term, the Chi-squared statistic (Chi), degrees of freedom (df), and p-value (p) are shown.b Reference condition refers to the infection status of unfed nymphs from infestation 1 that fed as larvae on a female mouse for which the ear biopsy tissue spirochete load = 0 (which is the mean).

Figure S4
. GLMM predicts a positive relationship between NIP and heart spirochete load.Larval infestations 1, 2, and 3 are shown in separate panels.Female and male mice are shown with pink and blue lines, respectively.The original heart spirochete loads were calculated as log10[(23S rRNA copies + 1)/10^6 Beta-actin copies] and these were converted to z-scores with a mean of 0 and a variance of 1.

Figure S5
. GLMM predicts a negative relationship between NIP and ear biopsy spirochete load for infestation 1 and a positive relationship for infestations 2 and 3. Larval infestations 1, 2, and 3 are shown in separate panels.Female and male mice are shown with pink and blue lines, respectively.The original ear biopsy spirochete loads were calculated as log10[(23S rRNA copies + 1)/10^6 Beta-actin copies] and these were converted to z-scores with a mean of 0 and a variance of 1.
For the genetic correlation matrix, the four estimates of host-to-tick transmission were all positively correlated with each other (Figure S6).Spirochete loads of the ear biopsies, heart, bladder, and kidney were positively correlated with the NIP and the NSL.Spirochete loads of the ear biopsies, heart, and bladder were positively correlated with the LIP and the LSL.The results for the phenotypic correlation matrix were very similar (Figure S7).Strains Bb16-66 and Bb16-54, which corresponded to MLSTs 237 and 741, were not detected, and were therefore assigned counts of zero.For each study, we counted the number of I. scapularis ticks infected with each of the 10 unique MLSTs of interest.To obtain the best estimate of the strain-specific frequencies in I. scapularis populations, the counts were combined for the two studies.These counts were then converted to the strain-specific frequencies.

Study by Travinsky et al (2010):
In Travinsky et al (2010), the MLST, ospC type, geographic region, and the number of ticks infected with a particular MLST and ospC type are given in columns 8, 1, 3, and 5 (denominator) of Table 1.For each of our 10 unique MLSTs, we searched for a matching MLST (without regard to ospC type) in Table 1 of Travinsky et al (2010).Of the 11 strains, 9 strains (highlighted in green in Table S23) had a matching MLST and 2 strains did not (Bb16-54 and Bb16-66; highlighted in blue in Table S23).These two 'rare' strains were assigned a count of 0 in Table S23.The number of I. scapularis ticks in which each MLST was found (N) is given in Table S23.
We wanted to obtain an estimate of the sampling effort for the northeastern USA and the midwestern USA (referred to as north-central USA in Travinsky et al (2010)).We removed all strains from northern California and Europe (n = 123 ticks), which left us with 654 ticks, of which 254, 257, and 143 ticks occurred in the northeastern USA, midwestern USA, or both regions, respectively.The 143 ticks that carry the MLSTs that occur in both regions were split among each region; the sampling effort was very similar between the northeastern USA (n = 254 + 72 = 326 ticks) and the midwestern USA (n = 257 + 71 = 328 ticks).
Of the 9 strains with a matching MLST, 7 strains (highlighted in green in Table S23) had a matching ospC type and 2 strains did not (Bb16-10-2 and Bb16-57-1; highlighted in pink in Table S23).There was also a match in geographic region between the present study and  2. For each of our 10 unique MLSTs, we searched for a matching MLST in Table 2 in Ogden et al (2011).Of the 11 strains, 7 strains (highlighted in green in Table S24) had a matching MLST and 4 strains did not (Bb16-10-2, Bb16-54, Bb16-66, and Bb16-126; highlighted in blue in Table S24).These four 'rare' strains were assigned a count of 0 in Table S24.The number of I. scapularis ticks in which each MLST was found (N) is given in Table S24.
In Ogden et al (2011), the MLSTs in the 166 ticks were classified according to whether they had already been identified in northeastern USA (n = 150 ticks), midwestern USA (n = 4 ticks), or both regions (n = 12 ticks).The 12 ticks that carry the MLSTs that occur in both regions were split among each region; the sampling effort was much higher in the northeastern USA (n = 150 + 6 = 156 ticks) compared to the midwestern USA (n = 4 + 6 = 10 ticks).
Table S24.The 11 B. burgdorferi strains in our study were matched via their MLST to the number of I. scapularis ticks carrying that strain in Canada in the study by Ogden et al (2011).For each of the 11 strains, the NML strain ID, region in Canada from which the strain was sampled, MLST, and ospC type are shown.With respect to the study by Ogden et al (2011) Estimation of the MLST-specific frequencies: Our study contained 10 unique MLSTs because strains 178-1 and 198 both had MLST 3.For each MLST, we summed the counts for the studies by Travinsky et al (2010) and by Ogden et al (2011).The total number of I. scapularis ticks in which these 10 MLSTs were detected (N1) was 253 ticks.The total number of I. scapularis ticks sampled from the northeastern and midwestern USA for the two studies combined (N2) was 820 ticks.When N1 is split among the two regions (N3), there are 144 and 109 ticks in the northeastern and midwestern USA, respectively.When N2 is split among the two regions (N4), there are 482 and 338 ticks in the northeastern and midwestern USA, respectively.
The calculation of the strain-specific frequencies is shown in Table S25.Frequency 1 (%) of each MLST was calculated by dividing the strain-specific count by N1 (n = 253) and multiplying by 100.For example, frequency 1 for MLST 3 is 100*(94/253) = 37.15%.Frequency 2 (%) of each MLST was calculated by dividing the strain-specific count by N2 (n = 820) and multiplying by 100.For example, frequency 2 for MLST 3 is 100*(94/820) = 11.46%.Frequency 3 (%) of each MLST was calculated by dividing the strain-specific count by N3 (n = 109 for midwestern USA; n = 144 for northeastern USA) and multiplying by 100.For example, frequency 3 for MLST 3 is 100*(94/144) = 65.28%.Frequency 4 (%) of each MLST was calculated by dividing the strain-specific count by N4 (n = 338 for midwestern USA; n = 482 for northeastern USA) and multiplying by 100.For example, frequency 4 for MLST 3 is 100*(94/482) = 19.50%.Frequencies 1 and 3 are calculated over the subset of the 10 MLSTs of interest, whereas frequencies 2 and 4 are calculated over all the MLSTs in the two regions of interest.Frequencies 1 and 2 use a common denominator, which does not account for potential differences in sampling effort between regions, whereas frequencies 3 and 4 account for potential differences in sampling effort between regions.
Table S25.Estimation of the frequencies of the 11 strains of B. burgdorferi in I. scapularis ticks in North America.The counts (number of ticks infected with the MLST of interest) for the 10 unique MLSTs and the other MLSTs are shown separately for the study by Travinsky et al (2010) and by Ogden et al (2011).Other MLSTs that belonged to both regions were split among the two regions as explained above.The counts were summed across the two studies (Sum).Frequencies 1, 2, 3, and 4 were calculated using 4 different denominators: N1, N2, N3, and N4.We tested whether there was a relationship between the lifetime NIP and the MLST frequencies observed in I. scapularis ticks in North America across the 11 B. burgdorferi strains.The lifetime NIP was calculated by summing the infected and uninfected nymphs over all infestations and mice for each of the 11 B. burgdorferi strains.We analyzed the frequencies that included the other MLSTs when the two regions were combined (i.e., MLST frequency 2 in Table S25), and when the two regions were adjusted for sampling effort (i.e., MLST frequency 4 in Table S25).MLST frequency was analysed using a generalized linear model (GLM) with quasibinomial errors to correct for overdispersion.Explanatory variables included the fixed factor geographic region (2 levels: northeastern USA and midwestern USA), the covariate of the strain-specific lifetime NIP, and their interaction.The model was simplified by sequential removal of the least significant terms.

Strain
For MLST frequency 2, the interaction between geographic region and lifetime NIP was not significant and was removed from the model (Chi-square = 0.002, df = 1, p = 0.962).Geographic region did not have a significant effect on MLST frequency 2 and was removed from the model (Chi-square = 0.001, df = 1, p = 0.970).Lifetime NIP had a significant effect on MLST frequency 2 (Chi-square = 10.771,df = 1, p = 0.001).The slope of the relationship between NIP and MLST frequency 2 was positive (Figure S7; slope = 9.685, SE = 3.724, t = 2.601, p = 0.029).If strain Bb16-10-2 is removed from the analysis (due to a suspected mixed infection), the relationship becomes more statistically significant (slope = 12.778, SE = 4.039, t = 3.164, p = 0.013).Strains with higher lifetime host-to-tick transmission had a higher frequency in I. scapularis populations in nature.
For MLST frequency 4, the interaction between geographic region and lifetime NIP was not significant and was removed from the model (Chi-square = 0.002, df = 1, p = 0.968).Geographic region did not have a significant effect on MLST frequency 4 and was removed from the model (Chi-square = 0.308, df = 1, p = 0.578).Lifetime NIP had a significant effect on MLST frequency 4 (Chi-square = 7.407, df = 1, p = 0.006).The slope of the relationship between NIP and MLST frequency 4 was positive (Figure S8; slope = 8.708, SE = 3.942, t = 2.209, p = 0.0545).If strain Bb16-10-2 is removed from the analysis (due to a suspected mixed infection), the relationship becomes more statistically significant (slope = 11.851,SE = 3.860, t = 3.070, p = 0.015).After adjusting for differential sampling effort between the two geographic regions, strains with higher lifetime host-to-tick transmission had a higher frequency in I. scapularis populations in nature.

Figure S7.
Our laboratory-based estimates of the lifetime host-to-tick transmission for the 11 strains of B. burgdorferi are positively related to the frequencies of these strains in natural populations of I. scapularis ticks in Canada and the USA.The strain-specific frequencies were combined both the northeastern and midwestern USA and did not adjust for differential sampling effort between regions (i.e., frequency 2 in Table S25).The slope of the logistic regression is positive and significant (slope = 9.685, SE = 3.724, t = 2.601, p = 0.029).The error bars represent the 95% confidence intervals.burgdorferi are positively related to the frequencies of these strains in natural populations of I. scapularis ticks in Canada and the USA.The strain-specific frequencies were calculated separately for the northeastern and midwestern USA and therefore adjusted for differential sampling effort between regions (i.e., frequency 4 in Table S25).The slope of the logistic regression is positive and just above the threshold of statistical significance (slope = 8.708, SE = 3.942, t = 2.209, p = 0.055).The error bars represent the 95% confidence intervals.S26.Infection status of the ear biopsies and necropsies for the 11 strains of B. burgdorferi.For each strain, the NML strain ID, geographic region from which the strain was sampled, multi-locus sequence type (MTSL), number of infected mice, number of infected ear biopsies per total number of ear biopsies tested, and the percentage of infected ear biopsies are shown.The 7 necropsy tissues include the ventral skin, left ear, right ear, heart, bladder, kidney, and rear tibiotarsal joint.For each strain and necropsy type, the number of mice for which the necropsy was infected with that strain are shown.Table S27.Mean spirochete load of the ear biopsies and necropsies for the 11 strains of B. burgdorferi.For each strain, the NML strain ID, geographic region from which the strain was sampled, multi-locus sequence type (MTSL), and number of infected mice are shown.The mouse tissues are the ear biopsies and the 7 necropsies, which include the ventral skin, left ear, right ear, heart, bladder, kidney, and rear tibiotarsal joint.For each strain the spirochete load was calculated as log10[(23S rRNA/ 10^6 Beta actin) + 1].The mouse tissue spirochete load (MTSL) refers to the mean of the subset of infected tissues.The MTSL is dominated by the heart and bladder because these tissues have higher spirochete loads than the other necropsies.The MTSL was used as the strain-specific estimate of mouse tissue spirochete load in [5].

Figure S1 .
Figure S1.The larval spirochete load (LSL) in engorged larvae increased with the tick calreticulin load.The LSL was measured as the log10 23S rRNA copies per tick.The calreticulin load was measured as the log10 calreticulin copies per tick.Each point represents an individual infected engorged larva (n = 645).The line represents the simple linear regression (slope = 0.657, r = 0.38, p = 2.2*10 -16 ).

Figure S3 .
FigureS3.The nymphal spirochete load (NSL) of B. burgdorferi from infected mice to immature I. scapularis ticks decreased over the 3 successive larval infestations for some strains but remained constant or increased over time for other strains.For this reason, the 2-way interaction between strain and infestation was significant (p = 7.561*10 -5 ).NSL was measured by infesting mice with I. scapularis larvae on 3 occasions (days 30, 60, and 90 post-infection), allowing the engorged larvae to moult into nymphs, and testing the nymphs for infection with B. burgdorferi and measuring the abundance of the B. burgdorferi 23S rRNA gene in each tick.The mean values (green circles) for each infestation are the EMMs from the LMM for the NSL of each strain over time.Error bars are the 95% confidence intervals for the EMMs.
Spirochete load in ear biopsy (z−score) Nymphal infection prevalence

Figure S8 .
Figure S8.Our laboratory-based estimates of the lifetime host-to-tick transmission for the 11 strains of B. burgdorferi are positively related to the frequencies of these strains in natural populations of I. scapularis ticks in Canada and the USA.The strain-specific frequencies were calculated separately for the northeastern and midwestern USA and therefore adjusted for differential sampling effort between regions (i.e., frequency 4 in TableS25).The slope of the logistic regression is positive and just above the threshold of statistical significance (slope = 8.708, SE = 3.942, t = 2.209, p = 0.055).The error bars represent the 95% confidence intervals.
to−tick transmission (%) Strain frequency 4 in ticks (%)Table ), temporal block (2 levels: A and B).For each parameter in the LMM, the type, description, estimate, standard error (SE), and p-value (p) are shown.
. The mean or EMM of a given level was averaged over the levels of the other factors.Lower and upper (LCL, UCL) 95% confidence limits for the EMM are reported.For factors with significant interactions, the means and EMMs are reported for the relevant combinations of factors.

Table S19 .
Model simplification of the GLMM of the NIP as a function of heart spirochete load.Non-significant interactions were removed from the model.The final simplified model contained the covariate of heart spirochete load, the fixed factors of sex, infestation, and their interaction.For each term, the Chi-squared statistic (Chi), degrees of freedom (df), and p-value (p) are shown.

Table S20 .
Parameter estimates of the GLMM of the NIP as a function of heart spirochete load.Explanatory variables include heart spirochete load, mouse sex (2 levels: male, female), infestation (3 levels: infestation 1, infestation 2, infestation 3), and their interaction.For each parameter in the GLMM, the type, description, estimate, standard error (SE), and p-value (p) are shown.

Table S22 .
Parameter estimates of the GLMM of the NIP as a function of ear biopsy spirochete load.Explanatory variables include ear biopsy spirochete load, mouse sex (2 levels: male, female), infestation (3 levels: infestation 1, infestation 2, infestation 3), and their interactions.For each parameter in the GLMM, the type, description, estimate, standard error (SE), and pvalue (p) are shown.

burgdorferi strains in I. scapularis ticks in North America:
[19,20] correlation matrix for the 12 variables across the 11 strains of B. burgdorferi.The 12 variables include spirochete loads in 8 mouse tissues: (1) ear biopsy, (2) ventral skin, (3) left ear, (4) right ear, (5) heart, (6) bladder, (7) kidney, (8) joint, and four estimates of host-totick transmission of B. burgdorferi: (9) larval infection prevalence (LIP),(10)larval spirochete load (LSL),(11)nymphal infection prevalence (NIP), and (12) nymphal spirochete load (NSL).The size and color of the circle indicate the absolute value and direction of the Pearson correlation coefficient.Non-significant pairwise correlations are indicated with an 'X'.(11)nymphal infection prevalence (NIP), and (12) nymphal spirochete load (NSL).The size and color of the circle indicate the absolute value and direction of the Pearson correlation coefficient.Non-significant pairwise correlations are indicated with an 'X'.Section 10 -Estimation of MLST frequencies in I. scapularis ticks in North America.Two previous studies determined the frequencies of B. burgdorferi MLSTs in questing I. scapularis ticks collected in the USA and Canada[19,20].The number of I. scapularis ticks infected with each MLST are reported in Table1in Travinsky et al (2010) and Table 2 in Ogden et al (2011).After removing the MLSTs and ticks from northern California and western Europe, the study by Travinsky et al (2010) identified 35 different MLSTs in 654 I. scapularis ticks.The study by Ogden et al (2011) identified 40 different MLSTs in 166 I. scapularis ticks.These two studies combined contained 9 of the 11 strains and 8 of the 10 unique MLSTs in the present study.

Table S23 .
Travinsky et al (2010).All 5 strains that were obtained from Nova Scotia (East Canada) occurred in the northeastern United States (Region 1) inTravinsky et al (2010).Of the 6 strains that were sampled from Manitoba and western Ontario (Midwest Canada), 4 occurred in the north-central United States (Region 2) inTravinsky et al (2010).The 11 B. burgdorferi strains in our study were matched via their MLST to the number of I. scapularis ticks carrying that strain in the USA in the study byTravinsky et al (2010).For each of the 11 strains, the NML strain ID, region in Canada from which the strain was sampled, MLST, and ospC type are shown.With respect to the study by Travinksy et al (2010), the MLST, ospC type, geographic region, and the number of I. scapularis ticks in which that combination of MLST and ospC type was found (N) are shown.

Study by Ogden et al (2011): In
Ogden et al (2011), the MLST, count, and geographic region are given in columns 1, 2, and 3 in Table , the MLST, geographic region, and the number of I. scapularis ticks in which that MLST was found (N) are shown.
Ogden et al (2011)011), NE and MW refer to the fact that these MLSTs had already been identified in the northeastern United States and the midwestern United States, respectively.
Relationship between laboratory estimates of lifetime NIP and frequencies observed in I. scapularis ticks across 11 B. burgdorferi strains.

Table S28 .
Region MLST Mice Biopsy Ventral Ear left Ear right Heart Bladder Kidney Joint MTSL Transmission variables for the 11 strains of B. burgdorferi.The 11 B. burgdorferi strains, the geographic region from which the strains were sampled, and their multi-locus sequence types (MLST) are shown.For each strain, the number of infected mice, number of infected larvae per total larvae tested, larval infection prevalence (LIP), larval spirochete load 1 (LSL) larval spirochete load 2 (LSL2), nymphal infection prevalence (NIP), nymphal spirochete load 1 (NSL) and nymphal spirochete load 2 (NSL2) are shown.The LSL and NSL are calculated as log10(23S rRNA copies + 1) and includes the uninfected ticks, which have a value of 0. The LSL2 and NSL2 are calculated as log10(23S rRNA copies) and excludes the uninfected ticks.