Pre-Columbian Origins for North American Anthrax

Disease introduction into the New World during colonial expansion is well documented and had a major impact on indigenous populations; however, few diseases have been associated with early human migrations into North America. During the late Pleistocene epoch, Asia and North America were joined by the Beringian Steppe ecosystem which allowed animals and humans to freely cross what would become a water barrier in the Holocene. Anthrax has clearly been shown to be dispersed by human commerce and trade in animal products contaminated with Bacillus anthracis spores. Humans appear to have brought B. anthracis to this area from Asia and then moved it further south as an ice-free corridor opened in central Canada ∼13,000 ybp. In this study, we have defined the evolutionary history of Western North American (WNA) anthrax using 2,850 single nucleotide polymorphisms (SNPs) and 285 geographically diverse B. anthracis isolates. Phylogeography of the major WNA B. anthracis clone reveals ancestral populations in northern Canada with progressively derived populations to the south; the most recent ancestor of this clonal lineage is in Eurasia. Our phylogeographic patterns are consistent with B. anthracis arriving with humans via the Bering Land Bridge. This northern-origin hypothesis is highly consistent with our phylogeographic patterns and rates of SNP accumulation observed in current day B. anthracis isolates. Continent-wide dispersal of WNA B. anthracis likely required movement by later European colonizers, but the continent's first inhabitants may have seeded the initial North American populations.


Introduction
The basic premises of disease tracking have changed little since John Snow first described the London cholera epidemic of 1854. The use of molecular genotyping technologies has allowed the epidemiological linkage of geographically disparate isolates, generating hypotheses about patterns and modes of disease dispersal. As might be expected, the distribution of human pathogens that cause persistent infections, such as Helicobacter pylori, the Typhi serovar of Salmonella enterica, Mycobacterium tuberculosis and Polyomavirus JC, reflects both recent and ancient human migratory patterns [1][2][3][4][5][6][7]. Conversely, pathogens that cause acute infections remain only briefly within a host and are therefore less likely to follow long-term host distribution patterns [3]. The dispersal of Bacillus anthracis, Yersinia pestis, and human RNA viruses often reflects short-term human movement, frequently associated with trading contaminated animal products or inadvertently transporting primary vectors or hosts [1][3][8][9][10][11][12]. Such potentially frequent and long-range dispersal of pathogens can obscure more ancient phylogeographic patterns. The history of B. anthracis in North America has certainly been affected by recent trade and livestock movement [13]; however, here we present evidence that the introduction of this pathogen can be traced to much more ancient human migrations. We believe this to be an example of an opportunistic human pathogen reflecting ancient human dispersal patterns.
The recent and dramatic increase in the capacity for extensive genomic sampling through whole genome sequencing, coupled with extensive strain collections, should enhance our ability to reconstruct even ancient epidemiological events. The strictly clonal reproductive patterns and low polymorphism frequency of evolutionarily stable molecular markers in B. anthracis make this a model organism for tracking ancient epidemiological patterns. Whole genome sequencing of multiple B. anthracis strains has led to the construction of a highly accurate phylogenetic backbone [14] based upon an expansive world-wide strain collection [13]. Whereas it would be advantageous to sequence all available isolates within a lineage, this approach is still prohibitively expensive and requires SNP detection by whole genome comparisons. Targeting characters and taxa within specific lineages can further enhance the detection of evolutionary patterns which, when combined with sample spatial data, enables precise epidemiological tracking of disease, even for pre-historical events.
B. anthracis has dispersed globally via large and sequential radiations associated with human commerce and trade of animal products contaminated with B. anthracis spores [13]. Without human involvement, infected animals typically die within 7-10 days, seeding only the surrounding soil with spores thus keeping the spread of the disease relatively contained [15]. The potential for dispersion even among migratory herds is limited since infected animals typically die quickly before extensive dispersal can occur. Historically, an animal that died of anthrax was scavenged by people for its hair, hide, bones, and even consumed as food, facilitating the dispersal of spores away from a carcass. Indeed, imported spore-contaminated animal hides account for many of the recent US human anthrax cases [16], though such modern cases infrequently result in subsequent ecological establishment or further dispersal. Therefore, with the exception of the most recent human cases, the current distribution of B. anthracis can be traced to historical human dispersion, trade, and migratory patterns.
The most dramatic dispersal and clonal expansion of B. anthracis was the A-radiation [13], which is phylogenetically rooted in the Old World. Nested within the A-radiation is the highly successful trans-Eurasian subpopulation (TEA). Its prevalence in Europe and Asia is thought to be mediated by the east-west human trade routes, such as the "Silk Road". One sublineage of this TEA population, western North America (WNA), was introduced into North America and has become highly successful within this geographic region. The WNA sublineage is dominant today in central Canada and much of the western United States.
In North America, two distinct types of anthrax cases are seen. Many have been observed along the East Coast and are associated with trade and industrial processing of contaminated animal products, often wool in textile mills [17][18][19]. These cases contribute to the overall genetic diversity of North American B. anthracis isolates, but generally represent small case clusters that do not become ecologically founded. This lack of establishment could be due to a requirement of suitable habitat for natural disease cycling [20]. In contrast, western North American grasslands are ideal for the ecological establishment of anthrax, which may have persisted there for much of the Holocene epoch, possibly over 10,000 years [21].
Indeed, at least two B. anthracis clades are ecologically established in North America on a sub-continental scale [13]. The "Ames" clade (A.Br.Ames) has been associated with highly localized and sporadic outbreaks in south Texas since at least the early 1980s [22]. Only a short evolutionary time period, represented by very few SNPs, separates Texas Ames isolates from their Asian near relatives, suggesting a recent or perhaps colonial animal importation. In contrast, the WNA clade has been widely successful both in distribution and frequency across central and northerly North American regions and is clearly ecologically established in many geographic areas. WNA isolates have been recovered from near the Arctic Circle in Canada to the U.S.-Mexican border and even in insular Haiti, and account for 89% of non-human anthrax cases in North America. The WNA clade also exhibits greater genetic diversity than the Ames clade, and a longer evolutionary separation (106 SNPs) from its nearest Old World relatives (TEA), suggesting a more ancient introduction into North America. The ecological dominance and disease importance of the WNA clade led us to examine its evolutionary history in greater detail using whole genome sequence analysis and highly accurate phylogenetic reconstructions.

SNP Analyses
Whole genome sequencing of seven diverse strains led to the discovery of 2,850 SNPs suitable for conversion to whole genome tiling microarrays. One of these seven strains was the WNA strain (A0193) [14]. These SNPs were screened among 128 diverse isolates (described by MLVA15 analyses) and identified WNA as a monophyletic group rooted in the Old World TEA group. These data also showed the WNA group to be separated by a long phylogenetic branch (106 SNPs) (Fig. 1) representing 53% of the total distance from the initiation of the A branch radiation to the sequenced WNA isolate. Ten of these 106 SNPs were developed into real-time PCR assays and used to screen all 387 isolates in the study. These SNPs identified six sub-clades within the previously described WNA lineage.

Phylogenetic analyses
SNP characters are rare in the B. anthracis genome and almost never have character state reversals (<0.1% homoplasy) [14]. Using a cladistic evolutionary model, there was one character-state inconsistency in these data (<1% homoplasy). Given such robust data, all approaches (e.g., MP, NJ) to construct phylogenetic trees generate the same evolutionary hypothesis (Fig. 1). Including any B. anthracis strain from outside this lineage allows us to accurately root this lineage using standard outgroup rooting methods.

Geographical mapping of WNA isolates
The six sub-clades identified by SNP analyses revealed a north-south phylogeographic pattern (Fig. 1), with short terminal branches indicating rapid radiation of the WNA group following the initial establishment in northern Canada. Additionally, there were correlations among the nodes identified with recovery from their respective hosts. For example, the yellow group was mostly associated with wildlife (n = 63) whereas the other groups were almost exclusively from livestock.

Discussion
SNPs are stable phylogenetic characters in B. anthracis and impart a highly accurate evolutionary hypothesis both in terms of branch lengths and branching order [14]. Our phylogenetic topologies are highly suggestive of an initial introduction of B. anthracis into the far north of North America and subsequent southerly dispersal (Fig. 1). The ancestral WNA phylogenetic nodes are northernmost, followed by progressively more southern locations for the more recently derived clades. More recent populations in southern states and a single human isolate from Haiti (not shown) are probably the result of recent commodity trading or infected livestock transport. In contrast, the northernmost and more ancient populations of WNA B. anthracis remain relatively localized, possibly due to association with bison (Bison bison) and restricted human commerce in this region. SNP-estimated evolutionary branch lengths provide additional support for a pre-Columbian introduction of WNA anthrax to North America. The accumulation of up to 106 SNPs since the split of the WNA lineage from the TEA lineage is a relatively large number for B. anthracis clades [14]. Interestingly, this indicates a long evolutionary separation followed by a genetic bottleneck and a single founding event in North America. In contrast, the Texas ecologically-established Ames lineage accumulated only ∼8 SNPs since introduction from Asia [23]. Molecular clock calibration is always problematic, but this >10-fold difference offers dramatic contrast between a recent and ancient New World introduction. Moreover, divergence times within branches of the A-radiation may be greatly affected by the number of generations in natural disease cycling. The model introduced in Van Ert et al. (2007) was based upon 1 and 0.5 generations per year (approximately 3500 to 7000 ybp respectively for the major A-radiation).
However, a recent overview of early outbreaks in Northern Canada provides greater insights into natural disease cycling in this region. Dragon and Elkin [24] noted that there were eight sporadic outbreaks between 1962 and 1991, and that there have been five more since 1993. Using these empirical data we can estimate 0.28 generations per year (13 outbreaks in 46 years). This empirically based estimate significantly increases divergence times within the TEA-WNA lineage and suggests that the divergence times provided in Van Ert et al. (2007) were underestimates.
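The effect of the revised generation rate can be illustrated with a short calculation. The sketch below assumes that divergence-time estimates scale inversely with generations per year, which is consistent with the two Van Ert et al. (2007) figures quoted above (approximately 3500 ybp at 1 generation per year and 7000 ybp at 0.5) but is not a restatement of their model. The function names are hypothetical, and the rescaled value refers to the major A-radiation estimate, not to a dated TEA-WNA split.

```python
def generations_per_year(outbreaks: int, years: int) -> float:
    """Empirical cycling rate: one outbreak is treated as one generation."""
    return outbreaks / years


def rescale_divergence(ybp_at_one_gen_per_year: float, gens_per_year: float) -> float:
    """Rescale a divergence time calibrated at 1 generation per year.

    Assumption (not from the source model): time scales as 1 / rate.
    """
    return ybp_at_one_gen_per_year / gens_per_year


# Eight outbreaks (1962-1991) plus five since 1993, over 46 years.
rate = generations_per_year(13, 46)
print(f"generations per year: {rate:.2f}")

# Compare the two published calibrations with the empirical rate.
for g in (1.0, 0.5, rate):
    print(f"{g:.2f} gen/yr -> about {rescale_divergence(3500, g):.0f} ybp")
```

Under this inverse-scaling assumption, the slower empirical rate pushes the estimate well beyond the 7000 ybp upper figure, which is the direction of change argued for in the text.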
Human migrations are the most likely source for the introduction and establishment of the WNA lineage of anthrax in North America through an ice-free corridor that connected Beringia to the southern areas east of the Rocky Mountains. Contiguous grassland habitat created by the partial retreat of the Laurentide and Cordilleran ice sheets would have been ideal for anthrax-susceptible grazing herd animals. During the late Pleistocene epoch, Asia and North America were joined by the Beringian Steppe ecosystem [25]. This grassland refugium allowed animals and humans to freely cross what would become a water barrier (Bering Sea) in the Holocene. Humans could have transported B. anthracis to this land bridge from Asia and then moved it further south as the ice-free corridor and developing grassland opened in central Canada ∼13,000 ybp. Bison and other potential herding hosts were already widespread in North America, yet the limited range of the most ancient northern WNA B. anthracis populations suggests that anthrax was not present at this time south of the bottleneck created by the coalescing glaciers. Furthermore, as the ice-free corridor expanded, there was a simultaneous northward movement of grazers such as bison, yet the phylogenetic directionality of anthrax spread is southerly. Humans, however, did move southward through this corridor and could have brought contaminated animal products with them, which eventually enabled the ecological introduction of WNA B. anthracis in North America.
This northern-origin hypothesis is highly consistent with phylogeographic patterns and rates of SNP accumulation observed in recent B. anthracis isolates. While isolates from many different lineages have been observed in North America, their presence can be attributed to post colonial trade. Continent-wide dispersal of WNA B. anthracis may have involved later European colonizers, but some of the first inhabitants of the continent likely seeded the initial North American populations.

Strains
The 387 strains (Table 1) used in this study were obtained from various sources and linked to both livestock and wildlife outbreaks in North America. Four of these isolates were associated with human infections. Of the 387 strains, 352 cluster with the major ecologically established Western North American clade (WNA: A.Br.WNA) whereas 17 isolates were included to represent the TEA population as an outgroup. Genotypic data for each strain are provided in Table 2.

SNP Discovery
Seven diverse B. anthracis strains, including one from the WNA clade, which had previously been characterized by Multi-Locus Variable Number of Tandem Repeats Analyses (MLVA), were further characterized by shotgun cloning and whole genome sequencing by the Sanger method. These efforts led to the discovery of ∼3,000 SNPs within the B. anthracis genome.

Construction of a whole genome tiling microarray
A custom Affymetrix genotyping microarray was constructed using 2,850 SNPs. We genotyped 128 diverse strains using this format; relevant to this study, 10 strains were from the trans-Eurasian population (TEA: A.Br.008/009) and 21 were from the Western North American clade (WNA: A.Br.WNA). Of the 2,850 SNPs, 78 separated TEA from WNA, while 28 split the WNA clade into 6 genotypes (Fig. 1).
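The partition of SNPs into those separating TEA from WNA and those splitting the WNA clade can be sketched as a simple classification over allele states. This is a minimal illustration with made-up strain names and a five-locus toy matrix; it is not the microarray analysis pipeline used in the study, and the function name `classify_snps` is hypothetical.

```python
from typing import Dict, List


def classify_snps(tea: Dict[str, str], wna: Dict[str, str]) -> Dict[str, List[int]]:
    """Classify SNP loci by where their variation falls.

    tea / wna map strain name -> allele string, one character per locus.
    A locus is 'between' if each group is fixed for a different state,
    and 'within_wna' if WNA strains carry more than one state.
    """
    n_loci = len(next(iter(wna.values())))
    between: List[int] = []
    within: List[int] = []
    for i in range(n_loci):
        tea_states = {alleles[i] for alleles in tea.values()}
        wna_states = {alleles[i] for alleles in wna.values()}
        if len(wna_states) > 1:
            within.append(i)
        elif len(tea_states) == 1 and tea_states != wna_states:
            between.append(i)
    return {"between": between, "within_wna": within}


# Toy data (hypothetical strains and alleles, for illustration only).
tea_toy = {"TEA_1": "AACGT", "TEA_2": "AACGT"}
wna_toy = {"WNA_1": "GACGA", "WNA_2": "GATGA", "WNA_3": "GATCA"}

result = classify_snps(tea_toy, wna_toy)
print(result)

# The counts reported in the text sum to the TEA-WNA branch length.
assert 78 + 28 == 106
```

In the toy matrix, loci 0 and 4 separate the two groups and loci 2 and 3 vary within WNA; locus 1 is invariant and falls in neither class.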

Real-Time PCR analyses
We selected 10 SNPs for conversion into TaqMan® MGB dual-probe real-time PCR SNP assays (Table 3). Assays were designed using Primer Express (Applied Biosystems, Foster City, CA). This is a highly sensitive technology that is fast and cost-effective when analyzing hundreds of samples. Table 4 describes the associated phylogenetic groups for each SNP: Branch separating TEA from WNA (4 SNPs) and WNA subtypes (6 SNPs). These rapid PCR assays were used to genotype a total of 352 isolates from the WNA clade (Table 2). Most of the WNA members had been previously identified using the canonical SNP assay for A.Br.009 [13].

Table 3. Primers and Probes used to detect WNA clade specific SNPs. The locus name correlates to the position on the whole genome sequence of the Ancestral Ames strain (AE017334). F refers to the Forward primer whereas R refers to the Reverse primer. Allele states are designated in red. VIC is the fluor conjugated to the 5′ end of the probe for one allele whereas FAM is the fluor conjugated to the 5′ end of the alternate allele.

Genotyping data are presented as the SNP state at the particular locus. Each locus is presented with reference to the genome position in the Ancestral Ames strain (GenBank ID: AE017334).

Spatial Analysis
Spatial data were then linked to genotypic analysis for each isolate and plotted with a Geographic Information System (ArcView 3.3). Some isolates were retrieved from archival collections and were not associated with geographic information, hence only 285 isolates were spatially mapped.
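Linking genotypes to spatial data and checking for a north-south trend can be sketched as follows. This is a minimal illustration, not the ArcView workflow used in the study: the isolate records, sub-clade labels, and latitudes are invented, and isolates lacking coordinates are simply skipped, mirroring the exclusion of archival isolates without geographic information.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Optional, Tuple

# (isolate name, sub-clade label, latitude in decimal degrees or None)
Isolate = Tuple[str, str, Optional[float]]


def mean_latitude_by_subclade(isolates: List[Isolate]) -> Dict[str, float]:
    """Average latitude per sub-clade, ignoring isolates with no location."""
    by_clade: Dict[str, List[float]] = defaultdict(list)
    for _name, clade, latitude in isolates:
        if latitude is not None:
            by_clade[clade].append(latitude)
    return {clade: mean(lats) for clade, lats in by_clade.items()}


# Hypothetical records for illustration only.
toy_isolates: List[Isolate] = [
    ("iso1", "sub1", 60.0),
    ("iso2", "sub1", 62.0),
    ("iso3", "sub2", 45.0),
    ("iso4", "sub2", None),  # archival isolate without coordinates
    ("iso5", "sub3", 30.0),
]

summary = mean_latitude_by_subclade(toy_isolates)
# List sub-clades from north to south.
for clade, lat in sorted(summary.items(), key=lambda kv: kv[1], reverse=True):
    print(clade, lat)
```

Ordering sub-clades by mean latitude and comparing that order with their branching order in the phylogeny is one simple way to examine the kind of phylogeographic pattern described in the text.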

Phylogenetic analysis
Phylogenetic analysis using a cladistic approach was accomplished with PAUP 4.0 [26].