Bayesian Space-Time Patterns and Climatic Determinants of Bovine Anaplasmosis

The space-time pattern and environmental drivers (land cover, climate) of bovine anaplasmosis in the Midwestern state of Kansas was retrospectively evaluated using Bayesian hierarchical spatio-temporal models and publicly available, remotely-sensed environmental covariate information. Cases of bovine anaplasmosis positively diagnosed at Kansas State Veterinary Diagnostic Laboratory (n = 478) between years 2005–2013 were used to construct the models, which included random effects for space, time and space-time interaction effects with defined priors, and fixed-effect covariates selected a priori using an univariate screening procedure. The Bayesian posterior median and 95% credible intervals for the space-time interaction term in the best-fitting covariate model indicated a steady progression of bovine anaplasmosis over time and geographic area in the state. Posterior median estimates and 95% credible intervals derived for covariates in the final covariate model indicated land surface temperature (minimum), relative humidity and diurnal temperature range to be important risk factors for bovine anaplasmosis in the study. The model performance measured using the Area Under the Curve (AUC) value indicated a good performance for the covariate model (> 0.7). The relevance of climatological factors for bovine anaplasmosis is discussed.


Introduction
Bovine anaplasmosis, caused by an obligate intercellular bacterium, Anaplasma marginale (Family: Anaplasmataceae, Order: Rickettsiales) affects beef-cattle and dairy production in almost all the states in the US, causing significant economic losses to producers. The control of this disease currently relies predominantly on infection-avoidance alone since fully licensed vaccines are not marketed in North America (see also, http://www.anaplasmosisvaccine.com/). The bacterium is known to cause a hemolytic disease in cattle, which manifests as anemia, abortion, icterus, lethargy, and causes death primarily in older animals [1]. Cattle that survive infection are persistent carriers of the pathogen and are a source of infection for other animals through inadvertent mechanical transmission via blood-contaminated multi-use needles and surgical equipment and as well as via tick transmission. In North America, A. marginale is biologically vectored by different hard tick species in the Genus Dermacentor [2], [3] but other arthropod vectors that could aid in the transmission of this disease may also include ticks in the Rhipicephalus Genus and biting flies [3].
Illnesses caused by tick-borne pathogens to animals and as well as humans in general have increased over the past years in the Midwestern US [4] [5], including in the state of Kansas where tick-borne diseases have expanded to newer areas over the years that have traditionally not witnessed such diseases in the past [6], [7]. Some of the increase in the space-time expansion of tick-borne diseases in the Midwestern region may be attributed to geographic expansion of tick populations [8][9][10]. Although a plethora of anecdotal and published evidence suggest the increasing menace of bovine anaplasmosis in newer areas, quantitative space-time evaluation of whether or not bovine anaplasmosis has spread to previously unreported areas over time is not readily available. Likewise, information on any potential environmental and climatological drivers behind the space-time expansion of bovine anaplasmosis cannot be easily found, which has disease management implications.
Space-time disease mapping models are a popular tool to describe disease patterns and to identify unusual clusters of incidence in space and time-trends or both. Bayesian hierarchical models with different parametric or non-parametric time-trend and space-time interactions have advantages over frequentist approaches for analyzing datasets with inherent space-time dependency [11], [12]. Such models help detect any localized clusters that may be linked in time, for instance due to a set of favorable environmental drivers or cattle movement. Another way to strengthen inference from Bayesian space-time models is by including relevant ecological covariates that often explain additional variability in aggregated incidence datasets. This is particularly relevant in the case of tick-borne disease incidences since the spatial distribution of ticks are largely determined by physical environment and climatological conditions.
The objective of the study reported here was to retrospectively evaluate the space-time patterns and the environmental drivers of bovine anaplasmosis incidence in the Midwestern state of Kansas using Bayesian hierarchical modeling approach. Cases used in the study were diagnosed at Kansas State Veterinary Diagnostic Laboratory (KSVDL) between the years 2005-2013.

Anaplasmosis data
Positive test records for bovine anaplasmosis were searched through the Laboratory Information Management System (LIMS) at Kansas State Veterinary Diagnostic Laboratory (KSVDL), and summarized to their respective counties. Diagnostic test results that indicated a positive diagnosis for anaplasmosis in one or more of the following tests, including blood smear, ELISA, or polymerase chain reaction test were considered as confirmed cases of bovine anaplasmosis.

Covariate data
Covariates representing the physical environment were derived from the National Land Cover Dataset (NLCD) [13], and climatological data were derived from USGS and NASA resources ( Table 1). The environmental covariates represented percentages of different land classes for each county in the state of Kansas, and as well as two variables representing landscape metrics indicative of landscape fragmentation viz., total edge contrast index and patch density. The total edge contrast index was calculated in FRAGSTATS [14] program by where e ik is the total length of edge between patch types i and k, and E Ã is the total length of edge in landscape, and d ik is the dissimilarity (edge contrast weight) between patches i and k. Patch density was estimated in FRAGSTATS program by where N = total number of patches in the landscape and A = total landscape area (m 2 ). Climatic variables including the maximum normalized vegetation index (NDVI), minimum land surface temperature LST (min) , mean LST (mean) , diurnal temperature range (DTR) (the difference between daily maximum and minimum temperatures averaged over a thirty day period), precipitation and humidity were extracted for each county in the study area. The LST and NDVI estimates were derived from MODIS (Moderate Resolution Imaging Spectroradiometer) imagery [15]. DTR, precipitation and relative humidity were derived from the Prediction of Worldwide Renewable Energy (POWER) web portal of the NASA Langley Research Center [16] [17]. All climatological data were downloaded for a period roughly corresponding to high tick activity season in the region (March through August) and averaged to derive representative values.
To account for reservoir host effect in the models, the number of deer per Deer Management Unit (DMU) was obtained for the study region from the Kansas Department of Wildlife, Parks and Tourism (KDWPT). Since the DMUs were at a coarser scale than counties, deer numbers for all the counties that were completely present within a given DMU and those whose 50% or more land area was present within a DMU was assigned the same value, then divided by the area of the respective counties to obtain density values.

Statistical analyses and model specification
Let Y ij be the observed number of infected cattle among N ij individual cattle at risk within a population in county i, diagnosed positive in year j. We modelled Y ij to follow a binomial approximation Y ij * bin(θ ij ), where θ ij is the expected number of cattle that are at risk for anaplasmosis in county i in year j. The probability of detecting anaplasmosis θ ij is given by where, X ij ¼ ðX ð1Þ ij ; X ð2Þ ij ; . . . ; X ðPÞ ij Þ T is the vector of P associated with environmental predictors k observed at county i. Univariate regression models were run to identify physical environmental and climatic factors significantly associated with anaplasmosis risk. Candidate explanatory variables to be included in the Bayesian hierarchical models were screened a priori in order to avoid model fitting issues. Several Frequentist bivariate regression models evaluated each variable independently and variables that were significant at p < 0.2 were kept. Care was taken not to remove candidate variables that were deemed clinically relevant. Multicollinearity among screened variables was tested by estimating the variance inflation factor (VIF) and all variables with a VIF ! 10 were considered to indicate multicollinearity, in which case, one of the variables was dropped at a time until multicollinearity was absent. Non-linearity among independent variables with the response was evaluated at the screening stage with univariate regressions, and when non-linear variables were present, they were categorized using cutoffs based on scatterplots.
Bayesian geostatistical models with county-specific random effects were fitted to estimate the degree of spatial autocorrelation in anaplasmosis risk and to assess the effect of different covariates. For the process models, we used a logit link function in an extended generalized linear model (GLM) structure that incorporated stochastic spatial and temporal functions and as well as different covariate effects. Several models that allowed us to evaluate random and covariate effects on anaplasmosis prevalence were fitted individually. First, we modelled the spatial component we adopted the standard Besag et al (1991) [18] model with a spatially unstructured and structured u i ,v i components.
Where, β 0 (intercept) represents the mean prevalence of anaplasmosis in all counties in all years, and u i ,v i are a random terms accounting for spatially unstructured variation in anaplasmosis prevalence and unstructured heterogeneity, respectively. No interaction was assumed to exist between u i and v i were assigned u i * CAR, v i $ Normalð0; s 2 v Þ priors. Spatial dependence in u i was applied by assuming a conditional autoregressive model (CAR)(γ) with a Gaussian distribution [19], which implies that each u i is conditional on the neighbor u j with variance s 2 i dependent on the number of neighboring counties n i of county i, i.e., In a second model we introduced a γ j term to account for the temporal component of the data. This term was assigned a random walk (RW1) prior g j $ Nðg jÀ1 ; t À1 g Þ: [20] Following this in a third model, in order to detect potential space-time interaction effects in anaplasmosis prevalence, a random term ψ ij was introduced, with ψ ij * N(ψ i,j−1 τ ψ ) prior [21]. The model was notated as, For the covariate model, different covariates were included to space-time model in several steps, starting with a model that included all covariates followed by removal of one covariate at each step. Covariates were retained in the model unless their removal resulted in the increase of Deviance Information Criterion (DIC) [22] value by 5 units or more. Interaction effects between covariates were included in these steps as well, previously removed covariates did not enter the final model.
The model posterior parameters were estimated using a Bayesian framework implemented using R-INLA software on a Linux Beocat cluster computing environment. Distributions of covariate effects on anaplasmosis prevalence are seldom available for the region; therefore noninformative, uniform priors were selected for the regression parameters, β k and their variance components, s 2 k : This allowed the observed data to have the greatest influence on posterior distributions without being constrained by the choice of a prior. The median estimates from the posterior Bayes distribution and their 95% Credible Intervals (CrI) were calculated.
Models were validated by randomly partitioning the county-level expected risk estimates into five subsets and by running the models using only four of the five subsets, while validating model prediction with the fifth subset. The models were run for five times to allow each validation with subset. Each time, the model's performance (prediction accuracy) was measured using area under the receiver-operator's curve (AUC) [23] values with the observed prevalence (dichotomized as 0,! 0). The mean error and mean absolute error were calculated to quantify prediction bias and overall precision respectively.

Results
There were n = 478 cases of bovine anaplasmosis diagnosed positive at KSVDL between years 2005-2013, which were distributed predominantly in counties to the east central and some western parts in Kansas (Fig 1). There was a large variability in the average annual number of cases aggregated at the respective counties. No tests were submitted from many counties throughout the study period (Fig 1), and for others the annual averages ranged from 4 to 12 between 2005-2013.
Of the three progressively additive partial space-time models evaluated in the study, model-3 had the lowest DIC value, indicating the presence of a time trend and space-time interaction effect in the bovine anaplasmosis dataset. In the ensuing steps, different covariates were added to this model (model-3) alone. Of all the covariates evaluated in the study, six retained significance in the univariate screening with a liberal p 0.2 level (Table 2). Multicollinearity and nonlinearity among the six selected candidate variables were absent.
The partial space-time model with lowest DIC value (model-3) and the best fitting covariate model (model-7) and had similar DIC values (Table 3); although, the covariate model had the lowest DIC value among all models and all interpretations were made based on this model alone. The best fitting covariate model indicated a significant spatiotemporal effect for all years, and three climatological covariates, minimum land surface temperature (henceforth LST (min) ), relative humidity and diurnal temperature range (henceforth DTR) were retained as significant determinants of bovine anaplasmosis in the study region ( Table 4). The odds ratios and 95% CrI for different covariates in the models are present in Table 4, and they indicate that those counties with higher annual averages of LST (min) , relative humidity and diurnal temperature significantly increased the odds of positive diagnosis for anaplasmosis in cattle. No physical environment variable were retained in the final covariate model.
The observed county-level aggregates of positive bovine anaplasmosis cases per county, per year is present in Fig 1, and the posterior estimates based on the final covariate model, which  also included random terms for space, time and space-time effects is present in Fig 2. The space-time parameter estimates and their 95% credible intervals throughout the study period indicated a steady progression of bovine anaplasmosis for the study region over the years with the exception of the years following 2011, and a slight decline for years 2012 and 2013 (Fig 3). However, the trend observed for the later part of the study period may be an artifact of the inability of the covariate model to capture all the variability in the dataset for these years and not an actual decline in the space-time trend. A comparison of model precision measured by the area under receiver-operator curve (AUC) for the partial space-time model (model-3) and the covariate model (model-7) is present in Table 5, which indicated a good discriminative capacity for the covariate model, and the mean error and mean absolute error indicated that covariate model performed relatively better than other models considered (Table 5).
To highlight the counties having the most substantial space-time interactions, we plotted the posterior probability of ψ ij > 1 is above 0.8, and the results are plotted in Fig 4. The interaction terms are predominant in the central and south eastern counties of Kansas, indicating that common risk factors are coming to play in these areas for bovine anaplasmosis.

Discussion
This study has identified a significant space-time progression of bovine anaplasmosis in the central Midwestern state of Kansas along with important climatological drivers of the spacetime progression. The posterior Bayesian estimates derived in this study for the covariate model were based on individual numbers of tests that indicated positive diagnosis for bovine anaplasmosis cases submitted to KSVDL. However, tests for anaplasmosis are often referred to laboratories based on blood drawn from small groups of cattle within a herd, and it is highly likely for other animals in the herd to be infected as well. Using infected herd numbers per county, per year instead of individual tests on cattle numbers would yield more accurate posterior estimates. Despite our best attempts, data pertaining to herd sizes and the number of herds per county could not be obtained for this study. Notwithstanding this limitation the covariate model performed satisfactorily in predicting infection occurrences as indicated by the model performance statistics (Table 4), and indicate a steady spatio-temporal progression of bovine anaplasmosis in the state of Kansas over the past several years (Fig 3). A plot of counties with high space-time interaction in the covariate model consistently indicated the presence of a cluster of counties in the central and south central portion of the state (Fig 4), suggestive of common risk factors in this area for cattle infection with bovine anaplasmosis. The cluster of counties identified in the present study comprises an area in Kansas that is known for its higher tick spp. density; however, the prevalence of A. marginale among Dermacentor ticks or other potential tick species from this area is not clearly known. Studies to quantify prevalence levels of the bacterium in the cluster of counties identified here may yield useful management strategies. In addition to the role of ticks, the cluster identified here may also indicate the routine import of infected cattle from other states into these counties, and as well as prevalent management practices that could inadvertently sustain transmission of A. margniale to naïve cattle herds. Future epidemiological studies on bovine anaplasmosis in the state may benefit by considering these factors. Widening spatial distribution of bovine anaplasmosis and the potential for increased intensity of this disease due to ongoing climate-change has been speculated in the past (e.g., Kocan et al., 2010, Jonsson andReid, 2000) [3], [24]. The identification of climatological factors as important drivers of the noted space-time progression and as risk factors for bovine anaplasmosis in this study further supports this suggestion. Climatic conditions considered typical for the Central Plains region in the US have already been noted to have changed in noticeable ways due to climate-change [25], [26], and many such conditions are known to affect tick phenology and their spatial distribution either directly or indirectly. The identification of associations between climate-change indices and infectious diseases is often the first step in understanding climate-change impacts on infectious diseases. However, detecting such associations are often problematic, with some of the contributing factors to this problem being presence of multiple influential but also confounding factors for infectious diseases, variability in infection data due to host related factors, coarse resolution of disease and climate data used for detection of such associations. Climate covariates were more important than the physical environment in the present study, and all three climatological covariates, minimum land surface temperature, relative humidity and diurnal temperature range (the difference between daily minimum and maximum temperatures) are considered as indices of climate-change and are intensely monitored by the climate-change research community [27], [28].
Land surface temperature and diurnal temperature are relevant to tick survival, spatial distribution and potentially also their ability to successfully harbor and transmit pathogens. The LST min identified in the study likely indicates the average minimum temperature in some counties as a limiting factor for ticks to survive through winters but not in others. The diurnal temperature range indicates the warming night time temperatures that may favor tick survival in their current ecological niche and as well as expansion to newer areas where conditions are becoming suitable. Diurnal temperature range has been decreasing worldwide since the 1950s due to increasing daily minimum temperature (T min ) at a faster rate than the daily maximum temperature (T max ), as well as (T min ) decreasing at a slower rate than (T max ). For most parts of the US, trends show that (T max ) have remained constant or have increased only slightly but (T min ) have increased at a faster rate [27] [29].
Relative humidity has been associated with the prevalence and distribution of other tickborne diseases in North America [30][31][32] and it is an important delimiter to the survival and spatial distribution of ticks [33] [34]. There are large variations in the yearly precipitation received across the state of Kansas, with eastern Kansas receiving up to three times more rainfall than west [35]. As a result, climate and vegetation are transitional between the humid east and semi-arid western portion of Kansas that may explain the noted geographic pattern for bovine anaplasmosis in the present study. Humidity can often be seen associated with the survival and abundance of ticks in the literature, with higher humidity conditions often favoring the long-term survival of some ticks species' life stages through dry seasons [36], [37] among other reasons. Of the three climatological covariates, relative humidity had the largest influence on the covariate model based on the posterior estimate and credible intervals. We are tempted therefore to suggest that the relatively higher relative humidity conditions in the eastern and southeastern parts of the state that favor the survival and proliferation of A. marginale transmitting tick vectors is by far the most important eco-climatic driver for bovine anaplasmosis in the region. Further field based and laboratory studies are essential to further strengthen this association.

Conclusion
The study results indicate that cases of bovine anaplasmosis in Kansas has steadily increased between years 2005-2013 and to newer geographic areas during the same period. The presence of higher space-time interaction for bovine anaplasmosis infection within a cluster of central and south central counties indicate the influence of similar risk factors, and are potential areas for targeting prevention/management efforts. Three climate change indices, minimum land surface temperature, diurnal temperature range and relative humidity are drivers of space-time pattern for bovine anaplasmosis in Kansas at a county-scale. This finding is significant in a climate-change implications on infectious diseases context. Two immediate questions we are led to ask based on this finding are how the associations of these factors might be further quantified under field and laboratory conditions for the tick host and the pathogen? And, how these factors might influence the further geographic expansion of Dermacentor ticks and A. marginale under different climate-change scenarios.