Satellite Hyperspectral Imagery to Support Tick-Borne Infectious Diseases Surveillance

This study proposed the use of satellite hyperspectral imagery to support tick-borne infectious diseases surveillance based on monitoring the variation in amplifier hosts food sources. To verify this strategy, we used the data of the human rickettsiosis occurrences in southeastern Brazil, region in which the emergence of this disease is associated with the rising capybara population. Spatio-temporal analysis based on Monte Carlo simulations was used to identify risk areas of human rickettsiosis and hyperspectral moderate-resolution imagery was used to identify the increment and expansion of sugarcane crops, main food source of capybaras. In general, a pixel abundance associated with increment of sugarcane crops was detected in risk areas of human rickettsiosis. Thus, the hypothesis that there is a spatio-temporal relationship between the occurrence of human rickettsiosis and the sugarcane crops increment was verified. Therefore, due to the difficulty of monitoring locally the distribution of infectious agents, vectors and animal host’s, satellite hyperspectral imagery can be used as a complementary tool for the surveillance of tick-borne infectious diseases and potentially of other vector-borne diseases.


Introduction
Active disease surveillance, which involves searching for evidence of disease through routine and monitoring in endemic areas, could help prevent an outbreak, or slow transmission at an earlier stage of an epidemic [1]. Recently, due to the spatial expansion of emerging vectorborne diseases and the difficulty to monitor locally the presence of infectious agents, their vectors and their hosts, epidemiologists are adopting new remote sensing techniques to predict vector habitats based on the identification, characterization and management of environmental variables such as temperature, humidity and land cover type [1,2]. Based on this strategy, satellite imagery such as those from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), Landsat, Moderate Resolution Imaging Spectroradiometer (MODIS), and Advanced Very High Resolution Radiometer (AVHRR), have been used to propose preventive strategies for mosquito-borne diseases [3][4][5][6][7]. Ticks are obligatory parasites of vertebrate hosts and depending on their hosts for movement over large distances. Consequently, tick-borne infectious diseases can be spread over geographical areas by the movement of tick hosts carrying either infected ticks or transmitting diseases to susceptible ticks in neighboring locations. In this paper we proposed the use of satellite imagery to support tick-borne infectious diseases surveillance based on the expansion of amplifier host's food sources.
Rickettsia rickettsii is the etiological agent of the most severe form of rickettsiosis, referred to as Brazilian spotted fever (BSF) in Brazil. The disease also occurs in the United States, Mexico, Costa Rica, Panama, Colombia, and Argentina. In South America, the tick Amblyomma cajennense sensu lato (s.l.) is the main vector, but other tick species, such as Rhipicephalus sanguineus and Amblyomma aureolatum, are also involved in more restricted areas [8]. R. rickettsii is partially pathogenic to ticks so the infection rate drops in the tick population with each tick generation [9]. Furthermore, less than 50% of A. cajennense s.l. maintain transovarial and transestadial infections and are unable to keep the infection by R. rickettsii efficiently in successive generations [10]. Thus the participation of amplifier hosts for the maintenance of the bacteria becomes essential to guarantee a constant development of new generations of infected ticks [9,11]. In South America, it was recently found that the capybara Hydrochoerus hydrochaeris acts as the amplifier host of R. rickettsii infection for the tick A. cajennense s.l. [12,13]. Once infected, the capybara keeps the R. rickettsii in the bloodstream for 7 to 10 days, when the infection of susceptible ticks that feed on it may occur [13], generating new cohorts of infected ticks. In addition capybaras are prolific, producing a mean of six pups per female per year [14], generating a constant introduction of susceptible animals. In North America, where Dermacentor spp. ticks are main vectors of R. rickettsii, several small rodent species such Microtus pennsylvanicus, Microtus pinetorum, Peromyscus leucopus and Sigmodon hispidus can also maintained the R. rickettsii infection between ticks [15].
In Brazil, there may be a causal relationship between the rising capybara population and the re-emergence of the disease, since both capybara populations and the number of BSF occurrences have increased significantly over the last three decades [1,16]. Between the years 1989 and 2008, there were recorded 737 cases of Brazilian spotted fever with laboratory confirmation, of which 80.2% were in the southeastern region with a lethality of 31.4% [16]. In the state of São Paulo, southeastern Brazil, there were 555 confirmed cases between the years 1985-2012, with a lethality of 40.5% [17]. Additionally, in endemic areas for BSF in the state of São Paulo, population densities of capybaras have reached numbers up to 40 times higher than those recorded in natural environments such as the Amazon and Pantanal [18]. In this region, the constant availability of water resources (S1 Fig), is essential for establishment of capybara populations, since this is a semiaquatic vertebrate species that depends on water source for thermic regulation, reproduction (capybaras mate only in water), and predator protection [19]. On the other hand, the increase of capybara populations in these areas should depend primarily on availability of food sources, as typically known for rodents such as capybaras [19]. Sugar cane is well known as one of the most appreciated food by capybaras, and the economic impact of capybara on damage to sugarcane crops is a reality in the state of São Paulo [20]. In the state of São Paulo, sugar cane is indeed the most common farming [21,22]. Because sugar cane farming occurs throughout the year, regardless of season, it represents the main food source for capybaras in many areas [18]. Thereby, the increase in the availability of food sources, near watercourses, increases the habitat carrying capacity for capybaras, resulting in population growth of susceptible animals. The carrying capacity represents the maximum number of individuals that the environment can sustain indefinitely given the availability of vital resources (i.e. water, food, habitat) [23]. This in turn could modulate the transmission of diseases carried by these animals.
Among A. cajennense s.l. populations that are sustained by capybara hosts, it has been proposed that the maintenance of R. rickettsii depends primarily on a constant introduction of susceptible animals (i.e., newborn capybaras), which will act as amplifier hosts and will guarantee the constant creation of new cohorts of infected ticks [8,10,11]. Hence, the approach adopted in this paper is based on the premise that the expansion of cultivated sugarcane areas would translate into an increase of the population density of capybaras (i.e., higher reproduction index), and consequently on the higher exposure of humans to R. rickettsii-infected ticks (i.e., higher number of BSF cases). In this way, to support the BSF surveillance, we suggest a methodology based on the spatio-temporal monitoring of the increment and expansion of sugarcane crops, main food source for capybaras in the state of São Paulo, southeastern Brazil.

Study area
To verify if is possible to anticipate the occurrence of tick-borne infectious diseases using the satellite hyperspectral methodology proposed, which is based on monitoring the variation in amplifier hosts food sources, we used the data of the BSF occurrences in the state of São Paulo, southeastern Brazil, region in which the emergence of this disease is associated with the rising capybara population. This study area located at coordinates 19°44 0 S to 24°28 0 S 44°05 0 W to 53°31 0 W, covering an area of 248 808 km 2 . With 43,663,669 inhabitants and 625 territorial divisions, São Paulo is the most populous state of Brazil. The topography is characterized largely as a plateau (90%), with altitudes that vary from 300 m to 900 m. The region has a tropical climate, with a hot and humid summer (October to February) and dry winter (June to August).

Brazilian Spotted Fever occurrence
Data collection. The occurrence of Brazilian spotted fever in the state of São Paulo from 2000 to 2012 were obtained from the web site of the São Paulo State Center of Epidemiologic Surveillance [24]. We considered only the BSF cases that occurred in areas of transmission by A. cajennense s.l., as previously determined [25]. The human population of each São Paulo territorial division was obtained from the website of the Brazilian Institute of Geography and Statistics [26].
Retrospective Spatial analysis. A retrospective high rate spatio-temporal statistic based on a discrete Poisson distribution, implemented in a software SaTScan, was used for the identification of spatial clusters and risk areas for BSF occurrence. This method produced a set of clusters, the relative risk in the different clusters, and a corresponding p-value for each cluster based on Monte-Carlo simulations.
The statistic uses a circular window of variable radius that moves across the map. For each circle a likelihood ratio statistic is computed based on the number of observed and expected cases within and outside the circle and compared with the likelihood, L 0 [27]. We used a likelihood function under the alternative hypothesis assuming Poisson distributed cases proportional to: where γ and E(γ) represent the observed and expected number of cases in a circle and N − γ and N − E(γ) the observed and expected number of cases outside the circle. N is the total number of cases. The indicator function I() is equal to 1 if the observed number of cases within the circle is larger than the expected number of cases given the null hypothesis and 0 otherwise [27]. The circles with the highest likelihood ratio values are identified as potential clusters. An associated p-value, based on 999 Monte Carlo simulations, was computed and used to evaluate whether the cases are randomly distributed in space or otherwise. We included only primary clusters, as long as their corresponding p-values were less than 0.05.

Hyperspectral / multitemporal analysis
Satellite imagery obtaining. To detect increment and expansion of sugarcane crops we used time series from the Enhanced Vegetation Index (EVI) for the entire state of São Paulo from 2000 to 2012. The EVI imagery were obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor on board the Terra satellite of the National Aeronautics and Space Administration (NASA) [28]. The gathered images from MODIS, including 176 spectral bands with 250m (red, near-infrared), 500m (mid-infrared) and 1000m (thermal infrared) spatial resolutions. The vegetation spectrum typically absorbs in the red and blue wavelengths, reflects in the green wavelength, strongly reflects in the near infrared wavelength, and displays strong absorption features in wavelengths where atmospheric water is present [29,30]. These hyperspectral images spans the time period from 2000 to 2012 in 16 day increments. To prepare the images for the analysis, we created a mask to disregard screen brightness values equal to 0 and 255 to rule out specific areas, such as water, in our analyses. All procedures using satellite images were executed using the software ENVI 5.2.
Principal Component Analysis. The principal component analysis was used to transform the original hyperspectral image data, highly correlated, into a new set of uncorrelated variables called principal components. By rotating the coordinate system to align with orthogonal dimensions of uncorrelated variance, any location-specific pixel time series P xt contained in a N image time series can be represented as a linear combination of temporal patterns, F, and their location-specific components, C, as: where C ix is the spatial Principal Component (PC), F it is the corresponding temporal Empirical Orthogonal Function (EOF) and i is the dimension [31][32][33][34][35]. The EOF's are the eigenvectors of the covariance matrix that represent uncorrelated temporal patterns of variability within the data [31]. Thus, the principal components can be determined by computing the eigenvectors and eigenvalues of the covariance matrix.
Linear Spectral Unmixing Analysis. The relative abundance of sugarcane crops was depicted in the hyperspectral imagery based on the endmembers spectral characteristics, using linear spectral unmixing. Endmembers are a collection of constituent spectra corresponding to distinct ground substances [36]. Each location-specific (x) pixel P xt in an N image time series can be represented as a linear combination of D 0 temporal endmembers, E it , and a residual component, ε, as: where the pixel-specific fractions f ix may represent either the areal fraction of the pixel exhibiting the temporal pattern of the corresponding endmember, or more generally, the Euclidean proximity of that pixel to the corresponding endmember in the temporal feature space. In a temporal unmixing model, each pixel is the linear combination of different temporal endmembers and corresponding fractions [31]. The result is a set of fraction maps representing the spatial distribution of different endmember abundances so, the pixel values indicate the fraction of the pixel that contains the endmember material corresponding to that image.

Results
The Because of these two temporal clusters found, all following analysis will be presented considering these two periods. Respectively, the number of cases were 178 and 208. The relative risk of the spatial cluster concerning the first period was 8.54 and the relative risk of the three spatial clusters concerning the second period were 1.64, 3.44 and 9.39 as is also shown in Fig 1. Using principal component analysis, we quantify the spectral dimensionality of the global composite and render the mixing space to select the endmembers. The eigenvalues that derive the amount of variation accounted by each principal component within the feature space, show that nearly 95% of the information within this feature space was found in the first six principal components (66.3% in channel 1, 11.9% in channel 2, 6.3% in channel 3, 4.4% in channel 4, 3.4% in channel 5, 2.4% in channel 6) and for this reason the following analysis considers only the first six principal components. After rendering the mixing space from the six low order principal components we identify three endmembers corresponding to Atlantic vegetation, sugarcane crops and substrates. Substrates includes rock, sediment, soil and non-photosynthetic vegetation and in the case of São Paulo some rivers are also detected as substrates due to the high contamination of water sources. The first four principal components have an annual or biannual frequency change that represents the main vegetation phenology changes over the area. In contrast, the other principal components exhibit poorly expressed annual frequency variability because the corresponding phases stem from noise or substrate, or only represent phenology in small areas. S2 Fig is shown the spectral reflectance information for each endmember selected through the study period. It is shown that the Atlantic vegetation is  [21,22] and from the Canasat-Area Project of the Brazilian National Institute for Space Research [37], which mapping the sugarcane distribution of the state of São Paulo once a year using remote sensing imagery acquired by the Landsat, CBERS and Resourcesat-I satellites [37].
To verify if the increment of the sugarcane crops was associated with the presence of BSF, the pixel abundance mean values were compared in the four risk areas for BSF occurrence between both periods. cluster detected from 2000 to 2006 was higher than the sugarcane crops pixel abundance mean (120.13) of this zone from 2007 to 2012, nevertheless this difference was not statistically significant. Consistently, the sugarcane crops pixel abundance mean was 1.4 times higher in the risk zones for BSF occurrence. This indicate that the increment and expansion of sugarcane crops coincide with BSF risk areas. Nevertheless, the reduction of the sugarcane crops not necessarily corresponds with the disappearance of a BSF risk zones.

Discussion
Spatial epidemiology aims to investigate the spatial distribution of diseases in order to identify risk areas, determine ecological risk factors for disease transmission and strategically guide disease control actions [38,39]. In our study, four significant high-risk regions for BSF were identified by the spatio-temporal scan statistic approach for two delimited time periods: 2000 to 2006 and 2007 to 2012. The circular scan statistic used in this study has been proved to be useful for many studies [40][41][42][43]. Gaudart et al (2005) compared the oblique decision tree model, a complex statistical technique with the Kulldorff's SaTScan cluster technique, which was used in this study. They produced similar results using both methods in a village in West Africa to identify malaria risk clusters [43]. The circular isotopic technique of the Kulldorff's SaTScan makes it a useful tool to detect clusters but has limitations when detecting irregular shaped clusters due to its fixed scan window [44,45].
The suitability of an area for vector-borne disease transmission depends on environmental variables such as temperature, humidity and land cover type. Because these factors can be monitored using remote sensing, it is possible to construct models that predict regions and periods favorable for the presence of the vectors, their hosts and the diseases carried by these. In addition to satellite-derived climate variables, many recent epidemiological studies have made use of the link between vegetation amount and vector populations via the application of normalized difference vegetation index (NDVI) imagery [46]. Due to the essential involvement of the capybaras in the BSF transmission, we used the MODIS enhanced vegetation index (EVI), which offers improvements over the NDVI, to identify sugarcane crops increment and subsequently link this increment with the occurrence of BSF, since we rely on the premisse that higher food availability increases capybara reproduction index, and consequently, higher number of R. rickettsii-infected ticks [8,10,11]. Many remote sensing approaches have been developed to identify cropping practices, such as crop type and cropping intensity, across large spatial and temporal scales [47][48][49][50]. Studies have used mainly Landsat data to assess crop type and cropping intensity. However, the accuracy of the Landsat method depends on several different factors such Landsat image availability for the appropriate time period during crop growth cycles [51]. The low temporal resolution of Landsat and its susceptibility to missing values, and the coverage required to assess crop type and cropping intensity is not assured in most agricultural regions, particularly in the tropics where there are often periods of intense cloud covering [52]. High-temporal resolution data from MODIS are of great relevance for modeling the transmission of vector-borne disease since they fill these gaps and allow an assessment of vector and disease distribution and their potential spread.
Studies have been conducted with remote sensing to identify landscapes linked to a higher risk of emerging vector-borne diseases [53,54]. These studies showed that many factors influence the distribution of ticks and tick-borne diseases, including changes in human activities, biotopes, animal abundance and animal distribution. Therefore, to identify populations, regions or periods at risk for tick-borne diseases, forecasting models must account for many parameters [55]. We demonstrated that areas of BSF occurrence matched significantly with the areas of increment of sugarcane crops, main food source for capybaras, amplifier host for R. rickettsii. In the state of São Paulo, this single parameter remains constant throughout the year and because of this we do not considered other external factors. In the United States a reduction in the biodiversity due to deforestation and forest fragmentation lead to an increase in the density of white-tailed deer and white-footed mouse and their attendant ticks, leading to the emergence of Lyme disease over the last several decades [4,6,[56][57][58].
Our study shows that it is possible to detect high-risk areas for BSF through hyperspectral satellite imagery. We suggest that such analyses will be beneficial by enabling surveillance strategies to be focused on the highest risk regions for BSF. However, several factors need to be considered for these technologies to be routinely adopted for public health management such as, the availability of resources for gathering, processing, and modeling geospatial data, training of personnel on the proper interpretation of results and the continuous availability of remote sensing data in a timely manner [1]. We believe the method could have similar benefits if applied to other vector-borne data collected from other regions and could thus help to improve national surveillance networks for human rickettsiosis.
While our data suggest that the increment in sugarcane crops, observed in Brazil in recent decades had a spatio-temporal relationship with to the occurrence of BSF, our study cannot rule out that other factors that are correlated with the risk of R rickettsii infection are also causal for the observed association. Indeed this ecological study does not provide individual-level analysis.

Conclusion
Our initial hypothesis, that there is a relationship between the sugarcane crops increment and the occurrence of BSF, was objectively reinforced in this paper. In this way, in order to anticipate the occurrence of BSF in the state of São Paulo, we suggest a methodology based on the monitoring of the increment and expansion of sugarcane crops, main food source for the R. rickettsii amplifier host. Thus, when facing difficulty in monitoring locally the presence of R. rickettsii, their vectors or their hosts, the remote sensing could help to associate the presence of the disease with environmental changes. The methodology proposed can guide epidemiological surveillance programs in hotspot areas.