Detecting space-time clusters of dengue fever in Panama after adjusting for vector surveillance data

Long term surveillance of vectors and arboviruses is an integral aspect of disease prevention and control systems in countries affected by increasing risk. Yet, little effort has been made to adjust space-time risk estimation by integrating disease case counts with vector surveillance data, which may result in inaccurate risk projection when several vector species are present, and when little is known about their likely role in local transmission. Here, we integrate 13 years of dengue case surveillance and associated Aedes occurrence data across 462 localities in 63 districts to estimate the risk of infection in the Republic of Panama. Our exploratory space-time modelling approach detected the presence of five clusters, which varied by duration, relative risk, and spatial extent after incorporating vector species as covariates. The Ae. aegypti model contained the highest number of districts with more dengue cases than would be expected given baseline population levels, followed by the model accounting for both Ae. aegypti and Ae. albopictus. This implies that arbovirus case surveillance coupled with entomological surveillance can affect cluster detection and risk estimation, potentially improving efforts to understand outbreak dynamics at national scales.

Introduction Dengue disease, a viral infection transmitted to humans by Aedes mosquitoes, is endemic to 128 countries, with 3.9 billion people considered at-risk globally [1]. Dengue disease cases have increased dramatically worldwide throughout the previous several decades [2], likely a result of urbanization [3], globalization [4], climate change [5], the breakdown of regional control programs [6], and the spread of the invasive Aedes albopictus [7]. As a result of both recent and historical risk, many countries employ national surveillance programs to monitor trends in dengue and inform local health authorities to the places and times where preventative practices are most required. However, despite the commonality of these programs and unforeseen cost of cutting them [8], surveillance budgets are often limited [9,10], restricting the scope and quality of the work. This is concerning in developing regions such as Central America, where the burden of disease is high [1] and per capita public health expenditure is among the lowest of any region of the world [11].
Surveillance of both diseases and vectors is an essential component of integrated disease management programs that can be used to determine risk changes in space and time, thus providing the evidence for more targeted prevention and control interventions [12]. Nevertheless, with few exceptions, it is rare for surveillance programs to concurrently monitor both arbovirus cases and vector populations in the same locations and at regular intervals. Most projections of disease risk used to justify public health actions are derived purely from disease case data, ignoring vector population dynamics, which is a key aspect of the vector transmission model. This is particularly concerning when more than one vector species is present, and little is known about their likely role in local transmission, which may result in inaccurate or incomplete risk projection or case clustering models.
The Republic of Panama has been monitoring dengue cases alongside vector presence through the National Department of Epidemiology (NDE) since 1988, making it one of the most long-standing surveillance programs of its kind in Latin America. Of the two known dengue mosquito vectors, Ae. aegypti is considered resident to Latin America and Panama since the 19 th century, and the primary source of transmission [12] while Ae. albopictus, considered a secondary vector, has been spreading throughout the region ever since it got introduced in Panama in 2004 [13,14]. Widespread extirpation of Ae. aegypti by a superior ecological competitor like Ae. albopictus has occurred throughout the world in recent decades [15][16][17], with unknown consequences on arbovirus transmission risk. Encompassing this period of growing interspecific competition among two vector species, Panama's surveillance system is particularly unique and potentially useful to modelling dengue transmission risk while considering Aedes species interaction. Attaining a better understanding of dengue outbreak dynamics over time may improve the capacity of public health authorities to combat the spread of other arboviruses, such as Zika Virus and Chikungunya Virus.
Our overall aim is to examine the influence that concurrent dengue case surveillance and Aedes species monitoring can have on cluster detection and relative risk estimation. Based on previous studies incorporating covariates into spatiotemporal cancer cluster detection [18,19], we hypothesize that the inclusion of vector surveillance data will alter the identification of dengue clusters in space-time. In so doing, we describe the results of 13 years of dengue and Aedes surveillance data, including two competing vector species plus virus data originating from long-term cooperatively organized surveillance programs. We believe this is the first effort to adjust for vector presence and absence in a vector-borne disease cluster detection model, which we hope sheds light on the characteristics of space-time clusters and relative risk estimation of dengue after Aedes species are used as model covariates.

Dengue data
We utilized dengue prevalence data collected by the National Department of Epidemiology (NDE), housed within the Panamanian Ministry of Health (MINSA) [20]. Systematic national surveillance of dengue cases in Panama have been continuous since 1988. Suspected cases are defined by a patient with a fever and one or more of the following symptoms: headache, retro orbital pain, myalgia, exanthema, rash, vomiting, malaise, leukopenia, and jaundice. A confirmed case is defined as a suspected case with a positive dengue test, conducted using either viral isolation, reverse transcription polymerase chain reaction (RT-PCR), IgM enzyme-linked immunosorbent assay platform (ELISA), or secondary IgG ELISA. RT-PCR was established as the original standard by the National Reference Laboratory at the Gorgas Memorial Institutes for Health Studies (ICGES) in 2003. Yet since 2009, MINSA established national decentralization of serological confirmation of dengue using ELISA tests, which has improved efficiency by allowing district health officials to confirm cases without needing to send samples to a single central facility in Panama City. Data is recorded at the Corregimiento, or neighborhood, scale as the number of confirmed cases in a given year at a given location. This is the lowest scale of data granularity available, and thus, we do not have patient-level detail nor temporal detail at smaller units than year.

Vector data
We utilized vector data from the Vector Control Department (VCD) at MINSA. Systematic entomological surveillance has occurred in Panama since 2000 in order to establish Aedes infestation rates, and thus, areas of potential dengue transmission risk. Surveys of both Ae. aegypti and Ae. albopictus are performed annually at the Corregimiento-scale and consist of solely larval surveillance. Each year, a random block of houses is chosen and all houses in the block are searched for containers holding Aedes larvae. The larvae are collected and allowed to mature to the fourth instar, at which point they are taxonomically identified to species based on morphological keys [21]. The number of houses positive for Ae. aegypti, Ae. albopictus or both are recorded in the raw datasets. However, because there is no record of the number of houses in each block, we have transformed the data into a presence-absence format in each Corregimiento rather than analyzing the number of positive houses.

Data analysis
We conducted our analyses on dengue and vector data from 2005-2017, encompassing the period in Panama when both Ae. aegypti and Ae. albopictus have been interacting. Overall, data was collapsed from the original Corregimiento scale to the district scale. This is due to unreliable human population estimates at scales smaller than the district. Human population data was gathered from the National Institute of Statistics and Census (INEC), which conducts a national census every 10 years. Because the national census is only conducted every ten years, we used the population levels from 2010 to calculate PR for data from 2005-2017. While this is not ideal, and incurs inherent error in the year to year accuracy of the PR estimate, there is no more frequent population estimate available. This is an unfortunately common situation, especially in Central America, where no country conducts national population assessments more frequently than every 10 years.
For the spatial analyses, we utilize discrete Poisson space-time modelling STSS [22], which systematically moves cylindrical search windows across the geographic and temporal space to detect space-time clusters. Essentially, STSS determines if the observed disease cases in a particular region and time period exceed the expected cases under baseline conditions. In vectorborne disease research, STSS have been used to examine outbreaks of dengue [23][24][25], chikungunya [26], malaria [27,28], Chagas [29], and West Nile [30,31], for example. STSS have also been used to examine the co-circulation of dengue and chikungunya in Colombia [32]. However, none of these prior studies have utilized vector surveillance data as a covariate.
The cylinders are centered on the centroids of the Panamanian districts while the base of a cylinder is defined as the spatial scan, and the height of a cylinder represents the temporal scan. The number of observed and expected dengue cases are computed for each cylinder. Conceptually, a vast number of cylinders of various space-time dimensions are generated until an upper bound is reached, while each cylinder is a potential cluster. For this study, the maximum spatial scan was set to 25% of the total population in Panama, while the maximum temporal scan was set to 4 years. A Poisson-based likelihood ratio is calculated for each cylinder, which is proportional to ðn=mÞ n ½ðN À nÞ=ðN À mÞ� NÀ n [33]. For the parameters, μ is the expected number of dengue cases in a cylinder, and n is the total observed dengue cases in the cylinder. The expected number of dengue cases is computed by multiplying the fraction of population that lives within the cylinder (p) by the total number of cases in Panama (C) divided by the total population (P), that is: E[c] = p � C/P -The cylinder with the highest likelihood ratio is the most likely space-time cluster. To evaluate the statistical significance of the candidate space-time clusters, 999 Monte Carlo simulations are performed under the null hypothesis that there are no significant clusters. Subsequently, we report secondary space-time clusters with a p-value less than 0.05. It must be noted that the cylindrical shape of the clusters does not represent the true shape of the clusters. While it is possible to use irregular search windows [34][35][36], cluster borders in the model outputs are not perfectly in line with the borders of the risk in reality.
For this study, we ran four STSS models: (1) dengue cases only; (2) dengue cases controlled for the presence and absence of Ae. aegypti and/or Ae. albopictus (i.e. absence of both species, Ae. aegypti presence, Ae. albopictus presence, and presence of both species); (3) dengue cases controlled for Ae. aegypti presence/absence only; and (4) dengue cases controlled for Ae. albopictus presence/absence only. For the covariate adjusted models, the expected number of dengue cases is defined the same way for the non-adjusted model, but includes covariate category i. That is: In other words, the adjusted STSS searchers for clusters "above and beyond that which is expected due to these covariates" [37]. For each model, we also report the relative risk of prevalence in each district that belongs to a space-time cluster, which is defined as ðc=eÞ=½ðC À cÞ=ðC À eÞ�, where c is the total observed dengue cases in a particular district; e is the expected cases in a district; and C is the total observed dengue cases in the country of Panama. Clusters with a relative risk > 1 indicates that there were more observed dengue cases than expected under baseline conditions. We created all maps in ArcGIS [38].

Results
From 2005-2017, there were a total of 49,910 cases of dengue in Panama (Fig 1), with 2009 and 2014 being the most severe at 6,941 and 7,423 cases respectively. These two years represented 28% of the total dengue cases during the 13-year period. The spatial distribution of the total number of dengue cases and the crude rate per 10,000 people per district is shown in Fig 2A and 2B  Panama, and Bocas del Toro (758.59) and Changuinola (298.74) in northwestern Panama (Fig 2A and 2B). Additionally, surveillance of each vector species indicated prolonged endemic presence of Ae. aegypti alongside increasing presence of Ae. albopictus from just one district in 2005 to 53 in 2016 (Fig 3).
The results of our space-time modelling detected the presence of five clusters in each of the four models, varying by cluster center and duration (Figs 4-7 and Table 1). Incorporating covariates into the models had considerable effects on the duration, relative risk (RR), and spatial extent of clusters ( Table 2). The model adjusting for the presence of Ae. aegypti encompassed the greatest spatial range and highest number of districts with a RR > 1, while the model adjusting for the presence of Ae. albopictus encompassed the smallest spatial range and the lowest number of districts with a RR > 1. The duration of the space-time clusters is notably different when adding the vector surveillance data to the model, however, the one exception is cluster 1 for each model (most likely cluster). For example, the duration of cluster 2 was 2015-2017 for the no covariate and Ae. aegypti model; while the Ae. albopictus and Aedes (both) model reported a duration of only 1 year, which occurred six years earlier (2009). Furthermore, cluster 2 was found in different geographic locations for the Aedes (both) and Ae. albopictus models. This variation in duration of the clusters between the four models is a result of adjusting for the presence of Aedes during the 13-year study period. In other words, the start, end, and duration of the clusters is substantially affected by the presence of one or more Aedes species. The relative risk may be higher if Aedes was found in a district during the entire duration of a space-time cluster. During the 13 years of our study period combined with the 63 districts containing data (13 � 63 = 819), Ae. aegypti was present 690 times, Ae. albopictus was present 245 times, while both Aedes species were found in a district 224 times. As a result, the difference in species presence during the study period partly explains why the clusters for the Ae. albopictus model contained 19 less districts than the Ae. aegypti model, and 10 less districts that the model adjusting for both species. Using vector data to detect space-time clusters of dengue fever

Discussion
Overall, our results highlight the extreme heterogeneity of dengue is Panama. We find that clusters and risk vary widely across both time and space, with adjacent districts and years experiencing vastly different rates of transmission. While this may complicate response programs that operate under assumptions of spatially and temporally homogenous risk, it highlights the need for comprehensive risk projection studies which can identify particularly regions that most require attention. Additionally, our results specifically highlight the changes incurred by adding vector data into systematic dengue risk projections. In the model where Ae. aegypti presence and absence was accounted for, more than double the number of districts were contained in clusters than the model where Ae. albopictus presence and absence was accounted for. The Ae. aegypti model also contained the highest number of districts with a relative risk > 1, indicating more dengue cases than would be expected given baseline population levels. As an invasive species that has systematically replaced Ae. aegypti throughout numerous regions in its endemic range [15], Ae. albopictus has been spreading throughout Panama for the previous 13 years [13,14]. Globally, while Ae. albopictus has been implicated in several small outbreaks [39], the majority of dengue serotypes are thought to be transmitted by Ae. aegypti, due to its preference for both urbanized habitat [17,40] and human hosts [41,42].
Overall, based on our findings, we suggest that efforts be undertaken to further understand how the incorporation of vector surveillance data can affect risk projections, especially given the ubiquity of exploratory space-time scan statistics studies in the literature that did not utilize covariate surveillance data [26,43,44]. SATSCAN results can be useful in assessing risk when data granularity is low, as is often the case in developing regions with low-resource surveillance programs. Overall, these results can direct efforts to attain finer scale data in regions where general risk is deemed the greatest, at which point more confirmatory models can be applied, validating the role of certain covariates in space-time risk estimation. Balboa, for example, was identified as a cluster in all four models and had a steady presence of Ae. aegypti throughout the sample period as well as increasing presence of Ae. albopictus since 2006. This district is semi-urban with approximately 2400 people spread across 400km 2 area. Another district, San Miguelito in metropolitan Panama City, contained the most observed cases during our study period, despite being only 49.9km 2 . This district can be characterized by high density housing and residents of relatively low socioeconomic status, which has previously been linked to increased vector-borne disease risk [45][46][47][48]. The staggering number of cases should be a cause for concern, yet its small geographic area may facilitate public health interventions such as vector control and community education. Overall, now that the identification of high risk districts at the national scale has been completed and informed by vector presence, the subsequent step of illustrating the comparative characteristics of each district relative to dengue transmission risk can be undertaken. Understanding what caused Balboa and San Miguelito to experience such high relative risk, for example, is the next task necessary for adjusting public health interventions to effectively address the needs and conditions of each district.
Despite the longevity of our data and thoroughness of the surveillance efforts, there are clear considerations and limitations of our work which we would like to see addressed in future studies. First, we recognize that there is no way to statistically compare the outputs of the four models. This exploratory analysis was meant to illustrate that vector surveillance data can impact the location and duration of detected clusters, yet ranking the four models by predictive power is not possible with the statistical methods employed. Second, we recognize the spatial uncertainty by using districts for the analysis and our study is subject to the modifiable areal unit problem (MAUP) [44], however, this is an exploratory study that seeks to guide targeted interventions. Finer-level data can identify the locations (e.g. towns, blocks, homes) within each district that are experiencing the greatest burden of dengue, yet since such fine scale surveys cannot realistically be employed at the national-scale, studies such as ours can indicate where to direct future surveillance efforts. Third, it is possible that the reported cases of dengue in certain districts are travel cases (seeking treatment in a district different than actual residence), and while adjusting for the presence of Aedes can shed light on the districts where an individual is more likely to get infected with dengue, we cannot confirm that every patient was specifically infected in Panama. The lack of population data for more than one year across such a lengthy period is another limitation of this work. While the frequency of a census in Panama is on par with much of Latin America, this greatly impacts our ability to determine accurate prevalence rates year to year. Since linear interpolation is often inaccurate for non-linear trends like population growth rate, we would like to see more frequent population assessments conducted in regions where dengue is an ongoing risk, and while we understand that resources may not easily allow for this, the role of national census efforts in public health is often under-appreciated. An additional limitation originates with the vector surveillance methods employed. Values are reported as the number of houses containing larvae of each respective species. No information is given on the number of houses surveyed, and thus we were forced to transform the data into presence and absence. Had the total number of surveyed houses been reported, we would have been able to compute each district's infestation rate, which would have provided a scaled and more nuanced covariate in the model. Lastly, no data was available on short-term cross immunity and long-term serotype specific immunity, which may have incurred bias in the model parameter estimations.
Overall, it is key to recognize that adding vector surveillance data as a covariate changes the location, duration, and relative risk of dengue case clusters. As methods such as SaTScan are often used in mapping areas for public health intervention, we suggest that future efforts to utilize these tools in spatial vector epidemiology are conducted with an awareness that outputs can change when vector data is utilized as a covariate. Maps of clusters created without vector surveillance data may be missing key information that alters the distribution of the clusters in space-time, thus creating possible situations where public health efforts may not be planned in the areas which require it most. Although unadjusted cluster analysis is a valuable and commonly utilized tool for public health officials to identify high risk areas of vector-borne disease, our study illustrates the role that incorporating relevant covariates can play in altering the model output. While this has been demonstrated in cancer cluster studies [48,49], this is the first use of covariates in space-time cluster detection modelling of neglected tropical disease. With this comes potential to expand into other classes of covariates. For example, in addition to vector surveillance data, we support the incorporation of additional covariates such as vector genetic background, climate, vegetation, and land cover to dengue cluster models. Host population characteristics, such as housing density, relative isolation, and connectivity may also influence dengue risk in space-time scan statistic models. In general, the dengue transmission model contains numerous variables that we would have been interested in incorporating, had adequate data been available. Still, we demonstrate that vector surveillance clearly provides valuable information in the determination of virus case clusters, and thus should be conducted alongside virus surveillance so that it may be included in modelling efforts.