Use of Wild Bird Surveillance, Human Case Data and GIS Spatial Analysis for Predicting Spatial Distributions of West Nile Virus in Greece

West Nile Virus (WNV) is the causative agent of a vector-borne, zoonotic disease with a worldwide distribution. Recent expansion and introduction of WNV into new areas, including southern Europe, has been associated with severe disease in humans and equids, and has increased concerns regarding the need to prevent and control future WNV outbreaks. Since 2010, 524 confirmed human cases of the disease have been reported in Greece, with greater than 10% mortality. Infected mosquitoes, wild birds, equids, and chickens have been detected and associated with human disease. The aim of our study was to establish a monitoring system based on wild bird and reported human case data using a Geographical Information System (GIS). The potential distribution of WNV was modelled by combining wild bird serological surveillance data with environmental factors (e.g. elevation, slope, land use, vegetation density, temperature, precipitation indices, and population density). Local factors, including low altitude and proximity to water, were important predictors of the appearance of both human and wild bird cases (Odds Ratio = 1.001, 95% CI = 0.723–1.386). Using GIS analysis, the identified risk factors were applied across Greece, identifying the northern part of Greece (Macedonia, Thrace), western Greece and a number of Greek islands as being at highest risk of future outbreaks. The results of the analysis were evaluated and confirmed using the 161 reported human cases of the 2012 outbreak, which the model predicted correctly (Odds = 130/31 = 4.194, 95% CI = 2.841–6.189), and more areas were identified for potential dispersion in the following years. Our approach verified that WNV risk can be modelled in a fast, cost-effective way, indicating high-risk areas where prevention measures should be implemented in order to reduce disease incidence.


Introduction
West Nile virus (WNV) is a mosquito-borne flavivirus with increasing numbers of reported human disease cases worldwide. In Europe, cases of WNV associated diseases have been reported in several countries in the European Union and in bordering Non-E.U. countries. The largest ongoing European outbreak has been observed in Greece, with more than 524 confirmed cases of human infection and 60 deaths reported since 2010 [1] (Figure 1).
Many studies have associated the presence of specific environmental factors with areas at high risk for WNV transmission in the USA [2,3,4,5] and Europe [6,7]. Tachiiri et al. (2006) developed a model using basic geographic and temperature data to assess WNV risk in British Columbia [8]. Ruiz et al. (2004) used several factors related to the physical environment, such as elevation range, physiographic region, and percentage of vegetation cover, to determine WNV risk during an outbreak in the Chicago area in 2002 [9]. Methods that have been used in WNV risk modeling include non-linear discriminant analysis [10], logistic [11] or multiple regression models [12], differential and difference equation modeling [13,14] and cluster analysis [15]. Predictive modeling with a Geographic Information System (GIS) can be used to analyze environmental determinants of WNV transmission and determine high-risk areas. Most previous WNV risk analyses utilized spatial statistical techniques (mapping clusters, geographic distribution, spatial relationships-regression models) to correlate environmental, climatic and socioeconomic factors with WNV prevalence [2,4,5,9,11,16].
The geographical position of Greece in the Mediterranean peninsula makes it an important transit zone for migratory birds [17]. Greece hosts a wealth of biological diversity, one of the richest in Europe and the Mediterranean.
The main objective of this study was to correlate serological data on the exposure of wild birds to WNV and reported human case data during the Greek outbreak with potential environmental risk factors within a GIS, in order to construct predictive maps identifying areas at risk from further spread. We test the predictive power of the models against recent outbreak data and identify high-risk areas for the application of targeted, timely and cost-effective prevention measures such as surveillance, mosquito control and campaigns to increase public awareness of the disease.

Ethics Statement
This study was co-funded by the «Integrated Surveillance and control programme for West Nile Virus and malaria in Greece» (National Strategic Reference Framework, 2007-2013) and was also part of the European Union Seventh Framework Programme (2007-2013) large collaboration project under grant agreement no. 222633 (Novel Technologies for Surveillance of Emerging and Re-emerging Infections of Wildlife - WildTech). The University of Nottingham is the project coordinator, and has received signed cooperation agreements from all the project partners for providing animal samples to the project. All samples used in this project represent material collected by partners and other organisations for purposes other than this project, as specified in deliverable D4.5/5.5 entitled "Guidelines for ethical sample collection" submitted to the European Commission (26/02/2010, Dissemination Level: PP, Restricted to other programme participants, including Commission Services). The avian samples were collected opportunistically (no active capture, killing and sampling of wild animals specifically for this study was performed) from animals hunter-harvested by members of the Greek Hunting Federation of Macedonia and Thrace, from species considered quarry and during the hunting seasons, according to the prerequisites of the Greek Legislation (FEK 1611/B9/2009, FEK 1183/6-8 is not a part of the study. The human data were part of the ongoing surveillance of human cases performed by the Hellenic Center for Disease Control and Prevention (HCDCP) and were reported by the treating physicians. Data on human cases are maintained in a database kept by the HCDCP that was completely anonymized to the authors and is not publicly available. Since residential addresses could be potentially identifying, the security of this database was maintained according to the national regulation for the confidentiality of human data.
The use of the residential addresses has been approved by the Institutional Review Board (IRB) of the Public Health and Environmental Hygiene post graduate course of the Laboratory of Hygiene and Epidemiology, Faculty of Medicine, University of Thessaly, Greece. The IRB waived the need for written informed consent to use the residential addresses, with the restriction that the analysis would be carried out at the municipality level.

Study Area
The study area comprised the entire country of Greece. Greece occupies the southeastern part of Europe with a total area of 131,990 km². Eighty percent of Greece consists of mountains; the country is characterized by a large climatic diversity (29 climatic zones according to the Thornthwaite classification), by its extensive coastline of about 15,000 km and by many island complexes in the archipelagos of the Aegean Sea and the Ionian Sea. Climatic conditions of the country are typically Mediterranean: summer is hot and dry while winter is usually mild. Rain mostly falls in autumn and winter.

WNV Human Cases Data
Reported human WNV cases in Greece (2010-2012) were provided by HCDCP. Most cases were serologically confirmed by the presence of IgM antibodies in the serum and/or the cerebrospinal fluid. Residential address of each human case was used for geocoding and mapping the cases.

Wild Birds Surveillance
A total of 620 avian serum samples were obtained from wild birds hunter-harvested by members of the Greek Hunting Federation of Macedonia and Thrace, from species considered quarry during the 2009/2010, 2010/2011 and 2011/2012 hunting seasons (from 20 August until 28 February the following year), according to the prerequisites of the Greek Legislation. All available samples were obtained from mainland Greece, opportunistically collected during regular hunting activities; samples were available from all 9 mainland regions of Greece (Table 1). Sampling effort was distributed across mainland Greece, avoiding cluster sampling biases, with the exception of the Central Macedonia region, the epicenter of the outbreak, from which a large number of samples were provided. Bird specimens that tested positive for WNV during the study were located in the field using handheld Global Positioning System (GPS) units or by means of longitude and latitude information provided by samplers. Serological screening was performed as already reported [18,19,20]; a total of 64 resident wild birds were found positive for WNV antibodies and were used in the current study (migratory wild birds were also found seropositive, but the relevant data were excluded from the analysis, see Discussion).

Environmental Variables
Environmental variables for this study were derived from three main database categories: climate, elevation and land cover data.
WorldClim version 1.4 climate data [21] were obtained from the WorldClim website (http://www.worldclim.org). WorldClim is a set of global climate layers (climate grids) with a spatial resolution of 1 square kilometer. Topographic variables including altitude, aspect and slope were extracted from a digital elevation model (DEM) with a spatial resolution of 1 square kilometer (http://srtm.csi.cgiar.org/Index.asp). Land uses were derived from the Corine Land Cover 2000 database (European Environment Agency - EEA, http://www.eea.europa.eu/data-and-maps).
Village and vegetation corrections were digitized from 2007 and 2009 color orthophotos that were available through Web Mapping Service (WMS) (http://gis.ktimanet.gr). To create environmental layers (n = 37) for the analysis (Table 2), ArcGIS 10.1 GIS software (ESRI, Redlands, CA, USA) was used. GIS layers were created to represent factors like the locations of towns and villages, distance to the nearest village, distance from water presence etc. For many of the above parameters, we calculated neighborhood statistics for radii of 100, 200, 500 and 1000 m to determine which spatial scale affects the presence of cases most strongly. These data sets were converted to a common projection, map extent and resolution prior to use in the modeling program.
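The neighborhood statistics described above were computed in ArcGIS; the idea can be illustrated with a minimal sketch of a focal mean over a circular window. The function name `focal_mean`, the toy raster and its 100 m cell size are assumptions made for illustration only and are not taken from the study.

```python
def focal_mean(grid, cell_size, radius):
    """Mean of all cells whose centres lie within `radius` of each cell centre.

    grid      -- list of equal-length rows holding numeric cell values
    cell_size -- cell edge length in metres (hypothetical value in the demo)
    radius    -- search radius in metres
    """
    rows, cols = len(grid), len(grid[0])
    reach = int(radius // cell_size)  # how many cells the window can span
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            for dr in range(-reach, reach + 1):
                for dc in range(-reach, reach + 1):
                    rr, cc = r + dr, c + dc
                    inside_grid = 0 <= rr < rows and 0 <= cc < cols
                    inside_circle = (dr * dr + dc * dc) * cell_size ** 2 <= radius ** 2
                    if inside_grid and inside_circle:
                        total += grid[rr][cc]
                        count += 1
            out[r][c] = total / count  # count >= 1: the cell itself always qualifies
    return out


# Toy raster (hypothetical values), evaluated at the radii named in the text.
toy = [[0, 0, 0],
       [0, 9, 0],
       [0, 0, 0]]
for radius in (100, 200, 500, 1000):
    print(radius, focal_mean(toy, cell_size=100, radius=radius)[1][1])
```

Comparing the resulting layers across radii is one way to see at which spatial scale a variable separates case locations best; edge cells are averaged over the part of the window that falls inside the raster.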

Statistical Analysis
We used data on 2010 and 2011 human cases for the statistical analysis and model building and kept the 2012 cases for verification. A total of 363 human WNV cases were reported in Greece for the years 2010 and 2011 (262 cases in 2010 and 101 cases in 2011). The available dataset consisted of presence-only data (presence: people infected by the virus). For this dataset, as well as for the wild bird seroprevalence dataset, a number of explanatory variables (n = 37) were collected and constructed, as mentioned previously (Table 2).
Instead of constructing a number of pseudo-absence controls, a methodology which according to the literature has some significant disadvantages for prediction modeling [22,23], we decided to search for variation of the explanatory variables within the presence data. We clustered the cases using the agglomerative method of Two Step Cluster Analysis, a method which allows for the utilization of both continuous and categorical variables and clusters the cases by measuring the log-likelihood distance among them [24]. The Two Step Cluster Analysis allowed us to check for a pattern of the virus among the infected people in 2010, in 2011, and in total. The optimal number of clusters was chosen using the Silhouette coefficient, a measure proposed by Kaufman and Rousseeuw (1990) [25]. The coefficient ranges from -1 to 1, and the closer its value is to 1, the more efficient the clustering is considered.
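As an illustration of the Silhouette coefficient, the following minimal sketch computes it for an already labelled set of points. It uses Euclidean distance purely for simplicity, which is an assumption: the analysis in this study relied on the log-likelihood distance of Two Step Cluster Analysis. The function names and the toy points are hypothetical.

```python
from math import dist


def silhouette_scores(points, labels):
    """Silhouette value s(i) = (b - a) / max(a, b) for every point.

    Requires at least two distinct labels.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist(p, q) for q in members) / len(members)
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def mean_silhouette(points, labels):
    """Average silhouette over all points; values near 1 mean compact, separated clusters."""
    s = silhouette_scores(points, labels)
    return sum(s) / len(s)


# Toy, already standardized values (hypothetical): two well separated groups.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
print(mean_silhouette(toy_points, [0, 0, 0, 1, 1, 1]))
```

Running the same function for several candidate numbers of clusters and keeping the one with the highest average value mirrors how the coefficient is used for model selection.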
Before applying the above cluster method, we checked a number of descriptive statistical measures which describe our data. Although Two Step Cluster Analysis is robust to non-normality [24], we used Factor Analysis in order to reduce the number of available variables and to achieve normality and zero correlation among the continuous explanatory variables. We used Principal Component Analysis (PCA) as the component extraction method, with Varimax rotation and Kaiser Normalization [26]. Two Step Cluster Analysis was iterated several times using as clustering variables either the components extracted by the Principal Component Analysis or the original variables which were highly correlated with the components. The extracted clusters for humans and the extracted clusters for birds were compared in terms of the variables that are important for clustering.

GIS Analysis
Two significant environmental variables were recognized from the statistical analysis (see Results) and were used to measure environmental conditions for the WNV locations of the seropositive wild birds and the human cases dataset. Mahalanobis distance (MD) [27] was used to develop a distance measure model for wild birds and predict WNV potential distribution prior to the expansion/outbreak of the 2012 period. We calculated MD with ArcGIS software, based on the values of the two significant variables, allowing us to identify suitable areas for WNV potential distribution and occurrence. Model performance evaluation was conducted with the 2012 reported WNV human cases, as provided by HCDCP.
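The Mahalanobis distance for two variables can be sketched as follows. This is only an illustration under stated assumptions, not the ArcGIS workflow used in the study: the function names and the presence values (altitude and distance from water, both in metres) are hypothetical.

```python
from math import sqrt


def mean_and_cov(samples):
    """Sample mean vector and 2x2 covariance matrix of (x, y) pairs."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis(point, mean, cov):
    """Distance sqrt(v^T S^-1 v) with v = point - mean, for a 2x2 covariance S."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Inverse of a 2x2 matrix written out explicitly.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return sqrt(max(d2, 0.0))


# Hypothetical presence locations: (altitude in m, distance from water in m).
presence = [(10, 200), (20, 150), (15, 300), (30, 250), (25, 100)]
mean, cov = mean_and_cov(presence)

# A candidate cell that resembles the presence locations scores a small distance,
# while a high, dry cell scores a large one.
print(mahalanobis((18, 210), mean, cov))
print(mahalanobis((800, 5000), mean, cov))
```

Applied to every raster cell, small distances mark environments similar to those where positive birds were found, which is how a suitability surface can be derived from presence-only data.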

Results
Data analysis demonstrates differences between 2010 and 2011 in terms of positive cases in Greece (Table 3). Fewer cases (n = 101) occurred in 2011 and the average case age was 5 years younger compared to 2010 (p-value = 0.024). There was a statistically different distribution in terms of the prefecture of residency of the positive cases, which were found in more southern areas compared to 2010, indicating the pathogen's continued spread in mainland Greece.
Finally, the distribution of the infected individuals in terms of date of infection was different in 2011, where more positive cases were found in July and September, compared to 2010 where the majority of the cases were found in August (64%).
Factor Analysis and Two Step Cluster Analysis revealed that altitude and distance from water were the two variables, among the 37 under study, which significantly clustered the cases. Both variables played a significant role in the clustering procedure. The two variables clustered in a similar way for both humans and birds. For the clustering of human cases, the average Silhouette coefficient was 0.5, which is considered a good clustering value [25]. The same value was achieved for the clustering of seropositive wild birds (Odds Ratio = 1.001, 95% CI = 0.723-1.386). Three clusters were created for humans and birds (Figure 2), sharing the same attributes. In particular, humans' Cluster A and birds' Cluster B contain the majority of the positive cases in humans and birds respectively (60%). There seems to be a pattern of WNV in Greece in places with low altitude and small distance from water. There are also two other clusters with lower percentages of cases, which show that positive cases are also found in places with low altitude and large distance from water (23-24%) and in places with high altitude and small distance from water (almost 17%).
Relevant box-plots (Figure 3) show how well the two variables discriminate in each cluster. A clear separation of the three clusters is seen in both groups of cases.
Regarding the 2010 human cases, clustering showed that low altitude and small distance from water were associated with the majority of the positive human cases as well. A total of 86.6% of the human cases were grouped in cluster A (Figure 2), which shares similar attributes with cluster A of human 2011 cases and cluster B of birds.
Table 1. Available avian samples: species, migratory status and number of samples per region.
The potential geographic distribution of WNV, predicted by GIS and MD based on the attributes of the major clusters of reported human cases of 2011 and seropositive wild birds is displayed in Figure 4. Fragmented high-risk areas were recognized: Most were concentrated in the Macedonian prefecture, in western Greece as well as in Thessaly. Other suitable high risk areas were located along the coast line of the Peloponnese peninsula and Crete. Moreover, many Greek islands have suitable environmental characteristics such as Rhodes, Mytilene, Chios, Samos etc.
In the early transmission period (June 2012) we reported the high-risk areas recognised throughout this study to the Ministry of Public Health and to HCDCP. As already reported, in 2012, a total of 161 laboratory-confirmed human cases were reported. Out of these 161 cases, only 31 occurred far from WNV high-risk areas recognised by our model (Odds = 130/31 = 4.194, 95% CI = 2.841-6.189); that is, about four out of every five human cases were reported in recognised high-risk areas while only one out of five was not. New areas of potential dispersion of the virus are also suggested for the following years in the areas of Thrace, the Peloponnese peninsula and several Greek islands (Figure 4).
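The odds quoted above follow directly from the counts of 2012 cases inside (130) and outside (31) the predicted high-risk areas. The sketch below reproduces that arithmetic and adds a Wald-type interval on the log-odds scale. The interval method is an assumption, since the text does not state how the reported interval was obtained, so the bounds should be close to, but need not exactly match, the published 2.841-6.189.

```python
from math import exp, log, sqrt


def odds_with_wald_ci(hits, misses, z=1.96):
    """Odds hits/misses with a Wald confidence interval on the log scale.

    hits   -- cases that fell inside predicted high-risk areas
    misses -- cases that fell outside them
    z      -- normal quantile (1.96 for an approximate 95% interval)
    """
    odds = hits / misses
    se = sqrt(1 / hits + 1 / misses)  # standard error of log(odds)
    lower = exp(log(odds) - z * se)
    upper = exp(log(odds) + z * se)
    return odds, lower, upper


odds, lower, upper = odds_with_wald_ci(130, 31)
share_inside = 130 / (130 + 31)  # proportion of cases in high-risk areas
print(f"odds = {odds:.3f}, approx. 95% CI = {lower:.3f}-{upper:.3f}")
print(f"share of cases inside high-risk areas = {share_inside:.1%}")
```

The share of roughly 81% is what the text summarises as four out of five cases.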

Discussion
Humans and other mammals, particularly horses, are alternative hosts for WNV; the main route of infection is through the bite of an infected mosquito. Most human infections remain asymptomatic, with WNV fever developing in approximately 20% of infected people and West Nile neuroinvasive disease in <1% [28]. Horses and humans develop low viremic loads (<10⁵ PFU/ml) of short duration and thus are considered dead-end hosts for WNV [29]. In contrast, various migratory and resident avian species develop high viremic loads, sufficient to infect feeding ornithophilic mosquitoes [30]. Hence, the WNV life cycle is maintained with birds being the main amplifying hosts and mosquitoes the main vectors. Moreover, local movements of resident birds and long-range travel of migratory birds may contribute to pathogen dispersion [31,32]. In southern France, WNV was detected in late summer of 2000 and 2004. Migratory passerines were found with higher prevalence of WNV neutralizing antibodies (7.0%) than resident and short-distance migratory passerines (0.8%), suggesting exposure to WNV or a related flavivirus during overwintering in Africa [33]. Additionally, in Spain it was found that trans-Saharan migrant species had both higher prevalence and antibody titres than resident and short-distance migrants [34].
In Greece, the disease first appeared in Macedonia prefecture in 2010, with 262 confirmed cases and 35 deaths, and it subsequently spread through mainland Greece in the following years. More specifically, in 2011, the outbreak expanded southwards to central Greece with 101 confirmed cases and 9 deaths, while in 2012, a total of 161 confirmed cases and 18 deaths were reported, mainly in Attica and northeastern Greece. A strain of lineage 2 was detected in 2010 in pools of Culex mosquitoes [35] and in wild birds [20]. In this study we correlate various environmental factors with WNV maintenance, amplification and potential for future spread in Greece.
Various public health studies have used Geographical Information System technologies as a tool for data analysis [2,9,15]. Previous studies [9,10] found that certain social and environmental factors were correlated with WNV dissemination patterns: The presence of vegetation, distance to a WNV positive dead bird, the intensity of mosquito abatement, demographic factors such as population age, race and financial status. Low precipitation and warm temperature were also found to associate with WNV cases. On the other hand, spread of WNV has shown some unique distribution patterns in different regions [15,30].
Before reaching the aforementioned results, we undertook several efforts to find a pattern, or a distinguishable attribute, of the WNV positive cases in Greece. Although we used a significant number of explanatory variables for describing each positive case (a mix of both continuous and categorical variables), there was no indication that these altogether could show the pattern in question. Therefore we tried to reduce this dataset by using Factor Analysis. We ran PCA once for the temperature variables and once for the precipitation variables. Two components (93% of the variation was explained) for the temperature and two components (96% of the variation) for the precipitation variables were extracted, which means that the fit was very good. These four variables, together with the remaining demographic and environmental variables, were used in the Two Step Cluster method. This method was preferred over other clustering techniques because it can handle both categorical and continuous variables. However, we also ran hierarchical cluster analysis using only the continuous variables, but no pattern of the positive human cases was revealed. Therefore, we used the Two Step Cluster technique in a backward selection manner. Initially we used all the explanatory variables together, and we removed one at a time if the Silhouette indicator was not considered good. We used the log-likelihood distance instead of the Euclidean distance, because there were initially categorical variables in the dataset. However, when only the two continuous variables were left ("altitude" and "distance from water"), we also checked whether the Euclidean distance could reveal the same pattern as the log-likelihood distance, but it did not. We believe the fact that some of the variables were not significant for the clustering procedure was due to similar environmental conditions existing in Greece during summer. For example, there is no significant variation in terms of temperature or precipitation.
This is why more stable variables like distance from water and altitude were responsible for the form of the clusters. After we formed the three clusters with these two variables, this pattern was revealed for both human cases and resident wild birds seroprevalence data.
Distance to water and altitude have both been previously shown to be negatively correlated with mosquito larval presence [36]; mosquitoes are the main biological vectors of WNV and transmission of this arthropod-borne virus is highly dependent on the density of mosquitoes. Low lying areas in close proximity to water include wetland habitats that are used as resting and breeding areas for various migratory and resident birds, allowing the long-distance introduction of the virus via migration routes as well as the rapid local amplification of the virus in a mosquito-bird cycle. In this study, apart from statistically identifying proximity to water and altitude as risk factors of spread of WNV in Greece, we were able to determine specific mean values for these habitat variables that allowed us to predict areas at high-risk for further disease incursion.
WNV positive birds are considered important environmental predictors of WNV human risk and are used in surveillance and risk assessment [2,9,37]. Whilst viremic birds are likely to represent the highest risk to humans, the viremic phase is extremely short, restricting data richness and thus statistical power. Hence we focused our analysis on longer-lived serological measures of exposure. Moreover, the use of only the resident WNV seropositive wild birds from all hunter-harvested samples available, even though samples from migratory birds were also found positive, increased the reliability of our analysis by avoiding biases regarding the area of exposure: migratory birds travel long distances, so the origin of exposure is hard to determine. Hence, this is a good example of a case in which surveillance of exposure and other similar biological data derived from nature, regarding a zoonosis, can be used as an indicator for predicting high-risk areas. This was confirmed by the good fit that our model showed for the 2012 positive WNV human cases in Greece.

Conclusions
Modelling results indicated that positive resident wild bird occurrences are correlated with human WNV risk and can facilitate the assessment of environmental variables that contribute to that risk, recognising new high-risk areas where the disease could further spread. Our approach allowed us to create a risk based mapping system to assist and guide WNV disease surveillance, monitoring and control. This risk based approach offers a way to stratify surveillance efforts and resources to improve the efficiency of surveillance for new outbreaks and monitoring existing outbreaks. Furthermore, it could proactively enhance other preventive efforts and educational campaigns for the general public in the not yet ''affected'' areas. Most importantly, early warning and identification of outbreaks is critical to limiting the animal and human losses to this disease. An active surveillance program undertaken on resident wild birds could be added to active and passive surveillance focused on humans, horses and mosquitoes greatly helping in evaluating and dealing with future outbreaks linked to flaviviruses.