Table 1.
Relationship between observed and predicted values of WNV vector prevalence under random forest models that used the same data set for training and testing, and under models where ∼36% of the data was withheld from training and used as test data (36).
Figure 1.
Variable importance scores under random forest models for both ecological and economic predictors of West Nile virus prevalence in a West Nile virus hotspot.
Percent mean square error indicates the increase in error in out-of-bag samples when that variable is permuted, with higher increases indicative of more important variables. Negative changes in percent mean square error (as in 2004) suggest that the model performs better with random permutations of a variable than with its actual values, indicating a poor predictor. Too few vectors tested positive for West Nile virus in 2006 and 2007; thus, these years were excluded from analyses.
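The permutation score described in this caption can be illustrated with a minimal sketch. It is not the authors' analysis: the data, the fixed `toy_model`, and the function names are hypothetical, a plain held-in data set stands in for out-of-bag samples, and the score is expressed simply as the percent change in mean square error relative to the unpermuted baseline.

```python
import random

def mse(model, rows, targets):
    """Mean squared error of the model's predictions over all rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, column, n_repeats=20, seed=0):
    """Percent change in MSE after shuffling one predictor column."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    changes = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with the shuffled value in the chosen column
        permuted = [r[:column] + (v,) + r[column + 1:]
                    for r, v in zip(rows, shuffled)]
        changes.append(mse(model, permuted, targets) - baseline)
    return 100.0 * (sum(changes) / n_repeats) / baseline

# Hypothetical data: two predictors (income, noise); only income drives the target
rng = random.Random(1)
rows = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(200)]
targets = [2.0 * income + rng.gauss(0, 0.1) for income, _ in rows]

def toy_model(row):
    """Stand-in for a fitted model (assumption: it uses income only)."""
    return 2.0 * row[0]

income_score = permutation_importance(toy_model, rows, targets, column=0)
noise_score = permutation_importance(toy_model, rows, targets, column=1)
```

Here shuffling the informative predictor inflates the error sharply, while shuffling the ignored predictor leaves it unchanged. With a fitted model evaluated on held-out samples, an uninformative variable can also produce a slightly negative score by chance, which is the situation the caption describes.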
Figure 2.
Relationship between average per capita income and West Nile virus prevalence.
Results are shown for vectors in Orange County, California, for 2004, 2005, and 2008. Prevalence is measured as the maximum likelihood estimate (MLE). Dashed lines indicate the bifurcation between high and low prevalence values as determined by tree regressions. Horizontal lines indicate mean prevalence for points above and below this bifurcation (Wilcoxon rank-sum tests comparing these groups were significant for each year, p<0.001). Although absolute measures of WNV prevalence varied between years, relationships between predictors (per capita income in this case) and WNV prevalence were stable throughout the study period.
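As a rough illustration of how a tree regression places such a bifurcation, the sketch below finds the single split of a predictor that minimizes the summed squared error of the two group means. The values are invented for illustration (including the direction of the relationship), `best_split` is a hypothetical helper, and a full tree regression would apply this step recursively.

```python
def best_split(x, y):
    """Return (threshold, mean_below, mean_above) for the single split of x
    that minimizes the summed squared error around the two group means."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((v - mean_l) ** 2 for v in left)
               + sum((v - mean_r) ** 2 for v in right))
        if best is None or sse < best[0]:
            # Place the threshold midway between the neighbouring x values
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, mean_l, mean_r)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]

# Hypothetical values: income in thousands of dollars, prevalence as MLE
income = [20, 25, 30, 35, 60, 70, 80, 90]
prevalence = [9.0, 8.0, 10.0, 9.0, 2.0, 1.0, 3.0, 2.0]

threshold, mean_below, mean_above = best_split(income, prevalence)
```

In this toy case the threshold corresponds to the dashed line in the figure and the two means to the horizontal lines; the significance test between the groups is not reproduced here.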
Figure 3.
Spatial predictions of West Nile virus in vectors and human populations.
(A) Data layer representing per capita income across the county, collected as part of the 2000 U.S. National Census. (B) Predictions of WNV prevalence in vectors across the study area for 2008 based on the 2005 WNV prevalence model. Circles indicate observed WNV prevalence levels in 2008, using the same color codes as the predictions. (C) Predictions of WNV presence in human hosts in 2008 across the study area, determined using niche modeling (Maxent; 25). Scale bar is an approximation, as scale varies according to perspective.