
Better null models for assessing predictive accuracy of disease models

  • Alexander C. Keyel ,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    alexander.keyel@health.ny.gov

    Affiliations Division of Infectious Diseases, Wadsworth Center, New York State Department of Health, Albany, NY, United States of America, Department of Atmospheric and Environmental Sciences, University at Albany, SUNY, Albany, NY, United States of America

  • A. Marm Kilpatrick

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Supervision, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA, United States of America

Abstract

Null models provide a critical baseline for evaluating predictive disease models. Many studies consider only the grand mean null model (i.e., R2) when evaluating a model's predictive ability, which is insufficient to convey its predictive power. We evaluated ten null models for human cases of West Nile virus (WNV), a zoonotic mosquito-borne disease introduced to the United States in 1999. The Negative Binomial, Historical (i.e., using previous cases to predict future cases), and Always Absent null models were the strongest overall, and the majority of null models significantly outperformed the grand mean. The length of the training time series increased the performance of most null models in US counties where WNV cases were frequent, but improvements were similar across null models, so relative scores remained unchanged. We argue that a combination of null models is needed to evaluate the forecasting performance of predictive models for infectious diseases, and that the grand mean is the lowest bar.

Introduction

Forecasting infectious disease dynamics is a key challenge for the 21st century [1]. Climate and land use change, combined with the introduction of pathogens to new regions, has created an urgent need for predicting future disease threats [2]. Large data sets and new modeling and statistical techniques have opened up possibilities for ecological forecasting [3]. A key step in the evaluation of predictive models is assessing their improvement over null models. The use of null models to provide a baseline in the absence of specific mechanisms has a long history in ecology [4]. Such baselines are important, as in some cases, predictive models may appear to be informative, but may be no better than a simple and uninformative null model [5, 6]. For example, when dealing with rare events, if a predictive model is outperformed by a null model that predicts the event to never occur, it is not providing much useful information about the process being studied [5].

West Nile virus (WNV) is an excellent system in which to examine null models in a probabilistic context. WNV is a flavivirus that cycles between mosquito and avian populations [7–9]. WNV was introduced to the United States (US) in 1999 [10] and rapidly spread across the conterminous US and throughout the Americas [11]. Because WNV is a nationally notifiable disease in the US, long-term data sets (>20 years) exist on human cases [12]. Many models have been built for predicting WNV risk [13], including mechanistic models based on climate and vector data sets [e.g., 14, 15]. Most studies of WNV, and of many other pathogens, have included only a very simplistic null model (e.g., R2, which uses the grand mean of the training data) for assessing model accuracy.

Our aim was to examine a range of null models (Table 1) to provide guidance on null model selection and performance in disease forecasting for locations with frequent (≥50% of years with disease) and infrequent cases (disease present, but <50% of years). We tested 10 null models using the number of WNV cases in each county in the US in each year in a probabilistic framework. Where cases were frequent and time series were long, we hypothesized the Negative Binomial model would perform the best due to its ability to model count distributions with a variable rate parameter. Where cases were infrequent and time series were short, we predicted that no models would significantly outperform the Always Absent model.

Table 1. A short description of the ten null models examined.

https://doi.org/10.1371/journal.pone.0285215.t001
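To make the flavor of these nulls concrete, here is a minimal Python sketch of the predictive distributions for three of them. The paper's own implementations are in R (the probnulls package); the method-of-moments fit and gamma-Poisson sampling for the Negative Binomial below are our illustrative choices, not necessarily the authors'.

```python
import math
import random

def always_absent(train, n_draws=100):
    # Predict zero cases every year, regardless of history.
    return [0.0] * n_draws

def mean_value(train, n_draws=100):
    # Predict the county's training-period mean (fractional cases allowed).
    m = sum(train) / len(train)
    return [m] * n_draws

def poisson_draw(lam):
    # Knuth's algorithm; adequate for the small case counts seen here.
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def negative_binomial(train, n_draws=100):
    # Method-of-moments fit, sampled via the gamma-Poisson mixture.
    m = sum(train) / len(train)
    if m == 0:
        return [0.0] * n_draws          # all-zero history: predict absence
    v = sum((x - m) ** 2 for x in train) / max(len(train) - 1, 1)
    if v <= m:
        # No overdispersion detected: fall back to Poisson(m) draws.
        return [poisson_draw(m) for _ in range(n_draws)]
    shape = m * m / (v - m)             # NB size parameter
    scale = (v - m) / m                 # gamma scale, so E[lambda] = m
    return [poisson_draw(random.gammavariate(shape, scale))
            for _ in range(n_draws)]
```

Each function returns an ensemble of draws representing a predictive distribution, which is the form a probabilistic score such as the CRPS can evaluate directly.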

Material and methods

Data set

We compared the accuracy of 10 null models using the CDC neuroinvasive WNV case records (Source: ArboNET, Arboviral Diseases Branch, Centers for Disease Control and Prevention, Fort Collins, Colorado; contact the CDC for data access). This national data set contains the number of WNV neuroinvasive disease cases in each county in each year for 3108 counties in the conterminous United States (US) from 2000–2021. We used neuroinvasive cases because their detection varies less across states than detection of WNV fever cases.

We divided the data set into two groups: 159 counties that had 11 or more years with WNV (frequent WNV set), and 1880 counties that had 1–10 years with WNV (infrequent WNV set). The 1069 counties that never had WNV cases were excluded to avoid zero-inflation. The first year a state reported a case of WNV (per [9]) was used as the first year of training data for all counties within that state. As a result, the number of counties included in the analysis increased over time (Table 2). Model predictions were made using at least 4 years of training data. We used the Continuous Ranked Probability Score, a probabilistic scoring approach that can evaluate a distribution of predicted outcomes (Fig 1). Population data for each year for the incidence-based null models came from the United States Census Bureau [19, 20]. Population data from 2019 were also used for 2020 and 2021, due to missing data for those years.

Table 2. Number of counties used for the null model analysis by prediction year for 2004–2021.

After 2008, the sample size remained constant for all following years as all states had the minimum 4 years of training data with WNV by that point.

https://doi.org/10.1371/journal.pone.0285215.t002

We also tested whether our model results were sensitive to the length of the time series for selected models. Model years were selected at random (without replacement) to use as training data to predict a randomly selected focal year. This allowed us to disentangle the length of the time series from the specific order of observations. Models that required a temporal structure (i.e., Prior Year and Autoregressive) were excluded from this analysis. The Incidence and Pooled Incidence models were also excluded because they used the prior year's population for converting incidence to case counts. Only data from 2005 and later were used in this analysis, to ensure that WNV was already established in all counties.
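The resampling scheme can be sketched in a few lines of Python; the function name and interface here are ours for illustration, not part of probnulls.

```python
import random

def sample_training_and_focal(years, n_train):
    """Pick a random focal year to predict, then draw n_train distinct
    training years (without replacement) from the remaining years.
    Discarding temporal order is what lets this analysis separate the
    length of the time series from the specific order of observations,
    and why temporally structured nulls (Prior Year, Autoregressive)
    cannot be included."""
    years = list(years)
    focal = random.choice(years)
    pool = [y for y in years if y != focal]
    return random.sample(pool, n_train), focal
```

For example, `sample_training_and_focal(range(2005, 2022), 5)` draws 5 training years and one focal year from 2005–2021.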

Null models

The ten null models are described in Table 1. When stratifying by county, using case counts versus incidence makes little difference to the outcome for the Mean Value, Incidence, Prior Year, and Historical null models: population is relatively consistent from one year to the next, so counts and incidence can be interconverted by multiplying or dividing by population. However, the choice of incidence or case counts does lead to different outcomes when pooling across counties with different population sizes, or when working with count-based models such as the Negative Binomial.

Scoring method

We used the Continuous Ranked Probability Score (CRPS), which is a proper scoring method [21, 22]. A proper scoring method is one that, in the long run, assigns a better score to a better model. We chose the CRPS rather than the Logarithmic Score because the former scores forecasts based on the distance from each predicted probability to the observation, whereas the latter only scores whether an observation falls within a bin or outside of it, with no consideration of how far outside the bin the prediction was [23–25]. Because CRPS scores are based on distance from the observed value, the models above were allowed to predict fractional cases of WNV (e.g., a mean of 2.5 cases was used as the prediction rather than being rounded to the nearest whole number). For null models that required sampling from a probability distribution, we used 100 random draws. Data analyses were performed in R [26]. Code for running the null models is available via the probnulls package on GitHub (www.github.com/akeyel/probnulls/R/NullModels.R).
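For an ensemble of draws, the CRPS can be estimated directly from the samples via the standard identity CRPS(F, y) = E|X − y| − ½E|X − X′|. Below is a minimal Python sketch of that estimator; the paper itself used the R scoringRules machinery, not this code.

```python
def crps_ensemble(draws, observed):
    """Empirical CRPS for a forecast represented by sample draws.
    First term: mean distance from each draw to the observation.
    Second term: half the mean pairwise distance among draws, which
    rewards sharp forecasts. Lower scores are better; for a single
    point forecast the CRPS reduces to the absolute error."""
    m = len(draws)
    dist_to_obs = sum(abs(x - observed) for x in draws) / m
    spread = sum(abs(x - y) for x in draws for y in draws) / (m * m)
    return dist_to_obs - 0.5 * spread
```

A point forecast of 2.5 cases against an observed 4 scores |4 − 2.5| = 1.5, while a two-member ensemble {0, 1} against an observed 0 scores 0.25, illustrating how distance from the observation, rather than bin membership, drives the score.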

Length of training time series

We examined how the length of the time series for training each null model affected its prediction using the mean of 10 CRPS scores of each of six null models (Always Absent, Pooled Mean Value, Mean Value, Historical, Uniform, and Negative Binomial) for each of 13 different training time series lengths (5 to 17 years). We regressed the mean CRPS score against the training time series length and included the null model as an additional predictor. We compared additive and interaction models of training time series length and null model by AIC. We performed the analysis separately for high and low incidence counties. We show the slopes and statistics for each null model using the lstrends() function from the emmeans package in R.
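The per-model trends that lstrends() reports can be mimicked with a simple per-model least-squares slope. The following Python sketch is our illustration of the shape of that computation, not the emmeans analysis itself:

```python
def ols_slope(xs, ys):
    # Ordinary least-squares slope of y on x.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def trend_by_model(records):
    """records: iterable of (null_model, training_length, mean_crps)
    tuples. Returns each model's slope of mean CRPS against training
    time series length; a negative slope means the score improves
    (CRPS falls) as the training series lengthens."""
    grouped = {}
    for model, length, score in records:
        grouped.setdefault(model, ([], []))
        grouped[model][0].append(length)
        grouped[model][1].append(score)
    return {m: ols_slope(xs, ys) for m, (xs, ys) in grouped.items()}
```

Fitting separate slopes per model corresponds to the interaction model of training length and null model described above; the additive model would instead force a single shared slope.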

Results

The Mean Value null model (R2) that is frequently used as the baseline for prediction accuracy was among the weakest of the null models (Fig 2). It performed worse than 5 null models (significantly worse than 4; Fig 2) for frequent WNV counties and worse than 6 null models (significantly worse than 5) for infrequent WNV counties (Fig 2). In contrast, the Negative Binomial null model was significantly better than the other null models for predicting neuroinvasive cases of WNV in the frequent WNV analysis (Fig 2A; paired t-tests with a Holm correction for multiple comparisons [27]). The Negative Binomial was also significantly better than eight other null models (all except the Always Absent model, which was equally accurate) in the infrequent WNV analysis (Fig 2B). The Negative Binomial was the top model in 8 of 18 individual years for the frequent WNV analysis and in 9 of 18 years for the infrequent WNV analysis (Table 3). The Historical null model also performed very well in both frequent and infrequent WNV analyses across all years (Fig 2), and outperformed all other models in 5 individual years for frequent WNV counties (Table 3). Finally, in the infrequent WNV analysis, the Always Absent model was tied for best across all years combined, outperformed seven other models (Fig 2B), and was the best model in 8 individual years (Table 3).

Fig 2.

Continuous Ranked Probability Scores (CRPS) for 2004–2021 for 10 null models for a) counties with WNV in at least 50% of the time series (“Frequent”) and b) counties with at least one case, but cases in < 50% of the time series (“Infrequent”). The mean CRPS score was calculated across all counties for each model and year; the plot shows the median of these mean annual CRPS scores by model, with boxes spanning the 25th and 75th percentiles, whiskers extending to +/- 1.5 times the interquartile range, and circles marking outliers beyond this range. Different letters indicate significant differences between models at α = 0.05 after a sequential Bonferroni correction for multiple comparisons [27].

https://doi.org/10.1371/journal.pone.0285215.g002

Table 3. Frequency of each model having the lowest CRPS score for a year in counties where WNV is frequent (>50% of time series) or infrequent (present < 50% of the time series).

https://doi.org/10.1371/journal.pone.0285215.t003

The length of the training time series had only weak effects on null model performance (Fig 3). For frequent WNV counties, model score improved significantly with the length of the training time series for four of the six models examined, but the effect was similar for all four models (Table 4). For the two remaining models, increasing the length of the training time series produced a non-significant improvement in score (Pooled Mean Value) or made the score worse (Uniform) (Table 4). For infrequent WNV counties, the mean CRPS score did not improve significantly with the length of the training time series for any model, and became significantly worse for the Uniform null (Table 5). Thus, except for the Uniform null, the relative rankings of null models were the same across the full range of time series lengths examined (5 to 17 years; Fig 3).

Fig 3. The Negative Binomial and Historical were generally the top two models (lower CRPS scores correspond to a more accurate model), independent of length of time series used to train the models for both a) the counties with frequent WNV and b) the counties with infrequent WNV cases.

Training years were randomly selected from the entire time series, and a random focal year was selected for evaluation. Only a subset of null models was evaluated over time. Shading indicates a 95% confidence interval for the estimated mean. AA: Always Absent, HN: Historical Null, MV: Mean Value, NB: Negative Binomial, PV: Pooled Value, UN: Uniform.

https://doi.org/10.1371/journal.pone.0285215.g003

Table 4. Analysis of the length of the training time series on the mean CRPS score for six null models for frequent WNV counties (Fig 3A).

A model with an interaction between null model and time series length had more support than an additive model (ΔAIC = 21, see S1 Table in S1 File for detailed parameter estimates). The table shows the statistics for the slopes for each model (not differences between slopes).

https://doi.org/10.1371/journal.pone.0285215.t004

Table 5. Analysis of the length of the training time series on the mean CRPS score for six null models for infrequent WNV counties (Fig 3B).

A model with an interaction between null model and time series length had more support than an additive model (ΔAIC = 50, see S2 Table in S1 File for detailed parameter estimates). The table shows the statistics for the slopes for each model (not differences between slopes).

https://doi.org/10.1371/journal.pone.0285215.t005

Discussion

At least five null models significantly outperformed a county-based grand mean, and many did far better (Figs 2 and 3). A grand mean calculated across all included counties (Pooled Mean model) performed even worse. Thus, when evaluating the performance of new statistical or mechanistic models of disease incidence, there are far better null models than the grand mean (i.e., R2). These null models can be easily calculated for time-series data (e.g., using the probnulls package for R from GitHub), and our results suggest that the length of the time series was not critical for developing a robust null model across the range of training lengths examined (5–17 years). The Negative Binomial and Historical nulls were the strongest null models overall (Fig 2), with the Always Absent null performing well where disease cases were infrequent. The strong performance of the Always Absent null in regions where WNV was infrequent (statistically tied with the Negative Binomial, Fig 2; top model in 8 of 18 years, Table 3) is a reminder that basic accuracy statistics for rare events can appear high.

The structure and scale of the underlying data may affect the performance of the different null models. The WNV data set used here does not have a clear temporal trend; a strong temporal trend would likely have changed which model performed best. Specifically, null models that use the recent past to predict future cases (e.g., autoregressive models) would perform much better. Seasonal patterns, as examined in recent dengue forecasts [1], could also affect which null model performs best. Future work could explore the performance of different models under different magnitudes of temporal trend and stochastic variation. Just over a third (34%) of counties in the US did not have a neuroinvasive case within the study period. For risk estimates in these counties, fitting models on groups of counties may be necessary [e.g., as in 28]. Additionally, county-annual scales may be more relevant to academic study than to vector control and public health responses [29]. Research on null model performance is needed at finer spatial and temporal scales.

Broadly, null models are seeing increased use in the infectious disease modeling literature. A uniform model and a SARIMA model were used to predict dengue cases as part of a forecasting challenge in Puerto Rico [1]. A random walk and a probabilistic prior-week model were used as null models for forecasting COVID-19 deaths [30], and a modification of a simple AR(1) model was found to perform well for predicting COVID-19 hospitalizations [31, 32].

Conclusion

We strongly recommend the inclusion of multiple null models when testing predictive models of vector-borne diseases. A grand mean calculated from the training data set is an inadequate null model given the suite of probabilistic alternatives available. The Negative Binomial and Historical nulls performed especially well for WNV; simple autoregressive models performed moderately well and would likely perform even better for data with temporal trends. The Negative Binomial and Historical null models performed well both when WNV cases were frequent and when they were infrequent, and their relative performance did not depend on the length of the training time series. Researchers proposing mechanistic models should determine whether their models improve on a simple statistical description of historical patterns.

Supporting information

S1 File. Two tables containing full parameter details for the time series length analysis for counties with frequent (S1 Table) and infrequent (S2 Table) WNV cases.

https://doi.org/10.1371/journal.pone.0285215.s001

(DOCX)

Acknowledgments

We thank L. F. Chaves for constructive discussion.

References

  1. Johansson MA, Apfeldorf KM, Dobson S, Devita J, Buczak AL, Baugher B, et al. An open challenge to advance probabilistic forecasting for dengue epidemics. Proceedings of the National Academy of Sciences. 2019;116: 24268–24274. pmid:31712420
  2. Kilpatrick AM, Randolph SE. Drivers, dynamics, and control of emerging vector-borne zoonotic diseases. Lancet. 2012;380: 1946–1955. pmid:23200503
  3. Dietze M. Ecological Forecasting. Princeton University Press; 2017.
  4. Gotelli NJ, Graves GR. Null models in ecology. Smithsonian Institution Press; 1996.
  5. Olden JD, Jackson DA, Peres-Neto PR. Predictive Models of Fish Species Distributions: A Note on Proper Validation and Chance Predictions. Transactions of the American Fisheries Society. 2002;131: 329–336.
  6. Beale CM, Lennon JJ, Gimona A. Opening the climate envelope reveals no macroscale associations with climate in European birds. Proceedings of the National Academy of Sciences. 2008;105: 14908–14912. pmid:18815364
  7. Work TH, Hurlbut HS, Taylor R. Indigenous Wild Birds of the Nile Delta as Potential West Nile Virus Circulating Reservoirs. The American Journal of Tropical Medicine and Hygiene. 1955;4: 872–888. pmid:13259011
  8. Komar N, Langevin S, Hinten S, Nemeth N, Edwards E, Hettler D, et al. Experimental infection of North American birds with the New York 1999 strain of West Nile virus. Emerging Infectious Diseases. 2003;9: 311. pmid:12643825
  9. Kilpatrick AM. Globalization, land use, and the invasion of West Nile virus. Science. 2011;334: 323–327. pmid:22021850
  10. Lanciotti RS, Roehrig JT, Deubel V, Smith J, Parker M, Steele K, et al. Origin of the West Nile virus responsible for an outbreak of encephalitis in the northeastern United States. Science. 1999;286: 2333–2337. pmid:10600742
  11. Kramer LD, Ciota AT, Kilpatrick AM. Introduction, Spread, and Establishment of West Nile Virus in the Americas. Journal of Medical Entomology. 2019;56: 1448–1455. pmid:31549719
  12. CDC. Nationally notifiable arboviral diseases reported to ArboNET: Data release guidelines. Centers for Disease Control and Prevention; 2019.
  13. Barker CM. Models and Surveillance Systems to Detect and Predict West Nile Virus Outbreaks. Journal of Medical Entomology. 2019;56: 1508–1515. pmid:31549727
  14. Davis JK, Vincent GP, Hildreth MB, Kightlinger L, Carlson C, Wimberly MC. Improving the prediction of arbovirus outbreaks: A comparison of climate-driven models for West Nile virus in an endemic region of the United States. Acta Tropica. 2018;185: 242–250. pmid:29727611
  15. DeFelice NB, Schneider ZD, Little E, Barker C, Caillouet KA, Campbell SR, et al. Use of temperature to improve West Nile virus forecasts. PLOS Computational Biology. 2018;14. pmid:29522514
  16. Smith KH, Tyre AJ, Hamik J, Hayes MJ, Zhou Y, Dai L. Using Climate to Explain and Predict West Nile Virus Risk in Nebraska. GeoHealth. 2020;4: e2020GH000244. pmid:32885112
  17. Venables WN, Ripley BD. Modern Applied Statistics with S. 4th ed. New York: Springer; 2002. Available: http://www.stats.ox.ac.uk/pub/MASS4/
  18. Ripley BD. Time series in R 1.5.0. R News. 2002;2: 2–7.
  19. US Census Bureau. Intercensal estimates of the resident population for counties and states: April 1, 2000 to July 1, 2010. Suitland, MD: US Census Bureau. Retrieved from: https://www.census.gov/data/datasets/time-series/demo/popest/intercensal-2000-2010-counties.html. 2017.
  20. US Census Bureau. Population, Population Change, and Estimated Components of Population Change: April 1, 2010 to July 1, 2019 (CO-EST2019-alldata). Suitland, MD: US Census Bureau. Retrieved from: https://www.census.gov/data/tables/time-series/demo/popest/2010s-counties-total.html. 2019.
  21. Jordan A, Krüger F, Lerch S. Evaluating Probabilistic Forecasts with scoringRules. Journal of Statistical Software. 2019;90: 1–37.
  22. Bracher J, Ray EL, Gneiting T, Reich NG. Evaluating epidemic forecasts in an interval format. PLOS Computational Biology. 2021;17: e1008618. pmid:33577550
  23. Matheson JE, Winkler RL. Scoring rules for continuous probability distributions. Management Science. 1976;22: 1087–1096.
  24. Hersbach H. Decomposition of the continuous ranked probability score for ensemble prediction systems. Weather and Forecasting. 2000;15: 559–570.
  25. Wilks DS. Statistical Methods in the Atmospheric Sciences. Academic Press; 2011.
  26. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2017. Available: https://www.R-project.org/
  27. Holm S. A Simple Sequentially Rejective Multiple Test Procedure. Scandinavian Journal of Statistics. 1979;6: 65–70.
  28. Keyel AC. Patterns of West Nile virus in the Northeastern United States using negative binomial and mechanistic trait-based models. medRxiv. 2022; 2022.11.09.22282143.
  29. Keyel AC, Gorris ME, Rochlin I, Uelmen JA, Chaves LF, Hamer GL, et al. A proposed framework for the development and qualitative evaluation of West Nile virus models and their application to local public health decision-making. PLOS Neglected Tropical Diseases. 2021;15: e0009653. pmid:34499656
  30. Cramer EY, Ray EL, Lopez VK, Bracher J, Brennen A, Castro Rivadeneira AJ, et al. Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the United States. Proceedings of the National Academy of Sciences. 2022;119: e2113561119. pmid:35394862
  31. Olshen AB, Garcia A, Kapphahn KI, Weng Y, Vargo J, Pugliese JA, et al. COVIDNearTerm: A simple method to forecast COVID-19 hospitalizations. Journal of Clinical and Translational Science. 2022;6: e59. pmid:35720970
  32. White LA, McCorvie R, Crow D, Jain S, León TM. Assessing the accuracy of California county level COVID-19 hospitalization forecasts to inform public policy decision making. medRxiv. 2022; 2022.11.08.22282086.