Prediction of dengue annual incidence using seasonal climate variability in Bangladesh between 2000 and 2018

The incidence of dengue has increased rapidly in Bangladesh since 2010, with an outbreak in 2018 reaching a historically high number of cases (10,148). A better understanding of the effects of climate variability before dengue season on the increasing incidence of dengue in Bangladesh can enable early warning of future outbreaks. We developed a generalized linear model to predict the number of annual dengue cases based on monthly minimum temperature, rainfall and sunshine prior to dengue season. Variable selection and leave-one-out cross-validation were performed to identify the best prediction model and to evaluate the model’s performance. Our model successfully predicted the largest outbreak in 2018, with 10,077 cases (95% CI: [9,912–10,276]), in addition to smaller outbreaks in five different years (2003, 2006, 2010, 2012 and 2014), and successfully identified the increasing trend in cases between 2010 and 2018. We found that temperature was positively associated with the annual incidence during the late winter months (between January and March) but negatively associated during the early summer (between April and June). Our results might suggest an optimal minimum temperature for mosquito growth of 21–23°C. This study has implications for understanding how climate variability has affected recent dengue expansion in neighbours of Bangladesh (such as northern India and Southeast Asia).


Background
Dengue fever, one of the most prevalent vector-borne diseases, has led to significant socio-economic costs in many parts of the world [1]. Three-quarters of global dengue cases occur in Southeast Asian and western Pacific countries, where weather conditions favor mosquito population expansion [2,3]. Outbreaks of dengue fever can significantly reduce life expectancy because severe dengue can develop following secondary infections with different dengue serotypes [4]. Therefore, it is critically important to understand the impacts of climate on the spread of dengue in these regions, as this understanding can serve as the basis for an early warning system and enable preventative measures to be put in place before outbreaks become established.
Expansion of dengue in the regions surrounding northern India may have occurred in recent years due to climate change. The prolonged rainy seasons and increasing temperatures in subtropical regions of Southeast Asia may provide favorable conditions for expansion of Aedes mosquito populations, the dengue vector [5][6][7][8]. In addition, increased incidence of dengue has recently been observed in more temperate regions [9], such as Nepal [10], indicating a possible expansion of the disease from the subtropics to cooler climates, posing a threat to northern India, Pakistan and their neighbors. Bangladesh is located to the northeast of India and to the south of Nepal, and lies along the Tropic of Cancer. Understanding the patterns of recent dengue outbreaks in Bangladesh may provide greater insight into whether dengue has expanded into the region surrounding northern India, a region with more than 140 million inhabitants.
Dengue fever was first identified in Bangladesh in 1964 [11] and was not initially considered to be a severe threat to public health. However, in 2000, an outbreak occurred, leading to a total of 5,551 reported cases and 93 confirmed deaths [12,13]. The average annual number of dengue cases decreased between 2000 and 2010. However, since then the number of annual dengue cases in Bangladesh has been increasing rapidly. A recent outbreak in 2019 was the largest ever experienced by the country, whereas the second largest outbreak was seen only a year prior, in 2018. Whether or how climate variability may have driven this unprecedented rise in outbreak size in 2018 and 2019 is still largely unknown [14].
Several studies have been carried out to estimate dengue incidence in Bangladesh using climate data prior to 2010 [15][16][17]; however, the driving factors responsible for the increasing disease burden since 2010 remain to be investigated. In these previous studies, temperature and rainfall were found to be significant contributing factors [18][19][20][21]. Previous studies also assumed that the effects of climate variables are independent of the time of year. However, the effects of climate variables can also be time-dependent. Several studies have demonstrated that the effects of rainfall on dengue incidence can vary throughout the year [22,23]. The abundant rainfall that occurs during monsoon season is likely to have negative effects on mosquito population size, as the rain can disrupt potential mosquito habitats. In contrast, rainfall in winter months may result in stagnant bodies of water suitable for mosquito breeding.
Recent studies using data from in or near subtropical areas have reported that dengue incidence and mosquito abundance can be affected by weather conditions up to 5 months before the season starts [22][23][24]. However, most studies focus on climate factors during dengue season. A better understanding of the relationship between climate variability before dengue season and annual incidence can provide insight into whether dengue has been expanding in a region and allow an early warning system to be built.
This study aimed to estimate the effects of climate factors before dengue season on annual incidence in Bangladesh using historical data from 2000 to 2018. We developed a generalized linear model to predict annual dengue cases based on monthly temperature, rainfall and sunshine. We demonstrated that temperature and rainfall have variable effects on dengue incidence depending on the time of year and suggest an ideal temperature range for mosquito population growth based on our findings.

Study location
Bangladesh is a Southeast Asian country, as defined by the World Health Organization (WHO) [25]. India surrounds it to the east, west and north, and Myanmar borders it to the southeast (Fig 1). The Bay of Bengal is located in the south. Bangladesh is located at 20˚59′N to 26˚63′N and 88˚03′E to 92˚67′E. The Tropic of Cancer line is located at 23˚26′N and 88˚47′E, where it crosses Bangladesh from east to west [26].
Bangladesh is located in both tropical and subtropical climate regions. The seasons in Bangladesh can be broadly characterized as summer (March-June; mostly hot and humid), monsoon (June-October; warm and rainy) and winter (October-March; cold and dry). However, March can also be described as the spring, and the duration between mid-October and mid-November can be called the autumn. The maximum temperature ranges from 30˚C to 40˚C during summer, whereas in winter the average temperature reaches as low as 10˚C in most areas of the country. The average annual rainfall ranges between 1,500 mm and 3,000 mm.

PLOS GLOBAL PUBLIC HEALTH
Approximately 70-80% of the annual rainfall occurs during monsoon season [27]. The shortest period of sunshine, 5.4-5.8 hours per day, also occurs during this season. In contrast, winter and summer have the longest sunshine duration, 8.5-9.1 hours per day [28].

Dengue data
Dengue cases observed in health facilities across the country are generally reported to the Directorate General of Health Services (DGHS), and are classified into suspected, probable and confirmed cases. Individuals having acute febrile illness with or without non-specific signs and symptoms are classified as suspected cases and those having acute febrile illness with serological diagnosis are considered probable cases. The confirmed cases should have an acute febrile illness with positive dengue NS1 antigen or PCR test. Details of dengue case definitions and management are available from the DGHS [29].
The communicable disease control (CDC) unit of the DGHS compiles the reported dengue cases on a daily basis for further circulation. We accessed monthly dengue cases between January 2000 and December 2018 from the DGHS by collaborating with the Institute of Epidemiology, Disease Control and Research.

Climate data
Weather information was monitored and managed by the Bangladesh Meteorological Department (BMD) at 35 distinct weather stations across the country (the locations of the stations are available in Fig 1 in [15]). We collected these weather records from the BMD, including daily mean, minimum and maximum temperature, total and maximum daily rainfall and daily sunshine duration. Temperature and rainfall are measured in degrees Celsius (˚C) and millimeters (mm), respectively, whereas sunshine duration is recorded in hours. Daily information was averaged for each month to obtain monthly information for each station. The national averages for monthly temperature, sunshine duration and rainfall were obtained by averaging the values of all 35 weather stations.
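The two-step averaging described above (daily values to a monthly mean per station, then a mean across stations) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `monthly_national_means`, the record layout and the toy values are assumptions introduced here.

```python
from collections import defaultdict
from statistics import mean


def monthly_national_means(daily_records):
    """Aggregate (station, year, month, value) station-day records.

    Step 1: average the daily values within each station-month.
    Step 2: average those station-level monthly means across stations.
    Returns a dict {(year, month): national mean}.
    """
    per_station = defaultdict(list)
    for station, year, month, value in daily_records:
        per_station[(station, year, month)].append(value)

    per_month = defaultdict(list)
    for (station, year, month), values in per_station.items():
        per_month[(year, month)].append(mean(values))

    return {key: mean(station_means) for key, station_means in per_month.items()}


# Toy data (invented): two stations, January 2000, minimum temperature in ˚C.
records = [
    ("A", 2000, 1, 12.0), ("A", 2000, 1, 14.0),
    ("B", 2000, 1, 16.0), ("B", 2000, 1, 18.0), ("B", 2000, 1, 20.0),
]
national = monthly_national_means(records)
```

Averaging station means, rather than pooling all station-days, gives each station equal weight even when stations report different numbers of days.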

Model formulation
Following previous approaches, we modeled annual dengue incidence using quasi-Poisson regression. To deal with potential overdispersion, the quasi-Poisson model and the corrected quasi-Akaike information criterion (QAICc) were used. QAICc has been frequently adopted to assess goodness of fit when modeling overdispersed count data with quasi-Poisson regression in biological and ecological studies [30]. Another approach is to adopt a negative binomial model. Hence, we adopted QAICc with the quasi-Poisson regression model and also evaluated whether a negative binomial regression model could provide a better prediction result.
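For readers unfamiliar with overdispersion, the quasi-Poisson model assumes that the variance is a constant multiple of the mean. One common estimate of that multiple is the Pearson statistic divided by the residual degrees of freedom. The sketch below assumes this standard estimator; the function name and the toy numbers are illustrative and are not taken from the study.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson estimate of the dispersion: sum((y - mu)^2 / mu) / (n - p).

    observed: observed counts y
    fitted:   fitted means mu from a Poisson-type model (all > 0)
    n_params: number of estimated regression coefficients p
    """
    n = len(observed)
    if n <= n_params:
        raise ValueError("need more observations than parameters")
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / (n - n_params)


# Toy values (invented) for four observations and a two-parameter model.
c_hat = pearson_dispersion([10, 20, 30, 40], [12.0, 18.0, 33.0, 37.0], n_params=2)
```

A value well above 1 indicates more variability than a pure Poisson model allows, which is the situation the quasi-Poisson and negative binomial models are meant to handle.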
We assumed that dengue cases reported in January, February and March belonged to the previous year's dengue outbreak. Hence, annual dengue incidence is defined as the sum of the number of dengue cases from April to December of a given year and from January to March of the following year. Let $y_j$ be the annual dengue cases in the $j$th year ($j = 1, 2, \ldots, 19$) such that $y_j \sim \text{quasi-Poisson}(\lambda_j)$, where $\lambda_j$ represents the expected number of dengue cases in the $j$th year, i.e. $E(y_j) = \lambda_j$. Therefore, the Poisson regression model can be expressed as
$$\log(\lambda_j) = \alpha + \sum_{i=1}^{6} \beta_i T_{ij} + \sum_{i=1}^{6} \eta_i R_{ij} + \sum_{i=4}^{6} \gamma_i S_{ij} \qquad (1)$$
where $\alpha$ is the intercept and $\beta_i$, $\eta_i$ and $\gamma_i$ represent the coefficients of temperature ($T$), rainfall ($R$) and sunshine duration ($S$) in the $i$th month ($i = 1, 2, \ldots, 6$ for temperature and rainfall; $i = 4, 5, 6$ for sunshine duration, which was considered from April to June only). Therefore, 15 predictor variables exist in the full model as shown in Eq (1). Predictors were restricted to these months because we aimed to predict dengue outbreaks, which often begin during the early summer. The final model was selected based on QAICc, which was designed to deal with small samples, and the results of cross-validation (details are given in the Model selection and Model validation sections). It is worth noting that this approach has previously been applied to annual dengue prediction in Asian countries [22,23].
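The definition of annual incidence above assigns January to March cases to the preceding year. A minimal sketch of that bookkeeping is given below; the function name `dengue_year_totals` and the toy counts are hypothetical.

```python
def dengue_year_totals(monthly_cases):
    """Sum monthly counts into dengue years.

    monthly_cases: dict {(calendar_year, month): cases}
    Dengue year Y covers April-December of Y plus January-March of Y + 1,
    so cases from months 1-3 are credited to the previous calendar year.
    """
    totals = {}
    for (year, month), cases in monthly_cases.items():
        dengue_year = year if month >= 4 else year - 1
        totals[dengue_year] = totals.get(dengue_year, 0) + cases
    return totals


# Toy counts (invented).
toy = {(2000, 4): 5, (2000, 12): 7, (2001, 1): 3, (2001, 3): 2, (2001, 4): 10}
totals = dengue_year_totals(toy)
```

Note that the first and last dengue years in any data set are only partially observed under this definition (for example, January to March of the first calendar year falls into the year before the series starts), so they need separate handling.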

Model selection
We compared six different models for predicting annual dengue outbreaks with various combinations of climate variables. Monthly temperatures (minimum, average or maximum) and rainfall (maximum or total) from January to June and sunshine duration from April to June were chosen as potential predictors. Sunshine duration is mainly related to mosquito activity (e.g., biting) and hence to disease transmissibility. As the number of mosquitoes between January and March is low, there is no need to consider sunshine duration during this period. Hence, we considered a shorter window for sunshine duration. On the other hand, temperature and rainfall are involved in mosquito population growth, and a change in growth rate during the early months (e.g. from January to March) can affect population size later. To avoid overlapping climate predictors, we considered a single category of temperature and a single category of rainfall in each model. A corrected version of the Akaike information criterion (AICc) [31] was used to select potential predictor variables. The best prediction model was determined using a two-stage selection approach (Fig 2). In the first stage, variable selection was performed using forward stepwise AICc selection for each of the models. In each step of the stepwise selection, we recorded AICc along with the parameter estimates and the mean squared errors for validation and training to assess the fit of the models (the details of the forward stepwise selection approach are given in the supplementary section). In the final step, each model provides a set of variables with minimum AICc. Among these models, the one with the lowest AICc was selected as the lowest AICc model ($M_L$). In contrast, models satisfying the condition in Eq (2), which is defined relative to the average AICc of all candidate models, were defined as low AICc models ($M_l$).
We included both the lowest and the low AICc models because we aimed to identify the best prediction model using LOOCV among all models with low AICc. In the second stage, the $M_L$ and $M_l$ models were compared using the mean squared errors obtained from leave-one-out cross-validation (LOOCV). The best model was the one with the minimum validation mean squared error ($\mathrm{MSE}_{Va}$) in LOOCV. In addition, the models were compared using QAICc, for which a lower value indicates better predictive performance.
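The small-sample criteria used in the first stage can be written down compactly. The sketch below uses the standard textbook forms of AICc and QAICc; whether the authors counted the dispersion parameter in k is not stated in the text, so that choice is left to the caller here. The function names and numbers are illustrative only.

```python
def aicc(log_lik, k, n):
    """Small-sample corrected AIC: -2 logL + 2k + 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def qaicc(log_lik, k, n, c_hat):
    """Quasi-likelihood analogue: the log-likelihood term is divided by the
    estimated dispersion c_hat. A common convention (assumed, not confirmed
    by the source) is to include the dispersion parameter in k.
    """
    if n - k - 1 <= 0:
        raise ValueError("QAICc is undefined when n <= k + 1")
    return -2.0 * log_lik / c_hat + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


# Invented example: log-likelihood -50, 3 parameters, 19 annual observations.
example_aicc = aicc(-50.0, k=3, n=19)
example_qaicc = qaicc(-50.0, k=3, n=19, c_hat=2.0)
```

With only 19 observations, the correction term grows quickly with k, which is why a forward stepwise search under AICc tends to stop at small models.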

Model validation
Model validation was conducted using LOOCV to check how accurately the models can predict an independent dataset. We considered a predictive model to be good when the difference between the predicted and observed values was small. To perform LOOCV, the data for a specific test year were removed and the model was fitted to the remaining data, which served as a training set. The fitted model was then used to predict the annual dengue cases for the test year. We repeated this procedure for all years from 2000 to 2018. Mean squared errors for the validation set ($\mathrm{MSE}_{Va}$) and for the training set ($\mathrm{MSE}_{Tr}$) were obtained by calculating the difference between the predicted and observed numbers of annual dengue cases in the testing set and the training set, respectively. Next, we checked the mean squared error ratio, i.e. the ratio of the mean squared error of the validation set to that of the training set.
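The leave-one-out procedure can be sketched generically. In the illustration below a one-predictor least-squares line stands in for the quasi-Poisson model purely to keep the example self-contained; the stand-in model, the function names and the toy data are assumptions, not the study's implementation.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line; a stand-in for the regression model being validated."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def predict_line(model, x):
    intercept, slope = model
    return intercept + slope * x


def loocv(xs, ys, fit, predict):
    """Return (MSE_Va, MSE_Tr, MSE_Va / MSE_Tr) from leave-one-out refits."""
    validation_sq, training_mse = [], []
    for j in range(len(ys)):
        train_x = xs[:j] + xs[j + 1:]
        train_y = ys[:j] + ys[j + 1:]
        model = fit(train_x, train_y)
        # Squared error on the single held-out observation.
        validation_sq.append((predict(model, xs[j]) - ys[j]) ** 2)
        # Mean squared error on the observations used for fitting.
        training_mse.append(
            mean([(predict(model, x) - y) ** 2 for x, y in zip(train_x, train_y)])
        )
    mse_va, mse_tr = mean(validation_sq), mean(training_mse)
    return mse_va, mse_tr, mse_va / mse_tr


# Toy data (invented).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
mse_va, mse_tr, ratio = loocv(xs, ys, fit_line, predict_line)
```

A ratio far above 1 means the model fits the training years much better than the held-out year, which is one symptom of overfitting.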
We computed bootstrap confidence intervals for the predicted annual dengue cases in each year. To do this, we simulated 1,000 random samples from a Poisson distribution, using the LOOCV-estimated annual cases ($\lambda$) as the parameter of the distribution. The random numbers were used to refit the model 1,000 times, giving the distribution of the estimated parameters. The lower and upper bounds of the 95% confidence intervals were calculated based on the 2.5% and 97.5% quantiles of these distributions.
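The simulation and quantile steps of such a parametric bootstrap can be sketched as below. The example covers only drawing Poisson counts and taking percentile bounds; the model refit on each simulated data set, which the procedure above also requires, is omitted. The sampler, the function names and the value of `lam_hat` are assumptions made for illustration.

```python
import math
import random


def poisson_draw(lam, rng, chunk=500.0):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    The mean is split into chunks of at most `chunk` so that exp(-chunk)
    does not underflow; a sum of independent Poisson draws is Poisson.
    """
    total = 0
    remaining = lam
    while remaining > 0:
        part = min(remaining, chunk)
        threshold = math.exp(-part)
        k, p = 0, rng.random()
        while p > threshold:
            k += 1
            p *= rng.random()
        total += k
        remaining -= part
    return total


def percentile_interval(samples, lower=0.025, upper=0.975):
    """Percentile bounds taken from the ordered samples."""
    ordered = sorted(samples)
    n = len(ordered)
    return (ordered[math.floor(lower * (n - 1))],
            ordered[math.ceil(upper * (n - 1))])


rng = random.Random(2018)   # fixed seed so the sketch is reproducible
lam_hat = 1500.0            # invented LOOCV-predicted annual case count
draws = [poisson_draw(lam_hat, rng) for _ in range(1000)]
interval = percentile_interval(draws)
```

In a full implementation each simulated set of annual counts would be passed back through the model fit, and the quantiles would be taken over the refitted predictions or parameters rather than over the raw draws.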

Assessment of the effects of climate factors
Interpretation of the estimated model coefficients for a generalized linear model is not as straightforward as it is for an ordinary linear regression model, because the dependent variable y is related to the predictors through a link function, such as the log link used in Poisson regression [32]. Therefore, we calculated the marginal effect at the mean (MEM) to understand the effect of each of the predictor variables separately, using the R package ggeffects [33].
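For a log-link model, a marginal effect at the mean amounts to varying one predictor while holding the others at their sample means and exponentiating the linear predictor. The sketch below illustrates that calculation by hand; it does not reproduce the ggeffects package, and the coefficients, predictor names and data values are invented.

```python
import math
from statistics import mean


def predictions_at_mean(intercept, coefs, data, focal, focal_values):
    """Predicted counts from a log-link model as `focal` varies.

    coefs: dict {predictor name: coefficient}
    data:  dict {predictor name: list of observed values}
    All predictors other than `focal` are fixed at their sample means.
    """
    base = intercept + sum(
        coefs[name] * mean(data[name]) for name in coefs if name != focal
    )
    return [math.exp(base + coefs[focal] * value) for value in focal_values]


# Invented coefficients and data for two predictors:
# April minimum temperature (T4) and February rainfall (R2).
coefs = {"T4": -0.5, "R2": 0.1}
data = {"T4": [22.0, 23.0, 24.0], "R2": [10.0, 20.0, 30.0]}
preds = predictions_at_mean(18.0, coefs, data, focal="T4", focal_values=[22.0, 24.0])
```

Because of the log link, a change of d units in the focal predictor multiplies the predicted count by exp(coefficient × d), regardless of where the other predictors are held.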

Dengue cases in Bangladesh
The number of dengue cases exhibited a decreasing trend from the outbreak in 2000 until 2010 (Fig 3A). After 2010, the number began to increase until a drop in 2014, and then grew again until 2018. Over 5,000 infections were reported in 2000, 2002 and 2016. In contrast, in 2018, over 10,000 cases were reported. Dengue fever occurs primarily between July and November each year (Fig 3B). Therefore, monthly climate predictors were selected prior to July to predict the overall annual incidence.

Climate variabilities
Different combinations of temperature (monthly average, maximum or minimum), rainfall (monthly total or maximum) and sunshine duration were used to predict annual dengue incidence. The monthly minimum temperature in Bangladesh increased after January and continued to increase until June/July, with the highest temperature of 26.4˚C measured in 2010 and 2014 (Fig 4). However, large variations in temperature were observed between March and June. Rainfall increased between April and October (S1 Fig). The highest amount of rainfall usually occurred between May and August, with low levels of rainfall recorded in January, February and December. Sunshine duration from January to April and in December exhibited a decreasing trend over the years, an indication of warmer winters (S1 Fig). Longer sunshine duration was mostly observed in the period from April to June.

Model selection and annual dengue prediction
To obtain an appropriate set of predictors for annual dengue prediction, we compared six models with different combinations of climate variables (Table 1). The best prediction model was determined using a two-stage model selection approach. In the first stage, Model 3 and Model 6, which belonged to either the low or the lowest AICc models, were chosen based on the criteria defined in Eq (2). For details of the stepwise AICc results, please refer to S2-S7 Tables. In the second stage, LOOCV was conducted for the selected models. After LOOCV was performed, Model 3 was identified as the best prediction model, with the lowest mean squared error for the validation set, compared with Model 6 (0.29 vs 0.31; see Table 2). Model 3 was

Marginal effect of climate predictors
To check the impact of each climate variable individually on annual dengue incidence, we further assessed the marginal effects of the climate predictors. The optimal minimum temperature for mosquito population expansion is around 21-23˚C (Fig 6). There was an upward trend in temperature from January until June. During this six-month period, the marginal effects of mean minimum temperature from January to March were positive. In contrast, the effects were negative from April to June. Thus, the results indicate that the turning point of the marginal effects was located between 21 and 23˚C. In April, starting from 2,500 predicted cases at a temperature of 23˚C, the number of predicted cases gradually declined to below 1,000 when the temperature increased by two degrees. Similar patterns were also evident in May and June.

Rainfall also had different effects depending on the time of year. In February, April and June, rainfall had a positive relationship with dengue incidence (Fig 7). In contrast, rainfall in a cooler winter period (January) had a negative association with dengue incidence.
The mean sunshine durations in April and May were 7.4 and 6.5 hours, respectively, and were negatively associated with annual dengue incidence (Fig 8). Comparing the magnitudes of all climate predictors, minimum temperature in June (T6), sunshine duration in May (S5) and rainfall in February (R2) had the strongest effects on annual dengue incidence.

Discussion
Climate change poses a great threat to global health, particularly in subtropical and tropical climate regions, due to the expansion of dengue fever. Since 2010, the total number of dengue cases in Bangladesh has increased during each seasonal epidemic, except in 2014. In 2018, the recorded number of confirmed and suspected cases was more than 10,000, including 26 confirmed deaths [3,34]. Although many studies have attempted to estimate dengue incidence in Bangladesh using climate data, most of these studies focused on data collected prior to 2010 [15][16][17]35]. We developed a model to estimate the impact of various climate factors in the lead-up to dengue seasons, which typically occur during the monsoon season in Bangladesh. This is the first study to demonstrate that climate variability before dengue season can explain dengue expansions in Bangladesh in the past 20 years, suggesting that an early warning system can be built for this area.
Dengue fever has been increasing the health burden worldwide, including in South and Southeast Asian countries. Previous studies have been conducted in these regions to investigate the effects of climate change on the spread of dengue fever [35][36][37][38][39][40]. In these studies, temperature, rainfall and humidity were the commonly used climate predictors that can possibly influence dengue outbreaks. These studies showed an overall effect of particular climate variables throughout the year. In contrast, our study shows that the impact of climate variables depends on the time of year, and our model is capable of predicting dengue cases before the peak season starts. This could help to mitigate upcoming severe outbreaks in Bangladesh by allowing sufficient time for preparedness.
Our study suggests that some climate factors might exert opposing effects on the annual number of dengue cases, depending on the time of year. For example, minimum temperatures from January to March were positively associated with dengue cases, whereas a negative association was seen in the subsequent months from April to June, closer to the start of dengue season. This may be due to a complex dependency between the population dynamics of the dengue vector and the changing environment, such as the seasonal transition from winter to summer and the associated increasing temperatures. One study claimed that temperatures of 21.3-34˚C are optimal for expansion of Aedes aegypti populations [21]. As the average daily minimum temperature is lower than 21˚C before April and higher than 23˚C during and after April, our results suggest an optimal daily minimum temperature in the range of 21-23˚C.
Total rainfall in a winter month (January) was found to have a negative relationship with dengue cases, whereas a positive relationship was seen for later months mainly in summer such as April and June. Rainfall is thought to have both beneficial and harmful effects on mosquito population growth. Rainfall can provide standing water for mosquito breeding. However, an excessive amount of rainfall (e.g., heavy rainfall during monsoons or cyclones) has been commonly thought to be able to disrupt potential mosquito habitats [41]. Our study has identified a negative relationship between rainfall and dengue incidence in January. This may be because during this period the number of adult mosquitoes is low, meaning that rainfall has a larger negative impact by flushing away mosquito eggs than a positive impact due to creating habitats required by adult mosquitoes. A similar pattern of a negative association of pre-dengue-season rainfall with dengue cases has been seen in recent studies [22][23][24].
Additionally, sunshine duration was found to be closely linked to mosquito-related activities, such as frequency of mosquito bites [42]. However, sunshine duration has not been included in any prediction models thus far. Therefore, an evaluation of the increasing dengue incidence since 2010, with respect to these climate variables, is warranted. A shorter duration of sunshine is more favorable for dengue transmission. In general, mosquitoes are more active in darker environments, and there is a greater chance of dengue being transmitted during periods of less sunshine due to the increasing frequency of mosquito bites [43]. The marginal effect of sunshine duration in Fig 8 reveals that the shorter the duration of sunlight, the higher the number of dengue cases, supporting the biological characteristics of mosquito activity as described by [44]. A 2-hour reduction in sunshine duration in April and May was predicted to result in a 3-fold increase in annual cases. These estimates are consistent with a previous study that found a negative association between sunshine duration and dengue incidence [43].
While this study has implications, it also has some limitations. First, meteorological data for 2019 had not been disclosed at the time of this study; hence, 2019 data were excluded from the models. Second, the small number of outcome values (19 data points) and the relatively large number of predictor variables might lead to overfitting. The predictions for 2002 and 2007 might be a consequence of this. Moreover, the epidemiological data include both laboratory-confirmed and probable cases; hence, the estimates might be affected by under- or over-reporting bias. Finally, because we aimed to estimate the effects of climate variability before dengue season on annual incidence, we did not include monthly predictors within dengue season.
In conclusion, our research offers a potential alert system by modeling annual dengue outbreaks before the season begins using climate variables. As an early warning system is required to improve public health and safety, the model we developed may improve disease control systems in Bangladesh. This research will aid our understanding of the effects of climate variability on dengue expansion, not only in Bangladesh but also in northern India and other Southeast Asian countries with similar climates and socio-economic conditions.

Supporting information
S1 Table. Variance inflation factor (VIF) for the predictors used in the best prediction model: Monthly minimum temperature, monthly sunshine duration, and monthly total rainfall. (PDF)
S2 Table. (Model 1) Step-by-step forward selection results of the generalized Poisson regression model for each step based on AICc. ave.T_i, S_i and tot.R_i represent mean temperature, sunshine duration and total rainfall in the ith month. For each of the variables included in the model, the corresponding AICc, the leave-one-out mean squared error for the validation set (MSE_Va), the leave-one-out mean squared error for the training set (MSE_Tr), and the mean
S8 Table. Leave-one-out cross-validation (LOOCV) results for Model 6. Achieved by omitting the jth year in the jth iteration, where j = 1, . . ., 19, and j = 1 indicates the year 2000, j = 2 indicates 2001, etc. Italic font denotes the predicted annual dengue cases when the jth year is removed. (PDF)
S9 Table. Leave-one-out cross-validation (LOOCV) results for Model 3. Achieved by omitting the jth year in the jth iteration, where j = 1, . . ., 19, and j = 1 indicates the year 2000, j = 2 indicates 2001, etc. Italic font denotes the predicted annual dengue cases when the jth year is removed. (PDF)
S10 Table. Parameter estimates of the best prediction model based on quasi-Poisson regression. SD represents the standard deviation of the estimate of each predictor. Asterisks in the p-value column indicate that the predictors are significant at certain levels (i.e. ** = 0.001; * = 0.01). (PDF)
S11 Table. Comparison between observed and predicted annual dengue cases in Bangladesh between 2000 and 2018. The lower and upper boundaries represent the lower and upper limits of the 95% bootstrap confidence interval, respectively, for the predicted value. (PDF)
S12 Table. Comparison of negative binomial regression models based on AICc. (PDF)
S13 Table. Comparison of the validation results among the best fitting models in negative binomial regression. (PDF)
S1 Data.