Abstract
Malaria in pregnancy (MIP) remains a global health challenge, affecting approximately 40% of pregnant women. Despite malaria control efforts by the Nigerian Government and its partners, regional disparities in health outcomes and malaria incidence trends among pregnant women remain under-studied. The objectives of this study were to assess MIP variability relative to general malaria cases and to forecast short-term MIP incidence over two years. To this end, MIP variability across Nigeria from January 2015 to January 2025 was analyzed using wavelet coherence and patterns of transmission cycles, and ARIMA and SARIMAX models were compared to select the best modelling approach for assessing temporal trends before forecasting short-term MIP incidence. Findings showed significant regional variability, with Cross River peaking in 2017 and 2019, while Enugu recorded its lowest trough in 2017. Malaria peaks in southern states remained lower than troughs in northern regions. Strong cross-correlations between MIP and general malaria transmission cycles were observed in Kebbi, Niger, Yobe, and Ondo, indicating persistent trends, while South-South and South-East exhibited weaker correlations, likely due to intervention fluctuations. SARIMAX models captured MIP trends more effectively, except in Kebbi, where ARIMA fit better, and in Niger, where SARIMAX exaggerated forecasts due to sensitivity to exogenous variables. Thus, SARIMAX was adopted for Cross River, Enugu, Ondo, and Yobe, while ARIMA was used for Kebbi and Niger States. Cross River and Enugu exhibited intervention-driven malaria fluctuations, whereas Ondo, Niger, and Yobe displayed unstable or cyclical trends, reinforcing the importance of climate-sensitive forecasting models and seasonal interventions for improving malaria prediction accuracy.
South-South and South-East need improved healthcare access, North-Central and North-West require seasonality forecasting, while North-East demands urgent control measures. Targeted malaria interventions are crucial to support the achievement of Nigeria’s National Malaria Elimination Programme (NMEP) goals.
Citation: Oniyelu DO, Folorunsho O, Adewole L, Bakare EA, Okoronkwo C, Eze N, et al. (2025) Time series analysis of malaria in pregnancy, using wavelet and SARIMAX models. PLoS One 20(8): e0328888. https://doi.org/10.1371/journal.pone.0328888
Editor: Clement Ameh Yaro, University of Uyo, NIGERIA
Received: March 17, 2025; Accepted: July 8, 2025; Published: August 6, 2025
Copyright: © 2025 Oluwaseun Oniyelu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data for this study are publicly available from the figshare repository (https://doi.org/10.6084/m9.figshare.29145854.v1) and the GitHub repository (https://github.com/oniyelu/Timeseries.git).
Funding: This study was funded by a grant from the BMGF (INV-047051). The funders had no role or influence on the design and interpretation of the data collected, as well as in writing the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Malaria is endemic in Sub-Saharan Africa, where its prevalence is high and Nigeria carries the highest burden [1,2]. It is a parasitic disease transmitted by infected female Anopheles mosquitoes during their blood meal. Transmission is common in endemic regions because of favourable climatic conditions, with Plasmodium falciparum being the predominant parasite species, particularly in Nigeria [3]. According to [4], the possibility of exposure to malaria in pregnancy (MIP) worldwide is about 40%. The World Health Organization (WHO) recommends interventions including intermittent preventive treatment with sulfadoxine-pyrimethamine (IPTp-SP) and long-lasting insecticidal nets (LLINs) for the control of MIP [5,6]. In Nigeria, the last decade of malaria control has witnessed a large increase in effort by the Government and its partners towards scaling up key interventions such as mass campaigns for the replacement of insecticide-treated nets (ITNs), intermittent preventive treatment of malaria during pregnancy (IPTp) and malaria case management [7]. Malaria remains a major public health challenge, with Nigeria accounting for nearly 27% of the global malaria burden [8,9]. Pregnant women remain one of the most vulnerable populations, despite substantial efforts to reduce the prevalence of malaria among them [7]. Malaria incidence varies with the differing impact of malaria and the peculiarities of different geographic locations, so elimination strategies must be specific to the different regions of the country [10]. Nigeria has a varied climate and topography, with uplands (600 to 1,300 meters in the North Central Zone), lowlands (less than 20 meters in the coastal areas) and highlands in the eastern parts [11].
The Nigerian Government began deploying IPTp in 2005 [13] and LLINs in 2009 [12]. According to the 2021 Nigeria Malaria Indicator Survey (MIS), 50% of pregnant women aged 15–49 slept under an ITN the night before the survey [7], and 31% of women aged 15–49 with a live birth in the 2 years preceding the survey reported taking three or more doses of sulfadoxine-pyrimethamine (SP) during their last pregnancy [7]. According to [14], more than 120 million pregnancies occurred in malaria-endemic parts of the world from 2010 to 2020, with annual infant mortality ranging between 75,000 and 200,000. Pregnant women who took SP two or three times recorded a lower incidence of malaria in their fetuses than those who took it only once [15,16]. The MIS reported that the usage of IPTp-SP among the vulnerable population of pregnant women improved from 34% to 50% in 2021 [7]. Although several studies in some States of the country have examined efforts to reduce malaria, particularly during pregnancy [17–21], little is known about the role that regional variations play in the health outcomes of MIP incidence or about its temporal pattern over the years in the different regions. MIP is strongly associated with low birth weight in infants due to prematurity, intrauterine growth restriction (IUGR), maternal anemia and elevated neonatal mortality [22,23].
These factors increase the risk of neonatal morbidity and mortality, highlighting the urgent need to monitor and evaluate the progress made on prevention strategies. While malaria control efforts by the Nigerian Government and international health partners have led to improvements, regional disparities in MIP incidence and health outcomes remain insufficiently studied. Although many studies have examined general malaria prevalence [24], few have explicitly looked at the regional and temporal patterns of MIP throughout Nigeria. Studies on the efficacy of IPTp coverage and vector control initiatives have not been streamlined to MIP transmission and do not examine how MIP incidence varies [25,26]. This gap makes it difficult to design region-specific interventions that mitigate MIP cases in high-risk zones, and to sustain targeted interventions and continued evaluation of the effect of malaria on this vulnerable group. There is therefore a need to carry out a pre-analysis of the trends and to understand the variability in the patterns of MIP incidence within the 6 regions of Nigeria. The geo-political regions are grouped as: South-Southern, South-Eastern, South-Western, North-Western, North-Central and North-Eastern Nigeria. In this study, each region is represented by a State with one of the highest malaria burdens in that region, based on the MIS 2021 report. The study analyzes the variability of MIP incidence across these regions using time series methods and compares its trends with general malaria incidence. Future MIP incidence is also forecast, using wavelet analysis to denoise the data and selecting the best modelling approach by comparing ARIMA and SARIMAX models. Translating coherence cycles into real-world malaria transmission periodicity provides insight into seasonal malaria trends and the role of exogenous factors such as pandemic disruption or interventions.
Understanding short-term fluctuations and seasonality will help refine policy approaches for targeted malaria control programs, ensuring region-specific strategies that align with Nigeria’s National Malaria Elimination Programme (NMEP) goals.
A time series is made up of records or data points collected over a certain time range [27]; datasets with a timestamp index can be analyzed to obtain the pattern or trend inherent in the data [28]. Analyzed time series data can be used to forecast or predict other data points, which can be of great importance in making early decisions for improved health outcomes [29]. Univariate or multivariate time series analysis can be approached in the time domain or the frequency domain [30–32]. Artificial Intelligence (AI) techniques are needed in the public health sector [33], and there is growing acceptance of Machine Learning (ML) for identifying life-threatening diseases early to improve patient survival rates [34–39]. This is demonstrated in the study of [40], which presented AI-inspired techniques to forecast COVID-19 in real time and to forecast the incidence in various provinces. The literature has used diverse approaches, including ML and simple statistics-based time series, as forecasting or predictive models [34,41–44]; however, these algorithms are not seen as easily interpretable, and medical practitioners are skeptical about the forecast or predicted outcomes of such models [45]. There is therefore a need for interpretable models or an eXplainable Artificial Intelligence (XAI) modelling approach. XAI refers to efforts to make understandable the inner processes an AI follows to arrive at a model’s outcome [46]. To achieve this, interpretable time series analysis tools, namely ARIMA [47], SARIMAX [48] and wavelet models [49], were used in this study to carry out the pre-analysis of MIP trends and to understand the variability inherent in the patterns of the various regions in Nigeria.
Time series analysis in the time domain can be achieved using either autocorrelation analysis on univariate data (such as the Autoregressive Integrated Moving Average, ARIMA, model) or cross-correlation analysis on multivariate datasets [50–52]. ARIMA models are based on a combination of past trends (the autoregressive component), past forecast errors (the moving average component), and past differencing (the integrated component) of the series in the various regions [53]. This makes the model’s operation and prediction-making process easier to comprehend. Time series analysis in the frequency domain can be achieved using either the spectrum or the wavelet transform [54,55]. Wavelet analysis can be considered an interpretable model that provides a clear interpretation in both the spatial and frequency domains, making it suitable for many applications [56]. It can be used to denoise time series data and to view the inherent characteristics of the data in the frequency domain [57,58]. Data denoising is important because of the varying random noise and seasonal changes that can be present in a time series [59].
In the work of [60], a hybrid methodology was applied to a COVID-19 deaths dataset, using discrete wavelet decomposition to split the input data into component series and then using ARIMA to make predictions. The prediction error was compared with that of a standalone ARIMA model, and the hybrid wavelet-ARIMA model gave the more accurate result. The limitation of that work, however, lies in its use of the Daubechies wavelet, whose strength is limited to noise reduction and data compression. The work of [61] studied time series forecasting using two hybridization techniques that integrated wavelet ARIMA and GARCH, with the discrete Fourier transform used for ARIMA pre-processing. This enhanced the ARIMA forecast by exploiting each model’s capacity to identify the linear and non-linear patterns in a time series. The authors pointed out the need to further assess the effectiveness of this method in other fields. The work of [62] separated feature sets with and without the Discrete Wavelet Transform (DWT) to compare deep learning models intended to predict the daily number of COVID-19 incidence and deaths for 183 countries. The results from a homogeneous architecture comprising multiple LSTM (Long Short-Term Memory) layers and a hybrid architecture merging multiple CNN (Convolutional Neural Network) layers with multiple LSTM layers showed a statistically significant difference between the models’ performances for the prediction of both deaths and confirmed incidence (p-value <0.001). The CNN+LSTM model performed similarly when wavelet coefficients were included as extra features (DWT+CNN+LSTM), indicating the promise of wavelets as an optimization tool. To forecast short-term trends in confirmed COVID-19 cases across several locations, including the US, Asia, Europe, Africa, and others, [63] used the ARIMA and ARIMAX models.
The study also looked at the connection between vaccination rates and the number of new cases, as well as the effects of socioeconomic variables such as GDP per capita and healthcare resources on COVID-19 incidence rates in various nations. The authors noted, however, the need for further research on the importance of region-specific strategies in understanding the variability in outcomes across different regions. In [64], the yearly and monthly vaccination patterns of regular childhood immunization programs were assessed while taking the COVID-19 pandemic disruption into account. An ARIMA model was used for predictive modelling to show how vaccination rates varied by area and how seasonal variations appeared in monthly vaccination rates, with the Bacillus Calmette-Guérin (BCG) vaccine having the most consistent pattern. The authors gained insight into the dynamics of childhood vaccination, although the lack of historical data (covering only 2016–2018) limited their observation of long-term patterns.
To further validate the use of the hybrid wavelet-ARIMA model, it is important to carry out a pre-analysis that combines the strengths of trend filtering and cross-correlation on time series data while also denoising with the Morlet wavelet transform. This was done in this study on long-term historical observations of MIP cases (January 2015–January 2025) to better understand the nature of the trend and the varying patterns of MIP in the regions of Nigeria. The Morlet wavelet, whose strength lies in decomposing a time series into the time-frequency domain, was therefore proposed to gain better insights from the MIP time series. A hybrid wavelet model was applied with the ARIMA/SARIMAX model to analyse the data and improve forecast results and accuracy. This study addressed three questions: i) what the long-term trend in MIP was from January 2015 to January 2025 in each region of Nigeria, ii) how the variability of MIP incidence over time compared with that of general malaria incidence (MC) in each region, and iii) how short-term trends of MIP incidence in the coming months can best be forecast. The aim of the study was to carry out a time series pre-analysis of MIP incidence in Nigeria (Jan 2015–Jan 2025) in comparison with general MC, and to provide a 2-year forecast of possible future trends, using a hybrid of wavelet and ARIMA/SARIMAX models. This was carried out in six of the highest-burden States, each representing one of the six (6) geo-political regions of Nigeria: South-South (Cross River State), South-East (Enugu State), South-West (Ondo State), North-West (Kebbi State), North-Central (Niger State), and North-East (Yobe State).
The specific objectives were to: i) carry out pre-analysis and visualization of the long-term trends of MIP and MC, ii) analyze the variability of MIP incidence over time compared with general MC incidence, and iii) forecast the short-term trend of MIP incidence over 2 years. This would provide valuable insights into the underlying transmission pattern of malaria within the population of pregnant women. Generating these forecasts would support understanding of past malaria trends in the country and inform the planning, preparedness and decision making of Nigeria’s public health sector in deploying control measures to reduce the further spread of malaria in Nigeria.
Wavelet techniques have been used effectively in many fields. Building on the strengths of wavelets and ARIMA as a hybrid model [44,61], this study pre-analyses MIP incidence data for the regions of Nigeria, represented by six states. The wavelet-ARIMA hybrid combines the strengths of both approaches by using the wavelet transform to enhance noise reduction and improve prediction accuracy. Wavelets are used as an optimization tool to enhance the preprocessing of the time series dataset and improve understanding of the seasonal variability present in the historical MC and MIP counts. The ARIMA and wavelet hybrid modelling approach was adopted to ascertain the effectiveness of malaria control efforts in Nigeria, especially in pregnancy, and to understand the temporal pattern in the various regions of the country. It supports the pre-analysis of monthly MIP incidence time series, using 10 years of historical data to gain insight into long-term patterns in six selected hotspot states representative of the six geo-political zones in Nigeria. Accurate monitoring of the progression of malaria would enable strategic planning and ultimately enhance the monitoring and surveillance strategies of public healthcare systems towards the elimination of MIP.
Materials and methods
The time series analysis of malaria transmission trends using ARIMA and wavelets involved pre-analysis of routine MIP case counts and general malaria case counts (described in Fig 1) from the National Malaria Data Repository (NMDR) of the NMEP [65], Nigeria, for the 6 selected states (shown in Table 1). The monthly Malaria confirmed Pregnant Women and Confirmed uncomplicated malaria case counts from the routine data of the NMDR were used to build the time series for each state or local government area. The complete raw datasets, including the monthly MIP and MC time series for each state or local government area, are available upon request [65]. These States represent high-burden States in the 6 geo-political regions of Nigeria, as shown in Fig 2. The figure was created by the author using R, based on openly available administrative boundary data obtained from GADM [66] (https://gadm.org/license.html), which permits academic use; no copyrighted content was used. Other variables present in the dataset include malaria diagnosis status, case management and interventions, LLIN data, etc. The workflow of the time series analysis (as sampled in Fig 3) comprised three steps. First, the long-term trends of MIP and MC were pre-analysed and visualized. Second, the variability of MIP incidence over time was compared with that of general MC incidence by i) analysing MIP incidence across the seasons of the year relative to MC and ii) cross-correlation analysis of the changes in MIP and MC over time. Lastly, a forecast of the short-term trend of MIP incidence was generated from the selected ARIMA or SARIMAX model. The data analysis was carried out using the R Statistical Programming Software. Author-generated code is available at https://github.com/oniyelu/Timeseries.git.
Source: Figure generated by the author using shapefiles and geospatial data with R programming, based on country boundary data from GADM, freely available for academic use (https://gadm.org/license.html).
Pre-analysis and visualization of MIP and MC long-term trend
The confirmed uncomplicated malaria and malaria confirmed pregnant women count data, representing MC cases and MIP cases respectively, were the data features selected for time series analysis. These features were preprocessed by computing basic statistics to understand the distribution of the malaria case values, to minimize the presence and effect of outliers, and to resolve issues associated with missing values. Missing values were apparent at first glance, so the Shapiro-Wilk test was carried out on each time series to determine whether it was normally distributed and thereby guide the choice of imputation method. A square root transformation was used to help linearize the data and to manage the effects of outliers and scaling by stabilizing the variance, which is helpful for data transformation and regression models [67]. Basic statistics (minimum, maximum and mean) were also generated to understand the variance in the data distribution.
The square root transformation preserved important patterns and correlations in the data while avoiding over-smoothing. It also helped reduce skewness and make the data more normally distributed. The malaria cases data consisted of discrete counts of MIP and MC cases. Missing data (as summarized in Table 2) were resolved by applying Kalman smoothing imputation for each LGA per year to fill in the missing data of each state. This is a robust way to smooth the values, because the underlying state-space model takes spatial correlations, seasonal effects, and temporal dependencies into consideration when imputing missing values. The algorithm followed to carry out the Kalman smoothing is as follows:
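The transformation and summary step can be illustrated with a minimal sketch. It is written in Python using only the standard library, whereas the study's own analysis was done in R; the function names and the example counts below are hypothetical and are not taken from the NMDR data.

```python
import math
import statistics

def sqrt_transform(counts):
    """Square-root transform of non-negative counts; None marks a missing month."""
    return [None if c is None else math.sqrt(c) for c in counts]

def back_transform(values):
    """Invert the square-root transform to return to the count scale."""
    return [None if v is None else v * v for v in values]

def summarize(values):
    """Minimum, maximum, mean and sample variance of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    return {
        "min": min(observed),
        "max": max(observed),
        "mean": statistics.mean(observed),
        "variance": statistics.variance(observed),
    }

# Hypothetical monthly case counts with one missing month and one extreme value.
monthly_cases = [100, 144, None, 400, 2500, 121]
transformed = sqrt_transform(monthly_cases)
summary = summarize(transformed)
```

On the square-root scale the extreme month (2500 cases) becomes 50, which is much closer to the other values than on the raw scale; this is the variance-stabilizing effect described above.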
- The Kalman filter estimates the state $x_t$ of a system at time $t$ using the State Equation (Transition Model):
$$x_t = A x_{t-1} + B u_t + w_t \qquad (1)$$
where: - $A$: models temporal dependencies (such as trends), - $B u_t$: represents external inputs (e.g., seasonal effects), - $w_t$: process noise (assumed Gaussian).
- The Observation Equation (Measurement Model) is given as:
$$y_t = C x_t + v_t \qquad (2)$$
where: - $y_t$: observed data (COVID-19 pandemic effects), - $C$: an observation matrix, - $v_t$: measurement noise.
- The function “build_model()” defines a dynamic process variance (dW), meaning that dW changes according to the impact of disturbances rather than being fixed. Pandemic-induced disruptions, which occurred between March 2020 and December 2020, are also modeled using exponential decay:
(3)
with the assumption that the disruption effects gradually diminish over time from January 2021 to date. Here, dlmFilter applies Kalman filtering, estimating missing values by recursively updating the state estimate, and dlmSmooth refines the predictions by adjusting errors based on historical trends.
The size of the dataset was maintained, which is essential for time series, so records with missing values were not discarded. More data points were needed for improved AI-based analysis of the trends; thus, the monthly records of malaria cases for each State were used at the LGA level. Stationarity was tested using the Augmented Dickey-Fuller (ADF) test [68], which verified whether each time series was stationary, an important prerequisite for time series pre-analysis. The original and preprocessed data were both visualized to show how preprocessing improved the analysis outcomes. After these preprocessing steps, the series were decomposed to understand the long-term trends in MIP and MC cases over the years.
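The filtering and smoothing steps can be sketched for the simplest state-space case. This is a minimal Python illustration under stated assumptions, not the authors' R implementation: it uses a scalar local-level model (temporal dependency and observation matrix both equal to 1, no external input) with fixed variances, whereas the study describes a dynamic process variance. The function name `kalman_impute`, the variance values and the example series are hypothetical.

```python
def kalman_impute(y, process_var=1.0, obs_var=1.0):
    """Fill missing values (None) with a Kalman filter followed by a
    Rauch-Tung-Striebel smoother for the local-level model
    x_t = x_{t-1} + w_t,  y_t = x_t + v_t."""
    n = len(y)
    first_obs = next(v for v in y if v is not None)
    x_pred, p_pred = [0.0] * n, [0.0] * n
    x_filt, p_filt = [0.0] * n, [0.0] * n
    x_prev, p_prev = first_obs, 1e6  # vague prior on the initial level
    # Forward pass: predict, then update only where an observation exists.
    for t in range(n):
        x_pred[t] = x_prev
        p_pred[t] = p_prev + process_var
        if y[t] is None:
            x_filt[t], p_filt[t] = x_pred[t], p_pred[t]
        else:
            gain = p_pred[t] / (p_pred[t] + obs_var)
            x_filt[t] = x_pred[t] + gain * (y[t] - x_pred[t])
            p_filt[t] = (1.0 - gain) * p_pred[t]
        x_prev, p_prev = x_filt[t], p_filt[t]
    # Backward pass: revise each state using information from later months.
    x_smooth = list(x_filt)
    for t in range(n - 2, -1, -1):
        c = p_filt[t] / p_pred[t + 1]
        x_smooth[t] = x_filt[t] + c * (x_smooth[t + 1] - x_pred[t + 1])
    # Keep observed values and replace only the missing ones.
    return [x_smooth[t] if y[t] is None else y[t] for t in range(n)]

# Hypothetical square-root-scale series with one missing month.
series = [10.0, 12.0, None, 16.0, 18.0]
filled = kalman_impute(series, process_var=1.0, obs_var=0.01)
```

Because the smoother uses observations on both sides of the gap, the imputed value lies between its neighbours (close to 14 in this example) rather than simply repeating the last observed value.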
Variability of MIP cases over time compared to general MC cases
The variability of MIP cases over time relative to general MC cases was examined through i) analysis of MIP cases across the seasons of the year compared with MC, and ii) cross-correlation analysis of the changes in MIP and MC over time. Because the trends of MIP and MC unfold across the different seasons of the year, the time series were plotted by season, and this study identified the seasons in which MIP and MC reached their highest peaks. This showed at first glance whether both attained their highest peaks at or close to the same time and whether the season of the year played any role in the outcomes.
Next, bi-wavelet coherence analysis [69] was used to examine the cross-correlation between the MC and MIP time series over the years, that is, how the pattern of MIP cases changed over time in relation to the changes observed in general MC cases. This helped in understanding how the pattern of malaria cases within the population of pregnant women correlates with that of the general population in the time and frequency domains. By visualizing the biwavelet analysis as heatmaps in the time and frequency domains, the common behaviour and time-localized patterns shared by the two time series were identified for the various states. Wavelet coherence makes the variation of frequency content over time easier to understand because it quantifies the correlation between two time series in the time-frequency domain [70].
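As a simpler time-domain companion to wavelet coherence, the lead or lag between two series can be inspected with a lagged Pearson correlation. The sketch below is a stand-in chosen for illustration, not the bi-wavelet coherence method used in the study; the function names and the synthetic series are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def cross_correlation(x, y, max_lag):
    """Correlation of x[t] with y[t + lag] for lag = -max_lag..max_lag.
    A positive best lag means y follows x by that many time steps."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        result[lag] = pearson(a, b)
    return result

# Synthetic example: a 12-month cycle and a copy that lags it by 2 months.
general_cases = [math.sin(2 * math.pi * t / 12) for t in range(60)]
pregnancy_cases = [math.sin(2 * math.pi * (t - 2) / 12) for t in range(60)]
ccf = cross_correlation(general_cases, pregnancy_cases, max_lag=4)
best_lag = max(ccf, key=ccf.get)
```

Unlike wavelet coherence, this measure gives a single correlation per lag for the whole record and cannot show how the relationship changes over time or across periodicities.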
Forecast of future trends of MIP cases
To develop the best ARIMA and SARIMAX models for forecasting MIP cases in each region, wavelet analysis using the Morlet wavelet transform was first used to carry out denoising. After denoising, the reconstructed (denoised) data were visualized alongside the original data before further analysis. The denoised data were then used for parameter estimation to determine the initial parameter values of the ARIMA model, which was further processed with the original MIP data to arrive at the best ARIMA and SARIMAX model for each region. A 2-year forecast of future MIP cases was generated from the best-fitting model after evaluation analyses.
Wavelet time series analysis.
Wavelet techniques come in a variety of forms, each with characteristics that suit it to particular time series analysis applications [71]. The Fourier Transform (FT) is a mathematical method that decomposes a time-domain signal into its component frequencies, expressing a function as a sum of sine and cosine waves [72]. This is helpful when examining the frequency content of signals. An alternative to the FT that offers both time and frequency information is the Wavelet Transform (WT), which breaks a signal down into wavelets that are localized in both frequency and duration [73]. Common wavelet techniques include the following. The simplest wavelet is the Haar wavelet, which looks like a step function [74]. The Daubechies wavelets, created by Daubechies, are widely employed in signal processing, particularly for noise reduction and data compression, because of their compact support and orthonormality [49]. The Morlet wavelet uses a Gaussian window, is often used in time-frequency analysis, and offers a good balance between time and frequency localization [49]. The Mexican Hat wavelet, on the other hand, is the negative normalized second derivative of the Gaussian function [49]. Because the Morlet wavelet is well suited to time-frequency analysis, it was adopted for the analysis in this study. The algorithm followed in analysing the data involves the following steps:
- Define and set up the parameters (the change in malaria case values, the number of scales (i), which can be changed as needed, and the angular frequency). Next, define the scale-adjusted Morlet wavelet function:
(4)
where t is time and s is the scale.
- Next, set the scales in a loop, incrementing j by 2 until the end of the loop:
(5)
where $s_j$ represents the scales.
- Initialize the wavelet coefficients matrix:
(6)
for all j and k, where W is the wavelet coefficients matrix.
- Use the adjusted scales to calculate the wavelet coefficients:
(7)
where x(t) is the signal.
- Compute the wavelet power spectrum:
(8)
- Convert scale to Fourier period:
(9)
- Take the real part of the coefficient matrix and reconstruct the denoised time series:
(10)
where the result is the denoised time series.
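The steps above can be sketched in pure Python. This is an illustrative sketch, not the authors' R code: it assumes the widely used Torrence and Compo (1998) conventions for the Morlet wavelet (angular frequency 6, power-of-two scale spacing, reconstruction constant 0.776), which may differ from the study's equations (4) to (10). The function names, scale grid, 4-month cut-off and synthetic signal are all hypothetical.

```python
import cmath
import math

OMEGA0 = 6.0     # assumed Morlet angular frequency
C_DELTA = 0.776  # reconstruction constant for OMEGA0 = 6 (Torrence & Compo, 1998)

def morlet(eta):
    """Morlet mother wavelet at dimensionless time eta = t / s."""
    return math.pi ** -0.25 * cmath.exp(1j * OMEGA0 * eta) * math.exp(-0.5 * eta * eta)

def fourier_period(scale):
    """Convert a Morlet scale to its equivalent Fourier period."""
    return 4.0 * math.pi * scale / (OMEGA0 + math.sqrt(2.0 + OMEGA0 ** 2))

def make_scales(n_scales, s0=2.0, dj=0.25):
    """Scales s_j = s0 * 2^(j * dj)."""
    return [s0 * 2.0 ** (j * dj) for j in range(n_scales)]

def cwt(x, scales, dt=1.0):
    """Wavelet coefficients W[j][k] of the mean-removed signal (direct summation)."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    coeffs = []
    for s in scales:
        norm = math.sqrt(dt / s)
        # Conjugated, scaled wavelet for every possible offset m = t - k.
        kernel = {m: norm * morlet(m * dt / s).conjugate() for m in range(-(n - 1), n)}
        coeffs.append([sum(centred[t] * kernel[t - k] for t in range(n)) for k in range(n)])
    return coeffs

def power_spectrum(coeffs):
    """Wavelet power |W|^2 for every scale and time point."""
    return [[abs(w) ** 2 for w in row] for row in coeffs]

def reconstruct(coeffs, scales, mean, keep, dj=0.25, dt=1.0):
    """Rebuild a series from the real parts of the coefficients at the kept scales."""
    factor = dj * math.sqrt(dt) / (C_DELTA * math.pi ** -0.25)
    n = len(coeffs[0])
    return [
        mean + factor * sum(coeffs[j][k].real / math.sqrt(scales[j])
                            for j in range(len(scales)) if keep[j])
        for k in range(n)
    ]

# Synthetic monthly signal: a 12-month cycle plus a fast 2-month oscillation as "noise".
clean = [math.sin(2 * math.pi * t / 12) for t in range(120)]
noisy = [c + 0.3 * (-1) ** t for t, c in enumerate(clean)]

scales = make_scales(17)
coeffs = cwt(noisy, scales)
power = power_spectrum(coeffs)
mean_power = [sum(row) / len(row) for row in power]
dominant_period = fourier_period(scales[mean_power.index(max(mean_power))])

# Denoise by dropping scales whose Fourier period is shorter than 4 months.
keep = [fourier_period(s) >= 4.0 for s in scales]
denoised = reconstruct(coeffs, scales, sum(noisy) / len(noisy), keep)
```

Direct summation is used here for clarity. For longer series an FFT-based convolution is the usual choice, and values near the start and end of the record are affected by edge effects and should be interpreted with caution.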
ARIMA model for time series analysis and forecasting.
For forecasting to be effective, non-stationary data must be converted into stationary data. This was done by differencing, which is the “Integrated” component of the ARIMA and SARIMAX models applied to the time series. Once the series was stationary, the ARIMA model was trained. This involved specifying the orders of the autoregressive (AR), differencing (I), and moving average (MA) components. Residual checks were used to show whether the models failed to fully capture the variance structure (i.e., conditional heteroskedasticity) or the serial correlation in the data. If present, either would violate a key ARIMA assumption (white noise residuals) and could limit the reliability of the forecasts. To account for the seasonality present in the data and for other exogenous factors, the SARIMAX model was also adopted so that both models could be compared in analysing the data. Autocorrelation and heteroskedasticity that ARIMA failed to capture, which could be due to seasonality or other factors, were further analysed with the seasonal model (SARIMAX) using seasonal differencing and seasonal AR/MA terms.
The SARIMAX (Seasonal AutoRegressive Integrated Moving Average with eXogenous variables) model, which extends seasonal ARIMA (SARIMA) by incorporating external factors (exogenous regressors), was analysed. COVID-19 pandemic disruption deduced from the data was one of the exogenous factors considered and was eventually used. IPTp coverage data was another exogenous factor initially considered, but the IPTp data from DHIS were yearly point data. Although State-level IPTp data were available, aggregating the MIP case data to a yearly scale to match them before forecasting was not appropriate. In addition, LGA-level IPTp coverage data for each State were not accessible; thus IPTp was not used as an exogenous factor. Further analysis and decision-making were supported by combining wavelet analysis with SARIMAX to improve forecasting and the analysis of the underlying patterns and the fit of the model to the time series data.
The reconstructed wavelet time series was used by the algorithm within the given model, ARIMA(p, d, q). The objective function was evaluated repeatedly to identify the optimal model by minimization, and forecasts were finally generated with 95% confidence intervals. Once the optimal ARIMA and SARIMAX model fits had been obtained, the models’ coefficients were used as starting points for model optimization through the application of the Kalman filter [48]. Candidate models were ranked by prediction error measures, including the Mean Absolute Percentage Error (MAPE) and the Root Mean Squared Error (RMSE).
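Regular and seasonal differencing can be illustrated with a short Python sketch. The function name `difference` and the synthetic series are hypothetical; the example only shows how a linear trend and a repeating 12-month pattern are removed, and it is not the authors' R code.

```python
def difference(series, lag=1, order=1):
    """Apply y_t - y_(t-lag) `order` times.
    lag=1 is regular differencing; lag=12 is seasonal differencing for monthly data."""
    out = list(series)
    for _ in range(order):
        out = [out[t] - out[t - lag] for t in range(lag, len(out))]
    return out

# Synthetic monthly series: linear trend plus a fixed seasonal pattern.
season = [5, 3, 0, -2, -4, -5, -3, 0, 2, 4, 6, 7]
series = [2 * t + season[t % 12] for t in range(36)]

seasonal_diff = difference(series, lag=12)  # removes the seasonal pattern
stationary = difference(seasonal_diff)      # removes what is left of the trend
```

Each differencing pass shortens the series by `lag` points, which is worth keeping in mind when only about ten years of monthly data are available.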
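The algorithm used to carry out the ARIMA and SARIMAX modelling is as follows:
- After the reconstructed wavelet time series $y_t$ is generated, the SARIMAX model, which extends ARIMA, is defined by adding the following components:
- Autoregressive (AR) Component
$$y_t = \sum_{i=1}^{p} \phi_i\, y_{t-i} + \varepsilon_t \tag{11}$$
where $\phi_i$: AR terms, $\theta_j$: MA terms (used in Eq. 13), and $\varepsilon_t$: white noise.
- Integration (I) Component: To ensure stationarity, differencing is carried out
$$y'_t = (1 - B)^d\, y_t \tag{12}$$
where $B$: backshift operator ($B\,y_t = y_{t-1}$).
- Moving Average (MA) Component
$$y_t = \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j} \tag{13}$$
- Seasonal ARIMA (SARIMA) Extension captures seasonal patterns with additional terms:
$$\Phi_P(B^s)\,\phi_p(B)\,(1 - B)^d (1 - B^s)^D\, y_t = \Theta_Q(B^s)\,\theta_q(B)\,\varepsilon_t \tag{14}$$
where $B^s$ is the seasonal backshift, (P, D, Q) defines seasonal behavior, $\phi_p(B) = 1 - \sum_{i=1}^{p} \phi_i B^i$, $\theta_q(B) = 1 + \sum_{j=1}^{q} \theta_j B^j$, and $\Phi_P$, $\Theta_Q$ are the corresponding seasonal polynomials in $B^s$.
- Exogenous Variables X, through which SARIMAX incorporates external predictors:
$$\Phi_P(B^s)\,\phi_p(B)\,(1 - B)^d (1 - B^s)^D\,(y_t - \beta X_t) = \Theta_Q(B^s)\,\theta_q(B)\,\varepsilon_t \tag{15}$$
where $\beta$ adjusts for external factors (pandemic disruptions based on data).
Note that model selection was done using the AIC (Akaike Information Criterion). The dataset was split into a 70% training set used to fit the models and a 30% test set used to evaluate model performance. Exogenous effects (train_disruption, test_disruption) were applied to modify the predictions. Performance was evaluated using RMSE and MAPE.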
- The optimal SARIMAX model was chosen using the Akaike Information Criterion (AIC):
$$\mathrm{AIC} = 2k - 2\ln(L) \tag{16}$$
where:
- k: the number of parameters in the SARIMAX model.
- L: the likelihood function of the model.
- Iteration over different combinations of parameters (p, d, q) and (P, D, Q) is done while tracking AIC values to determine the best model. To find the best model, the search loops over possible values of p, d, and q, as well as their seasonal counterparts P, D, and Q:
$$p \in \{0, \dots, p_{\max}\},\quad d \in \{0, \dots, d_{\max}\},\quad q \in \{0, \dots, q_{\max}\} \tag{17}$$
$$P \in \{0, \dots, P_{\max}\},\quad D \in \{0, \dots, D_{\max}\},\quad Q \in \{0, \dots, Q_{\max}\} \tag{18}$$
$$\mathcal{M} = \{\mathrm{SARIMAX}(p, d, q)(P, D, Q)_s\} \tag{19}$$
With the addition of external regressors and seasonal components, each iteration fits an ARIMA model.
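- By including exogenous regressors $X_t$, the generic SARIMAX formulation expands upon ARIMA and is given as:
$$y_t = \beta X_t + u_t \tag{20}$$
$$\phi_p(B)\,\Phi_P(B^s)\,(1 - B)^d (1 - B^s)^D\, u_t = \theta_q(B)\,\Theta_Q(B^s)\,\varepsilon_t \tag{21}$$
where:
- $\phi_i$: the autoregressive coefficients, with $\phi_p(B) = 1 - \sum_{i=1}^{p} \phi_i B^i$ and $\Phi_P(B^s)$ its seasonal counterpart.
- $\theta_j$: the moving average coefficients, with $\theta_q(B) = 1 + \sum_{j=1}^{q} \theta_j B^j$ and $\Theta_Q(B^s)$ its seasonal counterpart.
- $\varepsilon_t$: the error term.
- $\beta$: captures the effect of the exogenous variable (i.e., pandemic disruption).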
- After fitting each possible SARIMAX model $m$, the AIC values are compared:
$$\mathrm{AIC}_m = 2k_m - 2\ln(L_m) \tag{22}$$
The best model is selected as:
$$m^{*} = \arg\min_{m} \mathrm{AIC}_m \tag{23}$$
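- Forecasts with 95% confidence intervals were created using the optimal ARIMA model for h steps ahead:
$$\hat{y}_{T+h|T} = \mathbb{E}\left[\,y_{T+h} \mid y_1, \dots, y_T\,\right] \tag{24}$$
$$\hat{\sigma}_h^2 = \mathrm{Var}\left(y_{T+h} - \hat{y}_{T+h|T}\right) \tag{25}$$
The forecast’s 95% confidence interval is given by:
$$\hat{y}_{T+h|T} \pm 1.96\,\hat{\sigma}_h \tag{26}$$
where $\hat{\sigma}_h$ is the forecast’s standard error.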
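A 95% interval of the form "point forecast plus or minus about 1.96 standard errors" can be computed as in the Python sketch below. To keep the example self-contained, the standard errors come from the simplest integrated model, ARIMA(0,1,0) without drift, whose h-step forecast is the last observed value with standard error sigma times the square root of h. That model, the values 100.0 and 5.0, and the function names are assumptions for illustration only.

```python
from statistics import NormalDist

def forecast_intervals(point_forecasts, std_errors, level=0.95):
    """Return (lower, upper) bounds: forecast +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 when level = 0.95
    return [(f - z * se, f + z * se)
            for f, se in zip(point_forecasts, std_errors)]

def random_walk_forecast(last_value, sigma, horizon):
    """ARIMA(0,1,0) without drift: flat forecast, SE = sigma * sqrt(h)."""
    points = [last_value] * horizon
    std_errors = [sigma * h ** 0.5 for h in range(1, horizon + 1)]
    return points, std_errors

# Hypothetical last observation and innovation standard deviation.
points, std_errors = random_walk_forecast(last_value=100.0, sigma=5.0, horizon=4)
intervals = forecast_intervals(points, std_errors)
```

The widening of the interval with h is the reason multi-year extrapolations from a short record carry large uncertainty.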
In evaluating the best ARIMA model, AIC balanced model fit against complexity, RMSE (Root Mean Squared Error) measured the average magnitude of the prediction errors, and MAPE (Mean Absolute Percentage Error) measured prediction accuracy as a percentage. These measures helped in evaluating and selecting the best model for the data, and the functions that calculate them for each model were included in the function that fits the ARIMA model based on AIC. After obtaining the best-fit ARIMA model, the ACF (Autocorrelation Function), the PACF, and tests of the residual errors, including the Jarque-Bera (JB) test for normality, the Lagrange Multiplier (LM) test for ARCH effects, and the Ljung-Box Q (LB-Q) test for autocorrelation [75], were also used to evaluate the goodness of fit of the selected models. The metrics are defined as follows:
- The AIC is a measure of the relative quality of a statistical model for a given set of data. It balances the goodness of fit of the model with the complexity of the model.
Mathematically, AIC is defined as:
$$\mathrm{AIC} = 2k - 2\ln(L) \tag{27}$$
where:
- k is the number of parameters in the model.
- L is the maximum value of the likelihood function for the model.
- Root Mean Squared Error (RMSE) is a measure of the differences between predicted values and observed values. It is the square root of the average of the squared differences between predicted and observed values.
Mathematically, RMSE is defined as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2} \tag{28}$$
where:
- $y_i$ is the observed value.
- $\hat{y}_i$ is the predicted value.
- n is the number of observations.
- Mean Absolute Percentage Error (MAPE) is a measure of the prediction accuracy of a forecasting method. It expresses the error as a percentage of the actual values.
Mathematically, MAPE is defined as:
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{29}$$
where:
- $y_i$ is the observed value.
- $\hat{y}_i$ is the predicted value.
- n is the number of observations.
- Jarque-Bera (JB) test for normality: if the p-value of the model was at or above the significance level (p-value ≥ 0.05), the residuals were taken to be normally distributed.
- Lagrange Multiplier (LM) test for ARCH effects: if the p-value was greater than the significance level, no significant ARCH effects were present.
- Ljung-Box Q (LB-Q) test for autocorrelation: if the p-value was less than the chosen significance level, there was significant autocorrelation in the residuals.
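The 70%/30% chronological split and the two error measures defined above can be sketched in Python as follows. The ten-point series and the naive "repeat the last training value" forecast are hypothetical and only exercise the functions. Skipping zero observations in MAPE is an assumption made here to avoid division by zero, not something the study specifies.

```python
import math

def train_test_split(series, train_fraction=0.7):
    """Chronological split: the first 70% trains, the last 30% tests."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]

def rmse(observed, predicted):
    """Square root of the mean squared difference."""
    n = len(observed)
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(observed, predicted)) / n)

def mape(observed, predicted):
    """Mean absolute percentage error, in percent (zero observations skipped)."""
    pairs = [(y, f) for y, f in zip(observed, predicted) if y != 0]
    return 100.0 * sum(abs((y - f) / y) for y, f in pairs) / len(pairs)

# Hypothetical monthly series.
series = [10, 12, 11, 13, 12, 14, 15, 20, 10, 15]
train, test = train_test_split(series)

# Naive benchmark: carry the last training value forward.
naive_forecast = [train[-1]] * len(test)
test_rmse = rmse(test, naive_forecast)
test_mape = mape(test, naive_forecast)
```

The split must stay in time order, because shuffling would let the model see the future during training.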
Results and discussion
Analysis of long term trend in MIP cases
As noted by [76], data that are not preprocessed can affect the explainability of the methods used in analysis and the overall integrity of the results. Preprocessing is therefore essential, and more data points are needed to enhance the accuracy and applicability of model analysis, as observed in [77]. To understand the long-term trend in MIP cases, the time series was first visualized in its original state. As seen in the time series plots (Fig 4), the MIP time series had several missing values compared with the MC time series. The missing records resulted from MIP records not being updated, mainly from January 2021 to January 2025. To help determine the values for imputing the missing data, the time series for each State was tested for normality by applying the Shapiro-Wilk test (results shown in Table 3) and by plotting the histogram of the MIP data (Fig 5).
The Shapiro-Wilk test indicated that none of the time series was normally distributed, in agreement with the histograms, most of which were heavily right-skewed. Although the W values were close to one, all the Shapiro-Wilk p-values were lower than 0.05, so normality was rejected in every case. To replace missing data with appropriate values rather than a simple mean, Kalman smoothing imputation was used, since it accounts for temporal dependencies, seasonal effects, or spatial correlations. All of these matter for the malaria time series, which are State-specific and recorded by LGA per year. A square root transformation was applied before the Kalman imputation; it reduced the skewness while preserving the data patterns as far as possible. This choice was based on preliminary summary statistics (mean, median, minimum, and maximum) for each time series. The median and mean for Cross River and Kebbi States differed greatly from the maximum value. Closer inspection of these two series revealed a single distinct value in each: 110022 in the Cross River State MIP data and 2921 in the Kebbi State MIP data. These were extremely large compared with the next largest values in the respective series, which were only in the two hundreds. These outliers were therefore edited to 22 and 292, respectively. To avoid tampering with the data, only these two very obvious outlier points were edited before the square root transform was applied.
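The square root transform followed by Kalman smoothing imputation can be sketched in Python as below. This is a simplified illustration, not the study's implementation: it assumes a local-level (random walk plus noise) state-space model with fixed, hypothetical variances `obs_var` and `level_var`, whereas a full implementation would estimate these from the data and could include seasonal terms. The count series is invented.

```python
import math

def kalman_impute(values, obs_var=1.0, level_var=1.0):
    """Fill None entries via a local-level Kalman filter and RTS smoother."""
    n = len(values)
    first = next(v for v in values if v is not None)
    a_pred, p_pred = [0.0] * n, [0.0] * n    # one-step-ahead state and variance
    a_filt, p_filt = [0.0] * n, [0.0] * n    # filtered state and variance
    a, p = first, 1e7                         # vague prior on the initial level
    for t, y in enumerate(values):
        a_pred[t], p_pred[t] = a, p
        if y is not None:                     # update only when observed
            gain = p / (p + obs_var)
            a = a + gain * (y - a)
            p = p * (1.0 - gain)
        a_filt[t], p_filt[t] = a, p
        p = p + level_var                     # predict: level carried forward
    smooth = a_filt[:]
    for t in range(n - 2, -1, -1):            # backward (RTS) smoothing pass
        c = p_filt[t] / p_pred[t + 1]
        smooth[t] = a_filt[t] + c * (smooth[t + 1] - a_pred[t + 1])
    # Observed values are kept; only the gaps take the smoothed estimate.
    return [smooth[t] if v is None else v for t, v in enumerate(values)]

# Hypothetical counts with gaps; transform first, then impute on the sqrt scale.
counts = [1, None, 9, 16, None, None, 49]
root = [None if c is None else math.sqrt(c) for c in counts]
filled = kalman_impute(root)
```

Unlike mean imputation, the smoothed estimate for a gap depends on the observations on both sides of it, which is what preserves the local trend.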
The preprocessed time series is plotted in Fig 6. Although values for the missing years (2021–2024) in the MIP data were generated for each LGA per year before forecasting, the decomposition of the MIP data and the other analyses were limited to the originally available MIP data up to December 2020. Forecasts of MIP records for the two years from 2025 to 2027 are therefore extrapolations based on older data from 2015 to 2020. Because of the lack of post-2020 inputs, the forecast results should be interpreted with caution. Comparing the plots of the original and preprocessed time series shows that the trends are preserved and the missing values are filled with reasonable Kalman-smoothed estimates. The range of the variance in the preprocessed data is thus better controlled, which supports a sound analysis.
The time series were further decomposed, as shown in Fig 7, to understand the long-term trend in MIP cases in the available data from January 2015 to December 2020 by separating out the random noise inherent in the observed series. At first glance, the preprocessed plots suggest that the general MC and MIP trends follow a similar pattern. The shapes of the patterns do not seem very different across the States, with only small variations in the fluctuating growth rate of cases per State. Decomposing the data, however, clearly revealed large inherent variability. In Cross River State the trend reached a small peak of 5.7 in mid-2015 and then turned downward to a trough towards the end of the first quarter of 2016. This period of low growth was characterized by fluctuations of the moving average around 5.6 until the second quarter of 2017. A sharp trough below 5.6 was then observed, immediately followed by a sharp peak reaching 6.2 between the first and third quarters of 2017. Slight troughs and fluctuations followed between the fourth quarter of 2017 and the end of the third quarter of 2019. A steep decline then began and continued through 2020. The random noise in the Cross River State MIP cases lay mostly between -4 and 4 change in cases, with a sharp outlier of 8 towards the last quarter of 2018.
Enugu State, on the other hand, started at a high peak in mid-2015, followed by a steep decline in 2016 to its lowest point of about 3.6 within the fourth quarter, which was sustained until the beginning of 2017. There was a small rise to a peak of 4.5 within the third quarter of 2017, followed by a fluctuating decline back to a trough of about 4.3 around the third quarter of 2018. This minimal fluctuation was sustained to the end of 2020. The random noise in the MIP time series ranged between -3 and 3 through the years, with spikes of up to 12 in the third quarter of 2015 and around 5 cases in mid-2018. In Ondo State, the trend showed a relatively steep slope from 6.6 in mid-2015 down to a trough of 5.6 by the beginning of 2016. This level was sustained with minimal fluctuations until the end of 2016, when the additive growth rose to a peak of 6.5 by the first quarter of 2017. Before the end of that quarter, the trend began a decline that continued through the years, reaching a minimum trough of about 5.0 by the end of 2020. The random noise in the MIP data ranged roughly between -5 and 5, with a distinct spike as high as 15 cases in 2017.
The MIP trend in Kebbi State started from a trough at a moving average of 7.0 and gradually rose to 8.0 past the middle of 2016, immediately followed by a downward fluctuation to a trough of 7.5 by the beginning of 2017. Before the end of the first quarter, a steady rise resumed and continued, with minimal fluctuations to troughs, to a maximum peak of 9.5 towards the end of 2019. The random noise in the MIP time series ranged between -5 and 5, with two distinct peaks reaching 14 and 12 in mid-2016 and mid-2018, respectively.
The MIP trend in Niger State started at a trough in 2015, rose and fell with small fluctuations between moving averages of 6.0 and 6.4, and moved back to a trough of 5.8 by the middle of 2017. A gradual rise in the MIP trend began from this point and lasted until mid-2019, with a record peak of about 8.0. The random noise in the MIP time series ranged between -5 and 5. The MIP trend in Yobe State was at a trough of 9.5 in mid-2015, rose to a small peak, and then fell to its lowest recorded trough, a moving average of 8.8, by 2017. From this point a steady rise in the trend was sustained to 2020, reaching the highest recorded peak of 13.0. The random noise in the MIP time series ranged between -5 and 10, with a few spikes up to 20.
The results (Table 4) show that while Cross River State experienced its highest peaks in MIP cases around 2017 and 2019, Enugu had its low within the same period. Although the North-Central region (Niger State) recorded the lowest trend range among the northern regions, between 6.0 and 6.4, the highest trend peak in the South-South, South-East, and South-West (6.5 cases) was lower than the lowest troughs of the North-East and North-West States (9.5 and 7.5, respectively). This indicates a higher prevalence of MIP in the northern regions of the country, which aligns with various reports on malaria prevalence in Nigeria, with the North accounting for most of the high-burden States. All six States showed seasonal patterns of varying strength, with regular, repeating patterns in malaria cases occurring at fixed intervals. The different seasonal patterns shown by each region point to region-specific factors that are major drivers of the malaria transmission trend. Thus, to reduce MIP cases, the geographical zones and the exogenous factors peculiar to them, such as socio-economic conditions, seasonal practices, and climatic and environmental factors, must be well understood. This variability reflects the States' different spatial locations and environmental and social practices, which play a large role in the varying malaria transmission patterns observed.
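The separation of a monthly series into trend, seasonal pattern, and random noise used throughout this subsection can be sketched with a classical additive decomposition. The Python code below is a minimal illustration under stated assumptions: an even period (12 months), a centred moving average for the trend, and a synthetic series. The decomposition routine used in the study may differ in detail.

```python
def decompose_additive(series, period=12):
    """Classical additive decomposition for an even period.

    Returns (trend, seasonal, residual). Trend and residual are None at the
    first and last period // 2 points, where the centred average is undefined.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        # Centred moving average: the two end points get half weight.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    # Average the detrended values month by month.
    sums, counts = [0.0] * period, [0] * period
    for t in range(n):
        if trend[t] is not None:
            sums[t % period] += series[t] - trend[t]
            counts[t % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    centre = sum(means) / period              # force the pattern to sum to zero
    seasonal = [means[t % period] - centre for t in range(n)]
    residual = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
                for t in range(n)]
    return trend, seasonal, residual

# Synthetic sqrt-scale series: slow rise plus a fixed zero-sum monthly pattern.
pattern = [-3, -2, 0, 2, 4, 5, 4, 2, 0, -2, -4, -6]
series = [5 + 0.1 * t + pattern[t % 12] for t in range(48)]
trend, seasonal, residual = decompose_additive(series)
```

On real data the residual component is what the text above calls random noise; isolated large residuals correspond to the spikes reported for individual States.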
Analysis of variability over time of MIP compared with general MC cases
The variability of MIP cases was analysed by: i) examining how MIP cases vary with the seasons of the year compared with general MC, and ii) examining the cross-correlation between MIP and MC patterns over the years.
Analysis of malaria in pregnancy cases varying with seasons of the year.
The time series analysis of how MIP varies with the seasons of the year (Fig 8) showed that in Cross River State there is a repeated pattern in the change in MIP cases throughout the year, ranging between -6 and 3, with the two highest peaks around the 3rd week of June and the beginning of September. MC cases, on the other hand, ranged between -30 and 20 throughout the year, with a peak of around 18 in mid-June.
Enugu State had a repeated pattern in MIP cases throughout the year, with changes ranging between -4 and 5 and the highest peak around the 1st week of July. MC cases changed within -15 and 10, with no obvious peak except a slight one at the end of July. Ondo State maintained a repeated pattern in MIP cases throughout the year, with changes between -5 and 10; about 4 months had comparably high case ranges, with the highest around the 2nd weeks of July and August at a peak of 2. For MC, a repeated seasonal pattern was observed throughout the year, with peaks and troughs between -25 and 25; about 4 months had comparably high case ranges, with the highest peak in the 2nd week of July. In Kebbi State, MIP case changes ranged between -10 and 5, with the highest peak towards the end of the 1st week of October. MC case changes, on the other hand, ranged between -40 and 40, with the highest peak in mid-October.
Niger State showed a repeated pattern in MIP cases throughout the year, with changes between -5 and 8 and the highest peak around mid-August at about 10. MC case changes ranged between -60 and 30, with the highest peak, at 10, also in mid-August. Yobe State had a repeated pattern in MIP cases throughout the year, with changes between -10 and 8; about 3 months attained comparably high peaks, with the highest around the 4th week of October at a peak of 12. MC case changes, on the other hand, ranged between -40 and 40, with the highest peak reaching 60 in the 4th week of October.
MIP was thus shown to vary with the seasons of the year (as summarized in Table 5). The peak periods recorded were: South-South, MIP in the 3rd week of June and the beginning of September, MC in the 3rd week of October; South-East, MIP in the 1st week of July, MC at the end of July; South-West, MIP in the 2nd weeks of July and August, MC in the 2nd week of July; North-West, MIP in the 1st week of October, MC in mid-October; North-Central, MIP in mid-August, MC in mid-August; and North-East, MIP in the 3rd weeks of September and October, MC in the 3rd week of October. The South-South, South-West, and North-East had their highest seasonal MIP peaks twice in a year. In all regions, most MIP peak seasons occurred before the MC peak, except in the North-Central, where the MIP and MC peak seasons occurred in approximately the same week (mid-August). These findings indicate that, to reduce MIP cases, the geographical zones and the seasonal occurrences peculiar to them, such as socio-cultural seasonal practices and climatic and environmental factors, must be well understood. Interventions for MIP must also be strategic and specific to the pregnant population, which in most cases experiences its peak before the general MC peak. This variability reflects the regions' different spatial locations and environmental and social practices, which play a large role in the varying malaria transmission patterns observed.
Cross-correlation analysis of MIP and MC change over time.
The change over time of MIP compared with general MC cases was analysed using bi-wavelet coherence (Fig 9) to understand how the peaks of malaria cases in the 6 regions of Nigeria correlate as the years go by. For this analysis of cross-correlation between MIP and MC cases, the historical data from 2015 to 2020 were used because updated MIP records were missing. Instead of interpreting frequency directly in Hertz, the coherence cycles are translated into real-world malaria transmission periodicities [78,79]. Lower frequency bands (e.g., 8–32 Hz) reflect seasonal malaria trends, while higher frequencies (e.g., 128–256 Hz) capture short-term fluctuations driven by localized peaks or intervention effectiveness. In agreement with [81], coherence between MIP and MC at these frequencies indicates synchronized malaria transmission patterns, in which MIP was often seen to precede MC peaks.
In Cross River State, the coherence analysis indicated weak malaria endemicity, with low-frequency cycles (approximately 2–4 months) showing patchy but weak coherence throughout the year. However, intermittent coherence was observed between 2015 and 2020, particularly in short-term cycles (1–2 months) and seasonal variations (approximately 6–12 months) during specific periods such as 2016, 2020, mid-2015 to 2016, and 2018–2019. This suggested a partial relationship between MIP and MC, but their transmission peaks did not strongly synchronize, which implied that exogenous factors were likely influencing malaria trends. MIP led MC in transmission periods, implying a consistent yet indirect relationship. MIP cases could thus be early indicators of broader malaria transmission fluctuations. The high rainfall and strong river currents in Cross River State likely contributed to reducing mosquito breeding sites, which could limit malaria transmission compared with some other regions. Higher education levels among women in the region, and thus strong adherence to IPTp, may also have helped sustain the State’s lower malaria burden.
Enugu State coherence spectrum showed strong periodic relationships at lower frequencies (approximately 6–8 months) persisting throughout the year with patches of coherence from 2015–2020. A short-term transmission cycle (2 months) observed in 2017 could be a pointer to localized synchronization between MIP and MC, but this pattern did not persist across all years. MIP seemed to precede MC transmission cycles, reinforcing a connection between pregnancy-related cases and broader malaria transmission patterns. Peaks were not fully synchronized, suggesting intervention timing and environmental factors could play a role. Enugu’s higher urbanization levels, education, and possibly strong preventive health measures could have helped in keeping malaria endemicity relatively low compared to northern States.
For Ondo State, the coherence analysis showed moderate malaria endemicity, with sustained high-frequency cycles (approximately 2–4 months) from mid-2016 to 2019. Additional coherence appeared at longer cycles (approximately 6–12 months) between 2016 and 2017, suggesting seasonal alignment between MIP and MC transmission trends. Urbanization, improved healthcare access, and higher education exposure may have contributed to malaria not being as severe compared to some other endemic regions.
Kebbi State showed strong malaria endemicity, with coherence observed across rapid malaria transmission cycles (approximately 1–3 months) from 2015 to 2020. Additional coherence at longer cycles (approximately 6 months) between 2016 and 2017 suggested seasonal variations in transmission. MIP consistently led MC, reinforcing pregnancy-related susceptibility in malaria transmission. However, the peaks did not align, meaning that other exogenous factors, such as environmental or socioeconomic factors (IPTp adherence, healthcare access), influenced the malaria trends. Socioeconomic disparities among women, most of whom may have below-average literacy and economic power, together with strong seasonal effects and IPTp adherence, could have contributed to the higher endemicity in this region.
Malaria transmission in Niger State showed strong endemicity, with broad coherence observed across high-frequency cycles (approximately 2–4 months) from 2015–2020, including patches around longer seasonal cycles (approximately 6 months). MIP largely preceded MC transmission peaks, reinforcing the potential for MIP cases to serve as a leading predictor for malaria peaks. Peaks did not fully overlap, likely due to external intervention effects (IPTp adherence, rainfall patterns, socioeconomic access to treatment).
Yobe State showed significant malaria endemicity, with coherence observed in fast transmission cycles (approximately 2–3 months) between 2016 and 2019, and additional long-term trends (approximately 6 months) scattered throughout. MIP preceded MC cycles, confirming a consistent but non-synchronized relationship between pregnancy-related malaria and general malaria transmission. The strong malaria burden in Yobe likely resulted from socioeconomic disparities among women, most of whom have below-average literacy and economic power. Healthcare accessibility issues and seasonal mosquito breeding patterns also play a role in this State. Coherence strength at different frequencies reflects how malaria in pregnancy (MIP) and malaria cases (MC) are synchronized across transmission cycles.
It should be noted that while States with higher numbers of LGAs (e.g., Niger: 26 LGAs, Kebbi: 22 LGAs) could have a denser dataset, this did not necessarily mean increased detection of high-frequency malaria trends. Rather, greater LGA coverage provided a more statistically robust representation of malaria incidence patterns across local populations. The presence of strong coherence at specific frequencies indicated alignment in MIP and MC transmission cycles, suggesting shared environmental or epidemiological drivers of malaria incidence. Seasonal trends, intervention timing, and fluctuations in malaria transmission played a more significant role in shaping coherence. After analyzing malaria trends across regions, the strongest coherence between MIP and MC occurred in short-term transmission cycles (approximately 2–4 months per cycle) in North-West (Kebbi), North-Central (Niger), and North-East (Yobe), followed by South-West (Ondo). Weaker coherence was observed in South-South (Cross River) and South-East (Enugu). This aligns with documented higher malaria prevalence in northern Nigeria, likely driven by lower IPTp adherence, socioeconomic disparities, and favorable environmental conditions for mosquito breeding. The consistent finding that MIP precedes MC transmission cycles suggests pregnancy-related malaria could serve as an early indicator of broader malaria transmission outcomes.
The cross-correlation of the change over time of MIP compared with general MC cases in the populations of the different regions is summarized in Table 6. Coherence strength at different frequencies reflects how MIP and MC are synchronized across transmission cycles.
The high endemicity observed in the northern States agrees with reports that the high burden of malaria lies mostly in the northern States of Nigeria. In most regions MIP also leads MC in peak, which could indicate that pregnant women, as a vulnerable population, are the first to be affected by the surge of malaria as the seasons of the year come and go. The consistent finding that MIP precedes MC transmission cycles suggests pregnancy-related malaria could serve as an early indicator of broader malaria transmission outcomes.
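The lead of MIP over MC reported above comes from wavelet coherence phase. A much coarser check of the same idea, shown here only as an illustration and not as the method used in the study, is a lagged Pearson cross-correlation: correlate MIP at month t with MC at month t + lag and see which lag gives the strongest association. The two series below are hypothetical, with MC constructed to follow MIP by one month.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def lagged_correlation(lead, follow, max_lag=3):
    """corr(lead[t], follow[t + lag]) for lag = -max_lag .. max_lag.

    A positive best lag means `lead` moves before `follow`.
    """
    n = len(lead)
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = lead[: n - lag], follow[lag:]
        else:
            a, b = lead[-lag:], follow[: n + lag]
        out[lag] = pearson(a, b)
    return out

# Hypothetical monthly values: MC repeats MIP one month later at 4x the scale.
mip = [3, 5, 9, 4, 2, 6, 10, 5, 3, 7, 11, 6]
mc = [12] + [4 * v for v in mip[:-1]]

correlations = lagged_correlation(mip, mc, max_lag=3)
best_lag = max(correlations, key=correlations.get)
```

Unlike wavelet coherence, this summary assumes the lead is the same at every time and at every cycle length, so it cannot show the year-specific patches described for individual States.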
Forecast of future trends of MIP cases using a hybrid of Wavelet-ARIMA and SARIMAX models
Wavelet analysis and denoising of the time series, for a specified change in malaria cases at amplitude (A = 7.99e-04) and scale (J = 20), was applied using the Morlet wavelet after the series was adjusted over time. The original and reconstructed (denoised) data are visualized in Fig 10; denoising was used to optimize the modelling process by minimizing noise in the data before further analysis. The denoised data were then used to determine the initial ARIMA and SARIMAX parameter values before the actual model parameters were estimated and used to forecast future trends. For ARIMA parameter estimation, the search started at (p, 0, 0) and ended at (3, 2, q), where q was based on the initial ARIMA parameter values and p = 1 for all initial values. SARIMAX took the same format, with the seasonal parameters having their own start and end values, (0, 0, 0) and (2, 2, 2), respectively.
MIP data from the NMDR had essentially no records from 2021 to date, so the Kalman-smoothed values generated during preprocessing were adopted to fill in the missing values. To check stationarity before fitting the ARIMA or SARIMAX models, the augmented Dickey-Fuller test was applied, and all the time series were found to be stationary. This analysis was done on the granular LGA-level data of monthly cases per State. The initial SARIMAX parameters were determined from the denoised time series of each State, to obtain better estimates of the ACF and PACF when setting the initial AR-I-MA (p, d, q) parameters (as shown in Table 7).
With the initial model parameters set, the model for each State's time series was estimated within a range bounded by the State-specific initial values (summarized in Table 8) and the maximum (p, d, q) of (3, 2, q), where q is based on the initial value from the ACF of the denoised data. Forecasts for February 2025 to January 2027 were then produced using the best-performing model. The analysis was done with the R statistical programming software. Forecast accuracy was assessed for model selection on the basis of low AIC and of evaluation metrics including minimized values of BIC, RMSE, and MAPE. Because most of the candidate SARIMAX models had similar AIC and BIC, the selection was modified to pick the model with the lowest BIC within each group of models having the same AIC. This made it possible to select the top 3 best-fitting models and, finally, the best-fitting model. The accuracy of the model was obtained and the best-fit ARIMA or SARIMAX model was analysed. The residuals of the best models were then tested to determine the statistical properties of the model fit.
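The tie-breaking rule described above (rank by AIC, and among models with the same AIC prefer the lowest BIC, then keep the top 3) can be sketched in Python as follows. The candidate orders and their AIC/BIC values are hypothetical, and treating AIC values as equal when they agree to two decimals is an assumption made for this illustration.

```python
def rank_models(candidates, top=3, aic_decimals=2):
    """Sort by AIC; models whose rounded AIC is equal are ordered by BIC."""
    ranked = sorted(
        candidates,
        key=lambda m: (round(m["aic"], aic_decimals), m["bic"]),
    )
    return ranked[:top]

# Hypothetical candidates: ((p, d, q), (P, D, Q, s)) with made-up scores.
candidates = [
    {"order": ((1, 0, 1), (1, 0, 0, 12)), "aic": 412.30, "bic": 425.10},
    {"order": ((1, 0, 1), (0, 0, 0, 12)), "aic": 412.30, "bic": 421.80},
    {"order": ((2, 1, 0), (0, 0, 0, 12)), "aic": 415.90, "bic": 420.00},
    {"order": ((3, 2, 1), (2, 2, 2, 12)), "aic": 430.00, "bic": 455.00},
]

top_three = rank_models(candidates)
best_model = top_three[0]
```

Because BIC penalizes parameters more heavily than AIC, this rule favours the simpler model whenever the AIC cannot separate two candidates.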
After the metric evaluation to determine the models’ goodness of fit while minimizing model complexity, the model with the lowest AIC, together with relatively low values of BIC, RMSE, and MAPE, was chosen. The residuals of the selected model for each State were then checked (as shown in Fig 11 and Table 9). This helped in assessing whether the estimated model captured adequate information from the data. The residuals were tested using the Jarque-Bera test for normality, the LM test for ARCH (autoregressive conditional heteroskedasticity) effects, and the LB-Q test for autocorrelation.
Based on the tests conducted across the respective regions, Kebbi and Niger States had residuals close to normality (p-value > 0.05), while Cross River, Enugu, Ondo, and Yobe States departed furthest from it.
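Two of the three residual checks named above can be computed from first principles, as in the Python sketch below. It is illustrative only: the p-values use closed-form chi-square tail probabilities, so the Ljung-Box helper is restricted to an even number of lags and makes no degrees-of-freedom adjustment for fitted ARMA parameters. The LM test for ARCH effects (a regression of squared residuals on their own lags) is omitted for brevity, and the residual series is hypothetical.

```python
import math

def chi2_sf_even(x, dof):
    """P(chi-square with even `dof` > x), via the closed-form series."""
    if dof % 2 != 0:
        raise ValueError("closed form used here needs an even dof")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def jarque_bera(residuals):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4); p-value from chi-square, 2 dof."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, chi2_sf_even(jb, 2)

def ljung_box(residuals, lags=10):
    """Q = n(n + 2) * sum_k rho_k^2 / (n - k); p-value from chi-square(lags)."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((r - mean) ** 2 for r in residuals)
    q = 0.0
    for k in range(1, lags + 1):
        rho = sum((residuals[t] - mean) * (residuals[t - k] - mean)
                  for t in range(k, n)) / denom
        q += rho ** 2 / (n - k)
    q *= n * (n + 2)
    return q, chi2_sf_even(q, lags)

# Hypothetical residuals that alternate in sign: clearly not white noise.
residuals = [1.0, -1.0] * 20
jb_stat, jb_p = jarque_bera(residuals)
lb_stat, lb_p = ljung_box(residuals, lags=10)
```

For both tests a small p-value is the warning sign: it rejects normality for JB and rejects the absence of autocorrelation for LB-Q.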
In both ARIMA and SARIMAX, significant ARCH effects (volatility clustering) were observed in Yobe State, reflecting strong fluctuations in MIP incidence. The SARIMAX model for Niger State exhibited notable volatility clustering, suggesting persistent variation in trends over time. Severe heteroskedasticity was identified in Ondo and Enugu States, highlighting considerable variations in MIP incidence, potentially driven by intervention-based fluctuations [80]. Cross River State displayed moderate signs of volatility clustering, with its SARIMAX model showing substantial deviations in residual behavior. Kebbi State, on the other hand, demonstrated the most stable trend, with the SARIMAX model exhibiting residuals closest to normality, indicating minimal volatility clustering. Overall, the results confirm that Northern states (especially Yobe and Niger) show strong volatility, reinforcing the patterns observed in malaria prevalence studies, while Southern states display intervention-driven fluctuations with Cross River showing moderate variability.
Furthermore, all regional residuals exhibited strong autocorrelation, persisting even after multiple model adjustments aimed at improving fit [26]. In Niger State, autocorrelation in the SARIMAX model was significantly greater than in ARIMA, highlighting persistent dependencies in malaria incidence trends. Kebbi State’s SARIMAX residuals, while displaying structured patterns, suggested potential underlying malaria transmission dynamics that remain unexplained [24].
Although the SARIMAX results, like those of ARIMA, indicated no clear seasonal component, comparisons between the two models reveal variations in statistical properties across regions, influenced by differing exogenous factors [42]. While no strong seasonality was detected, fluctuations in malaria incidence and residual variability may be driven by external influences such as climate conditions, healthcare accessibility, intervention effectiveness, or migration patterns [82]. Kebbi and Niger States exhibited the most stable residuals, suggesting minimal seasonal effects, whereas Yobe, Enugu, and Ondo demonstrated significant heteroskedasticity, potentially due to intervention-based or environmental fluctuations.
Unresolved autocorrelation indicates that malaria treatments could be better timed to align with recurring transmission cycles [83]. In line with the research objective of understanding the variability of MIP across different Nigerian states, the comparative findings from time series analysis highlight important regional differences in malaria incidence, helping to inform more effective malaria control strategies targeted at pregnant women.
The SARIMAX models offered deeper insight into MIP transmission cycles and incidence trends across regions, although for Kebbi State the simpler ARIMA model provided a better fit. This implied that: (i) in Cross River State, seasonality may not be a dominant factor, but other exogenous variables (such as environmental or intervention effects) are needed to improve the model; (ii) in Enugu State, malaria cases fluctuate due to external factors; therefore, it is important to incorporate weather trends, intervention coverage, and socioeconomic factors when building epidemiological models; (iii) in Ondo State, MIP incidence was not stable; thus, models would work best if meaningful exogenous factors, such as seasonal interventions, are included; (iv) Niger State exhibited strong seasonality or intervention-driven fluctuations; therefore, climatic factors should be accounted for; and (v) in Yobe State, MIP cases followed strong cycles, confirming the presence of consistent transmission cycles that influence malaria trends in the region.
From the plotted ACF (as shown in Fig 12), it could be concluded that the residuals’ ACF plots show a good fit of the models. The study thus proceeded with the forecast. It should be mentioned that the Niger State SARIMAX forecast (2900 cases) was far higher than its ARIMA forecast (210 cases). This demonstrates that the SARIMAX model predicts a significant increase in cases in Niger, which could result from intervention decay effects, potentially leading to an overestimation of the malaria burden by 2027. Sensitivity to exogenous variables may also be exaggerating the prediction, as demonstrated by the comparison with ARIMA, which calls for careful interpretation of long-term forecasts. Further improvements to intervention modelling, including modifying decay parameters, could increase prediction stability and support realistic scenario planning. Prediction stability and accuracy may also be increased by optimizing exogenous inputs or investigating alternative modelling techniques, such as hybrid epidemiological models that integrate climate and intervention dynamics. These are areas we intend to pursue in further research. Thus, the SARIMAX model was adopted for Cross River, Enugu, Ondo, and Yobe States, and the ARIMA model for Kebbi and Niger States. This approach was chosen to give the best statistical fit and predictive accuracy based on the findings.
The forecast results (as shown in Fig 13) represent each region’s expected number of malaria cases for future periods until January 2027, based on historical data.
It was observed that, within a 95% confidence interval, the number of cases in these regions by the end of 2027 could be as high as: 90 in the South-South (Cross River State), 95 in the South-East (Enugu State), 125 in the South-West (Ondo State), 240 in the North-West (Kebbi State), 210 in the North-Central (Niger State), and about 400 in the North-East (Yobe State).
These forecast results are extrapolated and should be interpreted with caution due to the lack of original post-2020 inputs for MIP cases. Based on the available evidence, reducing malaria cases and achieving elimination in the various regions will require a doubling of efforts in the control interventions deployed, targeting the seasonal peak periods of MIP cases peculiar to each region. More attention should also be given to the North-East and North-West so as to reduce the number of pregnant women at risk of malaria in the nation.
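The size of the gap between the two Niger State forecasts quoted above can be made explicit with a short worked calculation. The sketch below uses only the two figures given in the text (2900 and 210 cases); the function name `forecast_divergence` and the twofold tolerance are hypothetical choices for illustration, not a criterion used in the study.

```python
# Worked arithmetic on the two Niger State forecasts quoted in the text.
# The tolerance of 2.0 is a hypothetical threshold chosen for illustration.


def forecast_divergence(sarimax_cases, arima_cases, tolerance=2.0):
    """Return (ratio, flagged); flagged is True when the forecasts differ by more than `tolerance`-fold."""
    ratio = sarimax_cases / arima_cases
    flagged = ratio > tolerance or ratio < 1 / tolerance
    return ratio, flagged


ratio, flagged = forecast_divergence(2900, 210)
print(f"SARIMAX/ARIMA ratio: {ratio:.1f}, flagged: {flagged}")
# prints "SARIMAX/ARIMA ratio: 13.8, flagged: True"
```

A ratio of roughly 13.8 means the SARIMAX forecast is more than an order of magnitude above the ARIMA forecast, which is consistent with the caution expressed about sensitivity to exogenous variables.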
Implication for malaria control efforts in Nigeria
These findings reveal significant regional variability in malaria in pregnancy (MIP) incidence, emphasizing the need for targeted interventions across Nigeria’s six geopolitical zones. In the South-South (Cross River State), peak incidence in 2017 and 2019 suggests potential gaps in malaria prevention strategies, such as intermittent preventive treatment in pregnancy (IPTp) coverage and insecticide-treated net distribution. While malaria control efforts were ongoing, climatic conditions, river flooding, and low accessibility to antenatal care services may have contributed to increased MIP cases. Moderate residual autocorrelation suggests that some underlying patterns in transmission dynamics remain unresolved, reinforcing the need for better maternal health programs and expanded malaria surveillance to mitigate future spikes.
In the South-East (Enugu State), where the lowest trough was recorded in 2017, malaria incidence among pregnant women appears more stable than in other regions. This stability may stem from better urban healthcare access, higher IPTp coverage, and lower transmission intensity. However, significant heteroskedasticity observed in the residuals indicates persistent volatility in malaria trends, pointing to the need for socioeconomic interventions and improved record-keeping at local health centers to enhance malaria burden assessments.
In the South-West (Ondo State), strong autocorrelation and ARCH effects observed in the residuals suggest periodic outbreaks and intervention-driven fluctuations, indicating that seasonal interventions directly influence transmission cycles. The presence of short-term cycles confirms the instability of MIP incidence, requiring enhanced monitoring of intervention effectiveness, IPTp adherence, and health facility utilization to minimize malaria variability.
In the North-West (Kebbi State), malaria transmission displays long-term coherence trends, reinforcing endemicity and seasonality-driven outbreaks. ARIMA performed better than SARIMAX, but persistent autocorrelation suggests some unresolved malaria transmission dynamics. Improved intervention timing and IPTp uptake and adherence among pregnant women should be prioritized, alongside expanded community-based malaria monitoring to support early detection of seasonal surges.
The North-Central (Niger State) exhibits strong ARCH effects, indicating hidden transmission patterns possibly linked to seasonality or intervention coverage gaps. While MIP incidence appears stable, short-term fluctuations suggest the need for climate-sensitive forecasting models to adapt malaria response efforts based on rainfall trends, temperature variations, and vector density cycles. Projected peaks in MIP cases by 2027 highlight an urgent need for aggressive malaria control strategies.
In the North-East (Yobe State), projected peaks in MIP cases by 2027 also highlight an urgent need for aggressive malaria control strategies. Strong cyclic trends and autocorrelation indicate malaria seasonality as a primary transmission driver, suggesting that interventions need to align with peak cycles rather than rely on static treatment schedules. Expanded healthcare access, climate adaptation measures, and localized malaria surveillance will be crucial in reducing the MIP burden.
The findings emphasize the necessity of region-specific interventions, particularly in high-burden states.
South-South and South-East require healthcare accessibility improvements, with Cross River exhibiting moderate residual patterns and Enugu showing strong volatility in malaria incidence.
North-Central and North-West need better integration of seasonality-based forecasting tools, particularly in Niger, where strong ARCH effects were detected, and Kebbi, which requires improved intervention timing despite ARIMA’s better fit.
North-East demands urgent malaria control measures, reinforcing the need for strengthened antenatal malaria monitoring and targeted intervention deployment.
Study limitations
This study provides valuable insights into malaria in pregnancy (MIP) incidence trends across Nigeria’s six geopolitical zones. By applying wavelet coherence and time series modelling, regional variations were identified, and future transmission patterns were forecasted. However, several limitations must be acknowledged to ensure proper interpretation and application of these findings. Data collection biases in health facility-based reporting may have led to underestimation of true malaria incidence, as cases occurring after the COVID-19 pandemic (post-2021 data) were not captured. Also, variations in data quality across regions and disparities in record-keeping infrastructure may have influenced accuracy and model reliability.
The use of ARIMA and SARIMAX models assumed stationarity, yet malaria transmission cycles may be influenced by nonlinear climatic trends, potentially affecting model precision. In future research, we intend to explore hybrid epidemiological models that incorporate machine learning techniques and integrate climate and intervention dynamics, in order to improve prediction stability, accuracy, and adaptation to dynamic transmission factors. Wavelet coherence analysis successfully identified short- and long-term malaria transmission periodicities, but it did not identify causal mechanisms, indicating the need for further epidemiological studies to establish direct links between interventions, environmental factors, and transmission peaks.
Potential overfitting in regional models was observed, particularly through ARCH effects and autocorrelation persistence, suggesting that unaccounted exogenous factors such as IPTp intervention coverage, climatic influences, or socioeconomic disparities may be driving MIP transmission trends. These aspects should be investigated in future studies to refine forecasting models and intervention strategies.
Despite these limitations, this study provides a comprehensive analysis of MIP transmission trends, offering evidence-based recommendations for malaria intervention strategies tailored to regional transmission patterns.
Conclusion
Malaria in pregnancy (MIP) remains a significant public health concern in Nigeria, contributing to maternal morbidity, adverse birth outcomes, and increased neonatal mortality. The high rate of malaria-related deaths in pregnant women poses an urgent challenge, as maternal fatalities and stillbirths represent irrecoverable losses to families and communities.
This study successfully addressed the question of long-term trends in MIP incidence over the past decade (January 2015 to January 2025). Findings revealed substantial variability in malaria incidence across regions; for example, the South-South (Cross River State) reached its highest peak in MIP cases in 2017, while the South-East (Enugu) experienced its lowest trough in the same year. The highest observed peaks in the South-South, South-East, and South-West (6.5 change in malaria incidence) were lower than the lowest recorded troughs in all the Northern regions. This reinforces the need for cost-effective, region-specific malaria interventions, considering the distinct transmission dynamics and environmental influences in each geopolitical zone.
The second research question was addressed by analyzing MIP case trends compared to general malaria case (MC) trends across different regions. Findings revealed strong cross-correlations over an extended period at high frequency cycles, with North-West (Kebbi) exhibiting the strongest correlation, followed by North-Central (Niger), North-East (Yobe), and South-West (Ondo). In contrast, South-South (Cross River) and South-East (Enugu) showed weaker but consistent correlation patches at similar frequencies. The peak periods recorded for MIP cases were: South-South (3rd week of June and beginning of September), South-East (1st week of July), South-West (2nd weeks of July and August), North-West (1st week of October), North-Central (mid-August), and North-East (4th week of October). The seasonal persistence of MIP cases and their strong correlation with general malaria transmission trends highlight malaria’s endemicity in Nigeria. Also, while the MIP population is proportionally smaller than the general population, the increase in maternal mortality due to malaria has far-reaching effects on public health and neonatal survival, demanding strategic interventions specific to pregnant women. Notably, peak MIP cases often precede peak general malaria incidence, emphasizing the unique vulnerability of pregnant women and the need for timely intervention strategies tailored to their specific risk factors.
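The observation that peak MIP cases often precede peak general malaria incidence can be illustrated with a lagged cross-correlation, a simpler stand-in for the wavelet coherence analysis used in the study. The sketch below is a minimal example under assumptions: the function name `best_lead_lag` is hypothetical, and both series are synthetic, with the second constructed as a two-step delayed copy of the first; neither is study data.

```python
# Minimal sketch: find the lag at which two series are most correlated.
# A positive lag k means the first series leads the second by k time steps.
# Both series are synthetic; `mc` is `mip` delayed by two steps by construction.


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    if saa == 0 or sbb == 0:
        return 0.0
    return sab / (saa * sbb) ** 0.5


def best_lead_lag(x, y, max_lag):
    """Return (k, r) maximizing corr(x[t], y[t + k]) over -max_lag <= k <= max_lag."""
    best_k, best_r = 0, float("-inf")
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            a, b = x[:len(x) - k], y[k:]
        else:
            a, b = x[-k:], y[:len(y) + k]
        r = pearson(a, b)
        if r > best_r:
            best_k, best_r = k, r
    return best_k, best_r


mip = [0, 1, 4, 2, 0, 0, 3, 7, 1, 0, 0, 2, 5, 9, 3, 0, 0, 0, 0, 0]
mc = [0, 0] + mip[:-2]  # same pattern, two steps later

lag, corr = best_lead_lag(mip, mc, max_lag=4)
print("first series leads second by", lag, "steps")
```

Unlike wavelet coherence, this approach gives a single lag for the whole record and cannot show how the lead-lag relationship changes across time or frequency.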
Forecasting results confirmed significant heterogeneity in MIP trends, with each state requiring a customized modelling approach (SARIMAX or ARIMA). Persistent autocorrelation in multiple models suggests that seasonality and external environmental factors continue to influence malaria transmission cycles. In particular, Cross River State exhibited strong ARCH effects, reinforcing the role of seasonal fluctuations in malaria burden. Future projections indicate potential peaks in MIP cases by 2027 in the Northern regions, emphasizing the urgent need to prioritize interventions in these high-burden areas.
Pre-analysis of data obtained from NMDR has provided valuable insights into how MIP cases compare with general MC cases over time (2015–2024), identifying seasonal variations and potential future trends. Findings clearly indicate strong regional variability in malaria transmission patterns, emphasizing that each geopolitical zone experiences unique transmission dynamics. Given Nigeria’s endemic malaria status, improved monitoring and coordinated supervision of MIP data collection and reporting are crucial for accurate modelling and prediction analysis to support effective intervention strategies.
The Nigerian government has implemented multiple malaria control strategies, but eliminating malaria will require more targeted, region-specific approaches that address the unique variability observed in MIP cases. This study confirms that malaria incidence differs significantly by region, with Northern states consistently recording higher cases. To achieve the Nigeria Malaria Strategic Plan (NMSP) 2021–2025 goal of reducing malaria prevalence to below 10%, informed regional strategies must be prioritized. Applying tailored intervention approaches for MIP control will enhance decision-making for intervention deployment, improve planning efforts, and optimize malaria burden reduction nationwide.
This study faced data limitations, particularly missing MIP records post-2021, which may affect forecast reliability. In addition, while the ARIMA and SARIMAX models assumed stationarity, unaccounted nonlinear climatic trends and external intervention factors may have influenced malaria transmission dynamics. Persistent autocorrelation and ARCH effects also indicate unaccounted exogenous factors, such as IPTp coverage, climate variability, and socioeconomic disparities, that may be driving MIP transmission trends. The sensitivity of the SARIMAX model to exogenous factors, among the other limitations mentioned earlier, highlights the need for deeper epidemiological and machine learning studies to achieve better prediction results. Further study is needed to improve prediction stability and accuracy by optimizing exogenous inputs when integrating climate and intervention dynamics.
To improve malaria control, it is recommended that malaria record-keeping in primary health facilities be strengthened, with consistent data reporting and quarterly monitoring. Improved IPTp coverage, targeted interventions, and emphasis on ANC compliance should be promoted, especially in high-risk areas such as the Northern regions, given that MIP trends suggest early warning signs of malaria surges. Enhanced malaria surveillance systems that incorporate real-time data tracking to optimize intervention timing and effectiveness are also needed. Seasonal intervention efforts must be prioritized to align malaria control programs with peak transmission months in each region. Integrating forecast-driven malaria control measures could significantly reduce the MIP burden and help Nigeria achieve its National Malaria Elimination Programme (NMEP) objectives.
Acknowledgments
The authors thank the handling editor and reviewers for their valuable comments, which greatly improved the quality and content of this paper. Our special appreciation goes to NMEP for providing access to the data used to carry out this study.
ICAMMDA Team Members:
Emmanuel Afolabi Bakare 2,4,☯, Afeez Abidemi 4,6,☯, Deborah Oluwatobi Daniel 4,☯, Dolapo Oluwaseun Oniyelu 1,4,☯, Aaron Onyebuchi Nwana 4,7,☯, Oluwaseun Akinlo Mogbojuri 2,4,8,☯, Idowu Isaac Olasupo 2,4,☯, Samuel Abidemi Osikoya 2,4,☯
6 Department of Mathematical Sciences, Federal University of Technology, Akure, Ondo State, Nigeria,
7 Department of Animal and Environmental Biology, Federal University Oye-Ekiti, Ekiti State, Nigeria,
8 Department of Mathematical Sciences, Adekunle Ajasin University, Akungba-Akoko, Ondo State, Nigeria
emmanuel.bakare@fuoye.edu.ng (Director, ICAMMDA)
References
- 1. Eniola K, Okolo B, Ayeni E, Abolade S, Ugwu S, Awoyinka T, et al. Malaria endemicity in sub-Saharan Africa: Past and present issues in public health. Popul Med. 2023;5(Supplement).
- 2. Orok AB, Ajibaye O, Aina OO, Iboma G, Oboshi SA, Iwalokun B. Malaria interventions and control programes in Sub-Saharan Africa: a narrative review. Cogent Med. 2021;8(1):1940639.
- 3. Awosolu OB, Yahaya ZS, Farah Haziqah MT. Prevalence, parasite density and determinants of falciparum malaria among febrile children in some Peri-Urban communities in southwestern nigeria: a cross-sectional study. IDR. 2021;14:3219–32.
- 4. Shulman CE, Dorman EK. Importance and prevention of malaria in pregnancy. Trans Roy Soc Tropic Med Hygiene. 2003;97(1):30–5.
- 5. World Health Organization. WHO policy brief for the implementation of intermittent preventive treatment of malaria in pregnancy using sulfadoxine-pyrimethamine (IPTp-SP). World Health Organization; 2014.
- 6. González R, Manun’Ebo MF, Meremikwu M, Rabeza VR, Sacoor C, Figueroa-Romero A. Community delivery of IPTp: coverage in four sub-Saharan countries. Lancet Glob Health. 2023;11(4):e566–74.
- 7. National Malaria Elimination Programme (NMEP), National Population Commission (NPC), ICF. Nigeria Malaria Indicator Survey 2021: Final Report. Abuja, Nigeria, and Rockville, Maryland, USA: NMEP, NPC, and ICF; 2022.
- 8. World Health Organization. World Malaria Report 2018. Geneva, Switzerland: World Health Organization; 2019.
- 9. World Health Organization. Report on Malaria in Nigeria 2022. Geneva, Switzerland: World Health Organization; 2023.
- 10. National Malaria Elimination Programme, National Population Commission, National Bureau of Statistics, ICFI. Nigeria Malaria Indicator Survey 2015: Final Report. Abuja, Nigeria: National Malaria Elimination Programme; 2016. https://dhsprogram.com/publications/
- 11. National Malaria Elimination Programme (NMEP). National Malaria Strategic Plan 2021–2025. Abuja, Nigeria: Federal Ministry of Health, Nigeria; 2021. https://nmcp.gov.ng/
- 12. Adegbenro W, Oni ET, Oba-Ado O, Folayan WA, Olafimihan O, Arowolo T, et al. Replacement campaign of LLINs in Ondo State, Nigeria. Divers Equal Health Care. 2018;15(3):95–103.
- 13. National Federal Ministry of Health. Adoption of Intermittent Preventive Treatment in Pregnancy (IPTp). 2005.
- 14. Agyeman YN, Newton SK, Annor RB, Owusu-Dabo E. Revised IPTp-SP effectiveness in Northern Ghana. J Trop Med. 2020;2020(1):2325304.
- 15. Mpogoro FJ, Matovelo D, Dosani A, Ngallaba S, Mugono M, Mazigo HD. IPTp uptake and outcomes in Geita district, Tanzania. Malaria J. 2014;13:1–14.
- 16. Apinjoh TO, Ntui VN, Chi HF, Moyeh MN, Toussi CT, Mayaba JM. IPTp-SP and sub-microscopic malaria in Cameroon. PLoS One. 2022;17(9):e0275370.
- 17. Okeibunor JC, Orji BC, Brieger W, Ishola G, Otolorin E, Rawlins B. Community-directed interventions to prevent malaria in pregnancy: Akwa Ibom case study. Malaria J. 2011;10:1–10.
- 18. Bhalla D, Cleenewerck L, Okorafor KS, Gulma KA. Malaria prevention survey among pregnant women in Nnewi, Nigeria. Sci World J. 2019;2019(1):6402947.
- 19. Okafor IP, Ezekude C, Oluwole EO, Onigbogi OO. Knowledge, perception, and prevention of malaria in pregnancy in Nigeria. J Fam Med Prim Care. 2019;8(4):1359–64.
- 20. Muhammad HU, Giwa FJ, Olayinka AT, Balogun SM, Ajayi I, Ajumobi O, et al. Malaria prevention and delivery outcomes in Northeastern Nigeria. Malaria J. 2016;15:1–6.
- 21. Balami AD, Said SM, Zulkefli NAM, Norsa’adah B, Audu B. Health education to improve malaria preventive practices and pregnancy outcomes. Malaria J. 2021;20:1–16.
- 22. Kamau A, Mogeni P, Okiro EA, Snow RW, Bejon P. A systematic review of changing malaria disease burden in sub-Saharan Africa since 2000 : comparing model predictions and empirical observations. BMC Med. 2020;18(1):94.
- 23. Bauserman M, Conroy AL, North K, Patterson J, Bose C, Meshnick S. An overview of malaria in pregnancy. Semin Perinatol. 2019;43(5):282–90.
- 24. Homan T, Maire N, Hiscox A, Di Pasquale A, Kiche I, Onoka K, et al. Spatially variable risk factors for malaria in a geographically heterogeneous landscape, Western Kenya: an explorative study. Malar J. 2016;15(1):1-15.
- 25. Teboh-Ewungkem MI, Prosper O, Gurski K, Manore CA, Peace A, Feng Z. Intermittent preventive treatment (IPT) and the spread of drug resistant malaria. Applications of dynamical systems in biology and medicine. Springer; 2015. p. 197–233.
- 26. Assumpta Komugabe M, Caballero R, Shabtai I, Yi Z, Dodds Z. Geospatial and path analysis for enhancing Malaria control and primary healthcare delivery in low-income nations: a case study of Uganda. AJEID. 2024;12(3):44–54.
- 27. Esling P, Agon C. Time-series data mining. ACM Comput Surv. 2012;45(1):1–34.
- 28. Sacchi L, Larizza C, Combi C, Bellazzi R. Data mining with temporal abstractions: learning rules from time series. Data Min Knowl Discov. 2007;15:217–47.
- 29. Liu Z, Zhu Z, Gao J, Xu C. Forecast methods for time series data: a survey. IEEE Access. 2021;9:91896–912.
- 30. Huang HS, Liu CL, Tseng VS. Multivariate time series early classification using multi-domain deep neural network. In: 2018 IEEE International Conference on Data Science and Advanced Analytics (DSAA). 2018. p. 90–8.
- 31. Steehouwer H. A frequency domain methodology for time series modelling. Interest rate models... sovereign wealth funds. Springer; 2010. p. 280–324.
- 32. Chiapinotto S, Sarria EE, Mocelin HT, Lima JAB, Mattiello R, Fischer GB. Impact of non-pharmacological initiatives for COVID-19 on pediatric respiratory illness admissions. Paediatr Respir Rev. 2021;39:3–8.
- 33. Janowicz K, Gao S, McKenzie G, Hu Y, Bhaduri B. GeoAI: spatially explicit artificial intelligence techniques for geographic knowledge discovery and beyond. Int J Geograph Inf Sci. 2019;34(4):625–36.
- 34. Kohli PS, Arora S. Application of machine learning in disease prediction. In: 2018 International Conference on Computing, Communication and Automation (ICCCA), 2018. p. 1–4.
- 35. Loftus TJ, Filiberto AC, Leeds IL, Donoho D. Machine learning in clinical decision-making. Frontiers Media SA. 2023.
- 36. Bui Q-T, Nguyen Q-H, Pham VM, Pham MH, Tran AT. Understanding spatial variations of malaria in Vietnam using remotely sensed data integrated into GIS and machine learning classifiers. Geocarto Int. 2018;34(12):1300–14.
- 37. Audebert N, Le Saux B, Lefevre S. Deep learning for classification of hyperspectral data: a comparative review. IEEE Geosci Remote Sens Mag. 2019;7(2):159–73.
- 38. Zhu AX, Lu G, Liu J, Qin CZ, Zhou C. Spatial prediction based on third law of geography. Ann GIS. 2018;24(4):225–40.
- 39. Wang S, Cao J, Yu PS. Deep learning for spatio-temporal data mining: a survey. IEEE Trans Knowl Data Eng. 2022;34(8):3681–700.
- 40. Zheng N, Du S, Wang J, Zhang H, Cui W, Kang Z, et al. Predicting COVID-19 in China using hybrid AI model. IEEE Trans Cybern. 2020;50(7):2891–904.
- 41. Chae S, Kwon S, Lee D. Predicting infectious disease using deep learning and big data. IJERPH. 2018;15(8):1596.
- 42. Kumar P, Kalita H, Patairiya S, Sharma YD, Nanda C, Rani M. Forecasting COVID-19 dynamics in top 15 countries using ARIMA ML. MedRxiv. 2020:2020-03.
- 43. Anastassopoulou C, Russo L, Tsakris A, Siettos C. Modelling and forecasting the COVID-19 outbreak. PLoS One. 2020;15(3):e0230405.
- 44. Gupta A, Kumar A. Mid term daily load forecasting using ARIMA, wavelet-ARIMA, machine learning. In: 2020 IEEE International Conference on Environment and Electrical Engineering and 2020 IEEE Industrial and Commercial Power Systems Europe (EEEIC/I&CPS Europe). 2020. p. 1–5. https://doi.org/10.1109/eeeic/icpseurope49358.2020.9160563
- 45. Apley DW, Zhu J. Visualizing predictor variable effects in black box models. J R Stat Soc Ser B. 2020;82(4):1059–86.
- 46. Azodi CB, Tang J, Shiu SH. Interpretable machine learning for geneticists. Trends in Genetics. 2020;36(6):442–55.
- 47. Fattah J, Ezzine L, Aman Z, El Moussami H, Lachhab A. Forecasting of demand using ARIMA model. Int J Eng Bus Manag. 2018;10:1847979018808673.
- 48. Kownacki C. Optimization approach to adapt Kalman filters for... accelerometer and gyroscope signals’ filtering. Digit Signal Process. 2011;21(1):131–40.
- 49. Akujuobi CM. Wavelets and wavelet transform systems and their applications. Berlin/Heidelberg, Germany: Springer; 2022.
- 50. Amadi GD, Biu OE, Arimie CO. Univariate and vector autocorrelation time series models for some sectors in Nigeria. Glob J Sci Front Res: Math Decis Sci. 2020;20(6):56–81.
- 51. Iqbal A, Amin R, Alsubaei FS, Alzahrani A. Anomaly detection in multivariate time series data using deep ensemble models. PLoS ONE. 2024;19(6):e0303890.
- 52. Motamedi M, Dawson J, Li N, Down DG, Heddle NM. Demand forecasting for platelet usage: from univariate time series to multivariable models. PLoS ONE. 2024;19(4):e0297391.
- 53. Hyndman RJ, Athanasopoulos G. Forecasting: principles and practice. OTexts; 2018.
- 54. Koopmans LH. The spectral analysis of time series. Elsevier; 1995.
- 55. Liu B, Li Z, Li Z, Chen C. CL-Informer: Long time series prediction model based on continuous wavelet transform. PLoS ONE. 2024;19(9):e0303990.
- 56. Guo T, Zhang T, Lim E, Lopez-Benitez M, Ma F, Yu L. A review of wavelet analysis and its applications: challenges and opportunities. IEEE Access. 2022;10:58869–903.
- 57. Priyadarshani N, Marsland S, Castro I, Punchihewa A. Birdsong denoising using wavelets. PLoS ONE. 2016;11(1):e0146790.
- 58. Xu Y, Liu Z, Zhao J, Su C. Weibo sentiments and stock return: a time-frequency view. PLoS ONE. 2017;12(7):e0180723.
- 59. Jiang X, Shen W. Simultaneous denoising and heterogeneity learning for time series data. Stat Biosci. 2023:1–16.
- 60. Singh S, Parmar KS, Kumar J, Sidhu JS. Hybrid wavelet ARIMA model for forecasting COVID-19 casualties. Chaos Solitons Fractals. 2020;135:109866.
- 61. Rubio L, Palacio Pinedo A, Mejía Castaño A, Ramos F. Forecasting volatility by using wavelet transform, ARIMA and GARCH models. Eurasian Econ Rev. 2023;13(3–4):803–30.
- 62. Sperandio Nascimento EG, Ortiz J, Furtado AN, Frias D. Using discrete wavelet transform for optimizing COVID-19 new cases and deaths prediction worldwide with deep neural networks. PLoS ONE. 2023;18(4):e0282621.
- 63. Lei Z. Statistical analysis of COVID-19 trends: ARIMA, regression, and spatial models. medRxiv. 2024;2024:2024–10.
- 64. Sangeda RZ, James D, Mariki H, Mbwambo ME, Mwenesi ME, Nyaki H, et al. Childhood vaccination trends during 2019 to 2022 in Tanzania and the impact of the COVID-19 pandemic. Human Vaccines Immunotherap. 2024;20(1):2356342.
- 65. National Malaria Data Repository. National Malaria Data Repository - Contact us@ nmdrhelpdesk@nmep.gov.ng [cited in 2025 May 22]. https://nmdrnigeria.ng/dhis-web-commons/security/login.action
- 66. GADM. GADM Maps, Data and Version X. [cited 2024 ]. https://gadm.org
- 67. Sun J, Xia Y. Pretreating and normalizing metabolomics data for statistical analysis. Genes Dis. 2024;11(3):100979.
- 68. Roza A, Violita ES, Aktivani S. Inflation study using unit root tests: Bukittinggi City 2014 –2019. EKSAKTA. 2022;23(2):106–16.
- 69. Goodell JW, Goutte S. Co-movement of COVID-19 and Bitcoin: evidence from wavelet coherence analysis. Finance Res Lett. 2021;38:101625.
- 70. Vacha L, Barunik J. Co-movement of energy commodities revisited: evidence from wavelet coherence analysis. Energy Econ. 2012;34(1):241–7.
- 71. Rhif M, Ben Abbes A, Farah IR, Martínez B, Sang Y. Wavelet transform application for/in non-stationary time-series analysis: a review. Appl Sci. 2019;9(7):1345.
- 72. Sundararajan D. Fourier analysis: a signal processing approach. Singapore: Springer; 2018.
- 73. Gomes J, Velho L. From Fourier analysis to wavelets. New York: Springer; 2015.
- 74. Lepik U. Numerical solution of differential equations using Haar wavelets. Math Comput Simul. 2005;68(2):127–43.
- 75. Katoch R, Sidhu A. An application of ARIMA model to forecast the dynamics of COVID-19 epidemic in India. Glob Bus Rev. 2021;:0972150920988653.
- 76. Belle V, Papantonis I. Principles and practice of explainable machine learning. Front Big Data. 2021;4:688969.
- 77. Attai K, Asuquo D, Okonny KE, Johnson EA, Bassey A, John A, et al. Sentiment analysis of Twitter discourse on the 2023 Nigerian general elections. Eur J Comput Sci Inf Technol. 2024;12(4):18–35.
- 78. Traoré N, Millogo O, Sié A, Vounatsou P. Impact of climate variability and interventions on Malaria incidence and forecasting in Burkina Faso. IJERPH. 2024;21(11):1487.
- 79. Pascual M, Cazelles B, Bouma MJ, Chaves LF, Koelle K. Shifting patterns: malaria dynamics and rainfall variability in an African highland. Proc R Soc B. 2007;275(1631):123–32.
- 80. Rehal V. Heteroscedasticity: causes and Consequences. [cited 2025 May 22]. https://spureconomics.com/heteroscedasticity-causes-and-consequences/
- 81. Dellicour S, Tatem AJ, Guerra CA, Snow RW, ter Kuile FO. Quantifying the number of pregnancies at risk of Malaria in 2007 : a demographic study. PLoS Med. 2010;7(1):e1000221.
- 82. Abdulkadir AA. Predicting malaria incidence in Kenya using ARIMA and SARIMA models. Strathmore University Research Repository; 2020. https://su-plus.strathmore.edu/server/api/core/bitstreams/0d99d4f2-f1bc-40ac-bd86-9a7d696cdb71/content
- 83. Torrence C, Compo GP. A practical guide to wavelet analysis. Bull Am Meteorol Soc. 1998;79(1):61–78.