Re-analyzing the SARS-CoV-2 series using an extended integer-valued time series models: A situational assessment of the COVID-19 in Mauritius

This paper proposes high-order integer-valued auto-regressive time series processes of order p (INAR(p)) with Zero-Inflated and Poisson-mixture innovation distributions, wherein the predictor functions of these distributions allow for covariate specification, in particular time-dependent covariates. The proposed time series structures are shown to be suitable for modelling the SARS-CoV-2 series in Mauritius, which exhibits excess zeros, and hence significant over-dispersion, together with a non-stationary trend. In addition, the INAR models allow the assessment of possible drivers of COVID-19 in Mauritius. The results illustrate that the event of Vaccination and the COVID-19 Stringency Index are the most influential factors in reducing the locally acquired COVID-19 cases and, ultimately, the associated death cases. Moreover, the INAR(7) with Zero-Inflated Negative Binomial innovations provides the best fit and reliable Root Mean Square Errors, based on some short term forecasts. This information will be highly useful to the Mauritian authorities for the implementation of comprehensive policies.


Introduction
In early March 2021, Mauritius was struck by a second wave of the Novel Coronavirus 2019 (COVID-19) pandemic in the local community, after officially recording a long sequence of zero locally acquired active cases. It is worth mentioning that during the first wave of the COVID-19 pandemic, and especially after detecting the first set of local cases on 18 March 2020, Mauritius promptly implemented strict sanitary measures in the form of a national lockdown, safe shopping guidelines, mandatory face covering in public places and minimal public gatherings, followed by COVID-19 related legislation such as the Quarantine Bill and the COVID-19 Miscellaneous Bill [1]. Moreover, as a pro-active strategy, the vaccination campaigns among front-liners in the disciplined forces and health sectors kick-started in January 2021. A variety of vaccines, notably Oxford-AstraZeneca/Covishield, Covaxin and Sinopharm, were obtained from partner countries and, consequently, the target group for vaccination expanded to cover elderly persons, people with comorbidities, and personnel working in the education, retail and other economic sectors. As at May 2021, around 18 percent of the total Mauritian population had already received the first jab of the vaccine. This process is still ongoing, with the aim of vaccinating at least 60% of the general population by the end of July 2021, thus attaining herd immunity before the opening of the frontier. The vaccination exercise for the second doses has also started and has been running successfully. The second wave, which followed a sporadic transmission mode based on some identifiable clusters, was immediately controlled by the authorities. The Ministry of Health and Wellness accelerated the contact tracing exercise and, to contain the virus more rapidly, law enforcers implemented novel localised mobility restrictions in regions with a large number of contaminations ("red-zoned" areas) under the Temporary Restrictions of Movement Order.
In terms of evidence-based policies, Mauritius is indeed well positioned; on the other hand, the uncommon patterns in the COVID-19 series raise some concerns in the research community, especially in the fields of statistics and data analytics. The COVID-19 new cases series in Mauritius has some distinctive features, namely a unique serial trend with an excess of zeros and some oscillations, leading to over-dispersion, while the corresponding COVID-19-related death series shows a preponderance of zeros. These features imply that the simple integer-valued auto-regressive model of order 1 (INAR(1)) with Poisson or extra-Poisson innovations is insufficient in this context [2][3][4], and ignoring the excess of zeros will lead to bias in the estimated parameters and standard errors [5]. To remedy this, the paper proposes a novel high-order integer-valued auto-regressive process (INAR) with Zero-Inflated (ZI) innovation distributions. This construction bridges two important gaps. Firstly, as seen in the literature, the ZI models have been used extensively in regression contexts only (see [6,7] and the references therein), while their application in count time series modelling is restricted to the first order only [8][9][10][11][12][13][14][15][16][17].
Secondly, the proposed time series model allows for covariate specification, which is essential in the context of the COVID-19 analysis. It is important to identify the significant factors contributing to the propagation of SARS-CoV-2 in the local community, and to quantify the expected impact of these covariates on COVID-19 infection. Such information will be very useful to the local authorities concerned and for forecasting purposes. As regards the factors, in the first wave several of them, such as public health measures, strong political engagement, stricter legislation, the population's adherence to established sanitary norms [18], sensitization campaigns and the institution of quarantine centers, were found to successfully curb the spread of the virus [1]. For the second wave, new covariates like the reproduction rate (ReR), the COVID-19 Risk due to weather conditions (CRW), the major event of vaccination and the COVID-19 Stringency Index need to be assessed in the Mauritian context. In the European and Asian regions, ReR [19][20][21], CRW [22,23] and vaccinations [24] have largely demonstrated their association with COVID-19 transmission, while during the first wave in Mauritius the COVID-19 Stringency Index was the most significant factor in curbing the virus [4]. Further details on the covariates are provided in Section 3. Accurate forecasting with an acceptable Root Mean Square Error (RMSE) is also targeted, because most restorative and preparedness policy decisions, be it in terms of adequate vaccines, financial requirements or the re-opening of the frontier, will be based on the projections of new COVID-19 cases.
The paper is organized as follows. In Section 2, the local active COVID-19 data structures and their descriptive statistics are provided. Section 3 focuses on the construction of the INAR model with some novel innovations and ZI distributions; the inferential properties of the INAR process are also discussed. Section 4 focuses on the fitting of the various INAR models and on the short term forecasts, and also discusses the significant factors. The concluding remarks and some limitations are provided in Section 5.

The SARS-CoV-2 series data for Mauritius
The daily new COVID-19 infection and death series for Mauritius, covering the period from 18 March 2020 to 25 April 2021 for a total of 404 observations, were extracted from the official portal for European data (see https://data.europa.eu/data/datasets/covid-19-coronavirus-data?locale=en). The evolution of the new infection series and the death series is displayed in Fig 1. From Fig 1, it can be deduced that at the beginning of the COVID-19 pandemic the situation was worrisome, with a climbing number of daily death cases (in red) associated with the increasing trend in the number of new daily COVID-19 infection cases (in blue). From April 2020 till December 2020, the spread of SARS-CoV-2 in the local community plunged and a long sequence of zero daily locally acquired new COVID-19 cases and death cases was reported. Note that a few COVID-19-related death cases were reported in March and April 2020, especially among patients with comorbidities. Next, new imported COVID-19 infection cases were detected in October 2020 after the frontier was re-opened, but these were successfully contained in quarantine centers. The mandatory 14 days of isolation in established quarantine centers proved effective. Fig 2 covers the period from January to April 2021. In January and February 2021, a few active local cases among front-liners were reported but, by accelerating the vaccination campaigns, the severity of the disease was reduced. It was in March 2021 that a sudden increase in the number of local infections was reported among some identifiable clusters. As for the death series, a long series of zero cases was reported.
PLOS ONE

The relationship between the COVID-19 Stringency Index in Mauritius (refer to https://ourworldindata.org/covid-stringency-index) and the number of daily new locally acquired COVID-19 active cases also plays a significant role, as proven in [4] and as shown in the corresponding figure. Table 1 provides the descriptive statistics and preliminary test results that confirm the nature of both data series, along with their respective Auto Correlation Function (ACF) and Partial Auto Correlation Function (PACF) plots used to determine the orders. Based on the qcc.overdispersion.test function of the 'qcc' package in the R statistical software, it is confirmed that the data are over-dispersed. Owing to the excess zeros in the data, the Vuong test (refer to Table 5 in S1 Appendix) and the Van den Broek test were also significant, showing that the series is zero-inflated as well. The Ljung-Box test ascertains the existence of serial correlation in the series. The presence of trend was also significant. In fact, via the Cox-Stuart tests, it was found that the COVID-19 new cases series has a decreasing trend and that the death series does not have an increasing trend. Details on these tests are shown in the S1 Appendix.
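To make the two diagnostics above concrete, a minimal Python sketch is given here. It is not the qcc, Vuong or Van den Broek procedure used in the paper; it only computes the two raw quantities that motivate those tests, namely the variance-to-mean ratio and the observed share of zeros set against the share that a Poisson distribution with the same mean would produce. The function names and the toy series are hypothetical and do not reproduce the Mauritian data.

```python
import math
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean (close to 1 for Poisson data)."""
    return variance(counts) / mean(counts)


def zero_shares(counts):
    """Observed share of zeros and the share a Poisson with equal mean implies."""
    observed = sum(1 for c in counts if c == 0) / len(counts)
    expected = math.exp(-mean(counts))
    return observed, expected


# Hypothetical toy series: a long run of zeros followed by a short outbreak.
toy = [0] * 20 + [1, 3, 0, 7, 12, 5, 0, 2, 9, 4]
di = dispersion_index(toy)
observed_zero, poisson_zero = zero_shares(toy)
print(f"dispersion index: {di:.2f}")
print(f"zeros observed: {observed_zero:.2f}, Poisson-implied: {poisson_zero:.2f}")
```

A dispersion index well above 1 together with an observed zero share well above the Poisson-implied one is the informal pattern that the formal tests then assess.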
Referring to Fig 4, the ACF plots for both series decay slowly over 20 lags, which indicates non-stationarity of the time series, while the PACF plots confirm that both series are of high order (order = 7).
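For readers who wish to reproduce this kind of order selection, the sketch below computes the sample ACF and derives the PACF from it with the Durbin-Levinson recursion. It is an illustrative implementation under our own naming (acf, pacf_from_acf), not the routine used to produce Fig 4, and the trending series is a hypothetical example chosen only to show slow ACF decay.

```python
def acf(series, max_lag):
    """Sample autocorrelations rho_0, ..., rho_max_lag."""
    n = len(series)
    centre = sum(series) / n
    dev = [v - centre for v in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf_from_acf(rho):
    """Partial autocorrelations phi_kk via the Durbin-Levinson recursion."""
    out, prev = [1.0], []
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}, then append phi_kk.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out


# Hypothetical trending series: its ACF decays slowly, as for a non-stationary series.
trend = list(range(100))
rho = acf(trend, 20)
partial = pacf_from_acf(rho)
```

In practice the order p is read off as the last lag at which the PACF stays clearly outside its confidence band.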
To address the non-stationarity issue, it is thus proposed to allow for covariate specification, which is important in the context of the COVID-19 analysis. The following time-dependent covariates were considered:
• The COVID-19 Stringency Index (SI): This variable is calculated from nine metrics, namely school closures; workplace closures; cancellation of public events; restrictions on public gatherings; closures of public transport; stay-at-home requirements; public information campaigns; restrictions on internal movements; and international travel controls. It is available on a daily basis at https://ourworldindata.org/covid-stringency-index. The index gives an indication of the strictness of government policies in response to the COVID-19 pandemic. The score lies between 0 and 100, where an index nearing 100 indicates a strict response and a lower index a less strict response [25]. In this study, the logarithm of the nominal value of the SI (log(SI)) was used for analysis purposes.
• The event of vaccination (Vaccine): This is an important time-varying variable because, to date, the vaccine roll-out in Mauritius is rising given the authorities' aim to achieve herd immunity. More than 1 million vaccine doses have already been obtained through bilateral agreements with India and China, even though there is intense competition between countries for the purchase of COVID-19 vaccines. For this study, the event of vaccination was coded as a binary variable, where 1 indicates that vaccination is being carried out and 0 (ref) that it is not. Data on vaccination were obtained from https://ourworldindata.org/covid-vaccinations [26] and from the official COVID-19 platform in Mauritius, which falls under the aegis of the Ministry of Health and Wellness (https://covid19.mu/).
• The reproduction rate (ReR): This covariate refers to the degree of propagation of SARS-CoV-2 from one person to another. The data on ReR were obtained from https://ourworldindata.org/covid-cases and the logarithm of the nominal value of ReR (log(ReR)) was considered. Note that this reproduction rate relates to the degree of transmissibility of the "original" SARS-CoV-2 and was used as a proxy to understand the severity of the coronavirus, considering that constant mutation of SARS-CoV-2 has since been observed. In terms of proactive health related measures, this rate can be highly indicative.
• The Relative COVID-19 Risk due to Weather and Air Pollution (CRW): This variable represents some environmental factors and explains their impact on COVID-19 transmission. Weather factors like average and diurnal temperature, ultraviolet (UV) index, humidity, pressure and precipitation, together with air pollutants (SO2 and Ozone), were considered while computing this index. In this paper, the CRW was coded as a binary variable, where 0 (ref) corresponds to an index below 1, referring to a relatively lower impact of weather factors on the spread of COVID-19, and 1 corresponds to an index above 1, indicating otherwise. Data were extracted from https://projects.iq.harvard.edu/covid19 and missing values were imputed from the observed values.
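The coding of the four covariates described in the list above can be sketched as follows. This is a hypothetical helper, not the authors' preprocessing code: the function name, the ordering of the entries and the treatment of a CRW index exactly equal to 1 (coded here as 0) are assumptions, and the logarithms require strictly positive ReR and SI values.

```python
import math


def covariate_row(rer, si, vaccinating, crw_index):
    """Build one covariate vector [log(ReR), log(SI), Vaccine, CRW].

    Assumptions: rer > 0 and si > 0 so that the logs exist; a CRW index
    exactly equal to 1 is coded as 0, since the text only covers the
    cases below 1 and above 1.
    """
    return [
        math.log(rer),
        math.log(si),
        1 if vaccinating else 0,      # 1: vaccination under way, 0: reference
        1 if crw_index > 1 else 0,    # 1: index above 1, 0: reference
    ]


# Hypothetical daily values, for illustration only.
row = covariate_row(rer=1.2, si=80.0, vaccinating=True, crw_index=0.8)
```

Days on which the Stringency Index equals zero would need separate handling before taking logarithms, which this sketch does not attempt.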
Also, to cater for the long sequence of zeros and the high autocorrelation, this paper brings forward the INAR(p) models with Zero-Inflated (ZI) and different Poisson-mixture innovations because, as illustrated in [2][3][4], ignoring the excess of zeros will lead to bias in the estimated parameters and standard errors [5]. More on these novel ZI models and the inferential part of the general INAR processes is provided in the subsequent Section.

The Zero-inflated Poisson mixture models
The Zero-Inflated (ZI) models, introduced by [6], are suitable for over-dispersed count data that exhibit excessive zeros. Such data are commonly encountered in the social sciences, for instance in the analysis of drug addicts [27], crimes [28], adolescents' drinking patterns [29] and counselling session attendance [30]; in the financial sector, such as in the modelling of insurance claims [31]; and in health studies, such as dental caries [32] and injection cessation in HIV patients [33], among many other application areas mentioned in [7].
Basically, a ZI model is a mixture of two distributions: a probability distribution that degenerates at zero, mixed with a standard probability model such as the Poisson or Negative Binomial (NB) model. The general form is given by:
$$f(y_t) = \pi_t\, g_1(y_t) + (1 - \pi_t)\, g_2(y_t),$$
where $\pi_t$, the mixing proportion, lies in the interval between 0 and 1 and also indicates the rate of zero inflation, and $g_1(.)$ and $g_2(.)$ are the corresponding densities.
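By replacing $g_1(.)$ with a probability distribution that degenerates at zero and $g_2(.)$ by the Poisson distribution with parameter $\lambda_t$, we derive the ZI Poisson (ZIP) as:
$$f(y_t) = \begin{cases} \pi_t + (1 - \pi_t)\, e^{-\lambda_t}, & y_t = 0, \\ (1 - \pi_t)\, \dfrac{e^{-\lambda_t} \lambda_t^{y_t}}{y_t!}, & y_t = 1, 2, \ldots \end{cases}$$
and the corresponding probability generating function (PGF) is
$$G(s) = \pi_t + (1 - \pi_t) \exp\{\lambda_t (s - 1)\}.$$
Similarly, we can write the ZI Negative Binomial (ZI-NB) with parameters $(\lambda_t, \nu^{-1})$ and, recently, [34] proposed the ZI COM-Poisson model (ZI-CMP) where
$$f(y_t) = \begin{cases} \pi_t + (1 - \pi_t)\, \dfrac{1}{Z(\lambda_t, \nu)}, & y_t = 0, \\ (1 - \pi_t)\, \dfrac{\lambda_t^{y_t}}{(y_t!)^{\nu}\, Z(\lambda_t, \nu)}, & y_t = 1, 2, \ldots \end{cases}$$
and its PGF is
$$G(s) = \pi_t + (1 - \pi_t)\, \frac{Z(\lambda_t s, \nu)}{Z(\lambda_t, \nu)},$$
where $Z(\lambda_t, \nu)$ is computed from [35] as:
$$Z(\lambda_t, \nu) = \sum_{j=0}^{\infty} \frac{\lambda_t^{j}}{(j!)^{\nu}} = \frac{\exp(\nu \lambda_t^{1/\nu})}{\lambda_t^{(\nu - 1)/(2\nu)}\, (2\pi)^{(\nu - 1)/2}\, \sqrt{\nu}} \left\{1 + O\!\left(\lambda_t^{-1/\nu}\right)\right\}$$
as $\lambda \to \infty$ and where, [...]

Next, the Poisson-Tweedie (PT) model in [36] has also shown its efficacy in handling over-dispersed data and, to some extent, data with excess zeros, as discussed in [37][38][39]. The PGF of the PT model, $G_{R_t}(s)$, is given by: [...] and its zero-inflated PGF (ZI-PT) is simply given by $\pi_t + (1 - \pi_t)\, G_{R_t}(s)$. Furthermore, the PT function can be re-parameterized in terms of $\lambda_t$, $\sigma^2$, $D_t = \sigma_t^2 / \lambda_t$ and $a$ where: [...] Note that the probability distribution of the PT model cannot generally be written in explicit form (see [37]), whilst its probability values can be computed recursively as in [36] or using the method of [40], explained in the next subsection. It is also important to note that $a = 0$ in Eq (3) corresponds to NB.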
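As a numerical check on the simplest member of this family, the sketch below codes the zero-inflated Poisson probabilities and PGF in their standard textbook form and verifies that the probabilities sum to one, that the PGF agrees with the series $\sum_y s^y f(y)$, and that the variance exceeds the mean. The function names and the parameter values are illustrative assumptions; the other ZI models of this section are not covered.

```python
import math


def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson with rate lam and inflation pi."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_pgf(s, lam, pi):
    """Probability generating function pi + (1 - pi) * exp(lam * (s - 1))."""
    return pi + (1.0 - pi) * math.exp(lam * (s - 1.0))


# Illustrative parameter values, not estimates from the paper.
lam, pi = 2.5, 0.4
support = range(80)  # the truncation error is negligible for this rate
total = sum(zip_pmf(y, lam, pi) for y in support)
zip_mean = sum(y * zip_pmf(y, lam, pi) for y in support)
zip_var = sum(y * y * zip_pmf(y, lam, pi) for y in support) - zip_mean ** 2
pgf_by_sum = sum(zip_pmf(y, lam, pi) * 0.7 ** y for y in support)
```

The mean is $(1 - \pi)\lambda$ and the variance is $(1 - \pi)\lambda(1 + \pi\lambda)$, so any $\pi > 0$ produces over-dispersion relative to the Poisson.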
Apart from these models, the recently studied Cosine-Geometric (WCG) models [4,41] have also proven useful for count data modelling. The PGF of the WCG is given by: [...] where $\lambda_t^* = \dfrac{\lambda_t}{1 + \lambda_t}$, $\lambda_t^* \in (0, 1)$ and $\nu \in \left[0, \dfrac{\pi}{2}\right]$. The PGF of the ZI-WCG is [...]

In the event that we have some explanatory variables, given by the vector $x_t$, which are known to influence the $t$th response variable $y = y_t$, then $x_t = [x_1, x_2, \ldots, x_p]$ and $\lambda_t = \exp(x_t^T \beta)$ for the $t$th term. In this context, $p = 4$, with $x_1$ = ReR, ..., $x_4$ = CRW. For the zero-inflated part, we assume that the probability of zero is denoted by $\pi_t$ for $y = y_t$, where $\pi_t = \dfrac{\exp(x_t^T \beta)}{1 + \exp(x_t^T \beta)}$.

Overall, for the interested reader, ZI data can easily be generated in R using ifelse(rbinom(n, size = 1, prob = π) > 0, 0, rdis(n, λ, ν)), where for the Poisson model 'rdis' is rpois(n, lambda = μ), for the NB model 'rdis' is rnbinom(n, size = 1/ν, mu = λ) and for the COM-Poisson 'rdis' is rcmp(n, λ, ν) (similarly for the PT model, refer to the poistweedie package in R); alternatively, the data can be obtained from ZIM [42].

Note that, since the marginal distribution of the counting series is not known, we follow the approach in [46] by conditioning on $F_t = [Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p}]$ to obtain [...] and from here, the probability density values for $Y_t$, $t = p + 1, \ldots, T$, can be obtained using the inversion technique [40].
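The data-generating mechanism can be sketched in Python as follows. The first two functions mirror the R one-liner for the zero-inflated Poisson case; the last two assume the usual binomial-thinning form of an INAR(p) process, $Y_t = \sum_{i=1}^{p} \alpha_i \circ Y_{t-i} + R_t$, which is the standard definition in the literature and is taken here as an assumption. All function names, the constant innovation parameters and the burn-in length are hypothetical choices; covariate-driven $\lambda_t$ and $\pi_t$ are not included.

```python
import math
import random


def rpois(rng, lam):
    """One Poisson(lam) draw by Knuth's multiplication method (small lam only)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def rzip(rng, lam, pi):
    """Zero with probability pi, otherwise a Poisson(lam) draw."""
    return 0 if rng.random() < pi else rpois(rng, lam)


def thin(rng, alpha, count):
    """Binomial thinning: each of `count` units survives with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def simulate_inar(rng, alphas, lam, pi, n, burn_in=100):
    """Simulate n values of an INAR(p) path with zero-inflated Poisson innovations."""
    y = [0] * len(alphas)  # zero start-up values, washed out by the burn-in
    for _ in range(n + burn_in):
        survivors = sum(thin(rng, a, y[-i]) for i, a in enumerate(alphas, start=1))
        y.append(survivors + rzip(rng, lam, pi))
    return y[-n:]


# Illustrative call with hypothetical parameters (the alphas must sum to less than 1).
path = simulate_inar(random.Random(2), alphas=[0.3, 0.2], lam=2.0, pi=0.5, n=5000)
```

With these values the stationary mean is the innovation mean $(1 - \pi)\lambda = 1$ divided by $1 - \sum_i \alpha_i = 0.5$, that is 2, which gives a quick sanity check on a simulated path.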

Results
Following the model descriptions and properties, we apply the high-order INAR models to analyze the new COVID-19 infection series. The results are presented in Tables 2 and 3. The results in Tables 2 and 3, and in Table 6 in S1 Appendix, were obtained using the training dataset from 18 March 2020 to 25 April 2021. It can be deduced that the Zero-Inflated Negative Binomial model (ZI-NB), given its lowest Akaike Information Criterion (AIC), outperformed the competing ZI-PT, ZI-WCG and Poisson mixture models (see the results of the Poisson mixture models in the S1 Appendix).
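The AIC-based comparison used above follows the usual rule AIC = 2k - 2 log L, where k is the number of estimated parameters and log L the maximised log-likelihood, with the smallest value preferred. The sketch below illustrates the rule only: the model labels, log-likelihoods and parameter counts are hypothetical placeholders and are not the values reported in Tables 2 and 3.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 log L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) pairs, for illustration only.
candidates = {"A": (-410.2, 14), "B": (-409.8, 15), "C": (-415.0, 14)}
scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
```

In this toy comparison model B has the highest log-likelihood, yet model A is retained because the gain in fit does not offset the extra parameter.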
Referring to the results in Tables 2 and 3, the variables 'ReR', 'SI' and 'Vaccine' were highly significant in reducing the number of infections in Mauritius, in contrast to 'CRW'.
The ReR is directly associated with the number of new active cases [57]. By observing the evolution of the series, it can be deduced that in October 2020, when there was an increase in international mobility [19] following the opening of the frontier, and in March 2021, when the second wave surfaced, an exponential increase in the number of active COVID-19 cases was reported. At this point, a worrisome 'ReR' of above 1 was being reported, indicating a high risk of getting infected. Fortunately, based on these trends in 'ReR' and in new COVID-19 active cases, the authorities triggered timely health related measures like vaccination campaigns in Mauritius and, consequently, the policies proved effective in April 2021, with a reduction in the number of new active COVID-19 cases and in the 'ReR'. At this point, 'ReR' was below 0.5.
The event of vaccination is indeed playing a vital role in curbing the number of infections. Based on the reversed estimates of 'Vaccine', it can be deduced that as the vaccination campaigns take place, this is reflected positively in the share of the Mauritian population which has already received at least one dose of the vaccines and, likewise, the risk of getting infected is expected to decrease considerably. It has largely been proven that the COVID-19 vaccines reduce the overall attack rate by rendering the human immune system more resilient. The chance of symptomatic and asymptomatic infections [58][59][60][61][62][63] and the severity of the symptoms [64,65] are considerably reduced, which in turn lowers the mortality rate related to COVID-19. More elaborate comments are provided below. Timely imposition of new sanitary measures during the peak COVID-19 phases also plays an important role in curbing the spread of the virus. In fact, the earlier the sanitary measures are imposed, the more rapidly SARS-CoV-2 is contained in the local community. Conversely, unlike European regions, Mauritius reported its highest numbers of COVID-19 cases in both warmer and colder regions, and in both summer and winter weather conditions, so 'CRW' was found to be insignificant in curbing the number of active COVID-19 cases. In fact, given the constant mutation of SARS-CoV-2 in different regions and the comparatively restricted regional disparity within Mauritius, a larger dataset on 'CRW' will possibly allow a better exploration of its association with the number of infections [66,67]. Table 3 confirms that the estimates of the over-dispersion parameters in the ZI models are significant. In addition, the death series has also been analysed using the ZI-NB model due to its lower AIC. The results are presented below.
From Table 4, using the death series from 18 March 2020 till 25 April 2021, it can be concluded that all covariates except 'CRW' are highly significant in reducing the number of deaths related to COVID-19. The most important point to note is that, in line with the results in Table 2 and as discussed in [58], the event of vaccination and the COVID-19 Stringency Index have a substantial impact on the mortality rate related to COVID-19. As a matter of fact, in March and April 2021, Mauritius registered a worrisome 8 deaths, but once the vaccine coverage widened and adherence to non-pharmaceutical interventions increased, the number of death cases dropped to zero.

Finally, we used the regression estimates for ZI-NB, as in Table 2, to conduct short term out-of-sample forecasting of the number of new COVID-19 infections in Mauritius from 26 April 2021 to 05 May 2021; the ZI-NB model had the relatively lower Root Mean Square Error (RMSE) of 1.41. The 95% confidence interval lies between 0 and 2, which means that over these 10 days, from 26 April 2021 till 05 May 2021, the number of new COVID-19 infection cases was expected, with 95% confidence, to lie between 0 and 2. Fig 5 shows the 95% confidence interval plot. An in-sample forecast for the 5 days from 21 to 25 April 2021, with 95% confidence interval, showed that with an RMSE of 4.36 the ZI-NB model is relatively the better model. Fig 6 illustrates the corresponding CI plot.

Attention is drawn to the fact that, due to some unmeasurable and unpredictable latent effects, the number of locally acquired new COVID-19 cases may not be easy to forecast, especially in the long run. In fact, in the event of a sudden shock or spike, the forecasted values are naturally disrupted, since the predictor functions in the innovation distribution may not include a new physical or latent effect. Such a situation may be circumvented by updating the list of covariates on a daily basis and also by producing the forecasts on a changepoint basis. At the same time, it is important to check the Variance Inflation Factor (VIF) of the different regressors to avoid multi-collinearity. Accordingly, in the above analysis, the factor time was omitted due to its high VIF.
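The RMSE used above to compare forecasts is the square root of the mean squared difference between observed and forecasted counts. A minimal sketch is given below; the function name and the toy numbers are hypothetical and are not the forecasts reported in the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and forecasted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical observed and forecasted daily counts, for illustration only.
observed = [0, 2, 1, 0, 3]
forecast = [1, 1, 1, 0, 1]
error = rmse(observed, forecast)
```

Because the errors are squared before averaging, a single missed spike weighs heavily on the RMSE, which is consistent with the remark above about sudden shocks disrupting the forecasts.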
We also note that, in the high-order INAR process, the specification of latent effects may not be easily handled, because the random effects have to be integrated out (refer to [1]).


Discussion
In this study, INAR-type models were applied to the daily new COVID-19 infection and death cases, with several covariates, in order to identify the significant drivers of the COVID-19 series and to provide reliable short term forecasts.

Based on the above results, the event of Vaccination ('Vaccine') and the COVID-19 Stringency Index ('SI') are found to be highly significant in mitigating the spread of SARS-CoV-2 in the local Mauritian context and hence the authorities can further work on strategies to reinforce the 'Vaccine' and 'SI' measures. As part of the COVID-19 preparedness plan, further measures are encouraged: new and reinforced COVID-19 legislation like the most recent "Restriction of Access to Specified Institutions"; upgrades in medical supplies and health equipment in terms of more personal protective equipment (PPE), high-tech protective masks like the novel ViriMASK and hospital beds for COVID-19 specialised hospitals, amongst others; dynamic contact tracing teams; and more well-equipped laboratories for COVID-19 testing exercises. Mauritius has it all, but without the support of the Mauritian population none of this is worthwhile. For a "COVID-19 free" Mauritius, the authorities concerned are expected to boost the sensitization campaigns during this second wave of the COVID-19 pandemic. Slogans like "Sel Solution Vaccination" (Only solution is vaccination) are already circulating on social media, but new motivating slogans like "Vacciner pu sauve nu pays" (Get vaccinated to save our motherland) may further sensitize the population to the need to get vaccinated and to the urgency of reviving the Mauritian economy during this gloomy economic scenario. Finally, the proposed INAR models can serve as an additional toolkit for the local authorities to better analyse and monitor the evolution of the SARS-CoV-2 series in Mauritius.
