Real-time neural network based predictor for cov19 virus spread

Since the epidemic outbreak in the early months of 2020, the spread of COVID-19 has grown rapidly in most countries and regions across the world. Because of that, the outbreak of SARS-CoV-2 was declared a Public Health Emergency of International Concern (PHEIC) on January 30, 2020, by the World Health Organization (WHO). Many scientists are therefore working on new methods to reduce the further growth of new cases and, through intelligent allocation of patients, to reduce the number of patients per doctor, which can lead to more successful treatment. However, to properly manage the COVID-19 spread there is a need for real-time prediction models that can reliably support decisions at both the national and international level. The problem in developing such a system is the lack of general knowledge about how the virus spreads and what the number of cases will be each day. Therefore a prediction model must be able to infer the situation from past data in such a way that the results show a future trend and relate as closely as possible to the real numbers. In our opinion Artificial Intelligence makes this possible. In this article we present a model which can work as part of an online system as a real-time predictor to help estimate the COVID-19 spread. The prediction model is developed using Artificial Neural Networks (ANN) to estimate the future situation using geo-location and numerical data from the past 2 weeks. The results of our model are confirmed by comparing them with real data: during our research the model correctly predicted the trend and very closely matched the number of new cases on each day.


Introduction
The situation of the world COVID-19 epidemic is very dynamic, and without prediction models we are not able to estimate how it will develop. The problem in constructing such advisory systems is the lack of knowledge and data from which to compose them, since we do not have any information about the virus spread before it starts. We can estimate it only from data covering an extremely short period of time in which the situation changes rapidly. Several methods help to make predictions in such situations, but neural networks are the most frequently reported because of their good generalization and precise prediction in uncertain environments.
In engineering, neural networks work as predictors from micro-array data [13] and of photo-voltaic system performance [14]. We can also find many other real-life applications, such as supporting infrastructure decisions related to road accidents [15] or evaluating the ecological influence of ozone concentration [16]. Neural network predictors are also useful in social modeling, for example to predict suicide deaths [17].
In medicine neural networks are also very good predictors of new cases or of disease progression. In [18] an ad-hoc predictor of arrhythmia from ambulatory electrocardiograms was modeled, which can support urgent life-saving decisions. In medical systems, data from various sensors can serve as knowledge for prediction models. Health sensor data can be used to predict kidney diseases as shown in [19], heart failure as discussed in [20], or mRNA engineering from alternative polyadenylation [21]. Deep learning structures serve to detect pulmonary changes caused by the COVID-19 virus, as discussed in [22] and [23]. It is also possible to use neural networks as remote processing controls in a cloud teleophthalmology IoT system, as proposed in [24]. Some models also give explainable premises of disease symptoms, as in the decision support model [25] or for inflammation in Crohn's disease [26]. A wide survey on positive aspects of using machine learning techniques to predict medical symptoms of diabetes was presented in [27]. Artificial Intelligence also serves as a death or survival predictor. In many cases neural networks were reported to succeed in estimating the situation of patients or the future spread of diseases which lead to death. In [28] a model was developed for predicting in-hospital mortality of patients with kidney injury, while in [29] a model for breast cancer survival was developed. In [30] neural networks were described that predict clinical events in the Intensive Care Unit. Neural networks can also predict the future spread of respiratory disease. In [31] it was discussed how Artificial Intelligence can be used to predict infection from inhalation of Pseudomonas aeruginosa in the Intensive Care Unit. Recently many works have been oriented on computation of the COVID-19 virus spread at both regional and world scale. A summary of recent approaches is presented in Table 1.
From the table we can learn that various models have been used to predict the spread in local and global aspects. However, each of these analyses uses mathematical solutions based on calculations. Our proposed approach is oriented on machine learning (especially on a devoted neural network architecture), and because of this the proposed system is able to predict the cov19 spread trend for various countries and regions in any place on the globe, which is a big advantage of our proposal.

Table 1. Summary of recent approaches.
Authors | Approach | Region
Pandey et al. [6] | analysis of registered and potential cases and regression model | India
Ivanov et al. [7] | impact of cov19 on economy by using anyLogistix simulation and optimization software | global
Gatto et al. [8] | cov19 spread by using metacommunity Susceptible-Exposed-Infected-Recovered (SEIR) transmission model | Italy
Petropoulos et al. [9] | cumulative data analysis | global
Anastassopoulou et al. [10] | cov19 spread by using Susceptible-Infectious-Recovered-Dead (SIDR) model | Hubei region, China
Bertozzi et al. [11] | data driven approach to prediction by analyzing number of cases | USA
In this paper we present an idea for a real-time prediction model based on neural networks. The system uses simple data available from governmental services to estimate the situation worldwide. The lack of general knowledge about how the virus spreads, and about the main reasons behind the spread, makes it difficult to predict the number of cases each day. Therefore a prediction model must be able to infer the situation from a small amount of past data in such a way that the results show a reliable trend and relate as closely as possible to the real number of cases. The proposed prediction model uses neural networks to do this. First the data is normalized to help find relations between neighboring countries or regions. Then an input vector is presented to the nested neural network model, in which we have composed devoted architectures to predict the situation. The first neural network works on a 12-day time window to predict the situation in the world, while the second works on an 8-day time window to predict the situation in regions of a country. The proposed model works well in both cases: it gives a helpful estimation of the trend which can be used for planning, and it also closely approaches the real number of cases each day.

Neural prediction model
Let us now present the composition of the developed prediction model and its training from the input data.
We use the dataset obtained from the Johns Hopkins University Center for Systems Science and Engineering, available in the online GitHub repository at https://github.com/datasets/covid-19. The data can also be taken from the governmental services of each country. The dataset file is first pre-processed. Since we want the input data used by our neural network to be consistent, we divide the data file into sections using Algorithm 1 for data pre-processing.
• The first section is the location name. It is not used in the training phase, but it is used in the final step to create the results table.
• The second section is the geographical location given as longitude and latitude. It is used in the model training to specify which region has the biggest danger factor based on the neighboring countries and regions. The geographical position is very useful for the system to predict how the number of confirmed cases may grow in countries or regions in relation to the surrounding ones.
• The third and final section contains the total confirmed cases for the specified locations over the latest 12 days (for the world model) or 8 days (for the region model). Using such data the network can predict the approximate trend, which specifies how fast the virus will spread, by showing the predicted number of cases in each location.
The data is then normalized before it is forwarded to the neural network. All numerical values are divided by the maximum value to bring them into the range from 0 to 1. This helps to train the network and allows us to use non-linear activation functions such as the hyperbolic tangent. A schematic representation of the proposed prediction model is presented in Fig 1, which shows each stage of data processing from extraction to prediction.
Algorithm 1 Data pre-processing algorithm
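The division into three sections described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' Algorithm 1: the record layout (name, latitude, longitude, then cumulative counts ordered from oldest to newest) and the function name split_record are hypothetical, and the numbers are made up.

```python
from typing import List, Tuple

def split_record(record: list, window: int) -> Tuple[str, Tuple[float, float], List[float]]:
    """Split one raw record into (name, (lat, lon), latest `window` counts).

    Assumed (hypothetical) layout:
    [name, latitude, longitude, count_day_1, ..., count_day_k]
    """
    name = str(record[0])                      # section 1: used only for the results table
    lat, lon = float(record[1]), float(record[2])  # section 2: geographical location
    counts = [float(c) for c in record[3:]]    # section 3: total confirmed cases
    if len(counts) < window:
        raise ValueError(f"{name}: need at least {window} days, got {len(counts)}")
    return name, (lat, lon), counts[-window:]

# Made-up record for illustration only (14 days of cumulative counts).
row = ["Exampleland", 52.0, 19.0, 1, 2, 4, 7, 11, 16, 22, 29, 37, 46, 56, 67, 79, 92]
name, geo, cases_world = split_record(row, window=12)   # world model: 12 days
_, _, cases_region = split_record(row, window=8)        # region model: 8 days
```

The same function serves both models because only the window length differs between them.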

Neural network
In the proposed predictor two separate neural networks cooperate on the input data processing to predict the growth of numbers and the trend. The first architecture is bigger, since it is developed for the prediction for countries. In Fig 2 we can see how this architecture is composed. This neural network accepts the data from the last 12 days along with the geolocation given as latitude and longitude. To be more specific, the day N prediction for the global model is based on days N-5 to N-17, and the day N prediction for the regional model is based on days N-5 to N-13. This information is processed in three hidden layers of the first type and two hidden layers of the second type. The number of neurons in each of them was selected empirically after tests on the datasets used in our research. In Fig 3 we can see the architecture of the neural network composed for regional prediction. This architecture is smaller, as we can see in the number of neurons in the first type layers. In both cases the neural network is implemented with two types of activation functions: hyperbolic tangent (for the first type layers) and ReLU (for the second type layers).
Fig 1. First we normalize the input data to make the model more accurate for each of them. Afterwards, the data is forwarded to two neural network architectures: one developed for the World prediction model, which is bigger due to the wider spectrum, and a second developed for the country regions prediction model, which is smaller. https://doi.org/10.1371/journal.pone.0243189.g001
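To make the layer arrangement concrete, the sketch below builds a forward pass with three hyperbolic tangent layers followed by two ReLU layers, using only the Python standard library. The layer widths, the random initialization, and the choice that the last ReLU layer has a single neuron acting as the output are assumptions made for illustration; the text states only that the neuron counts were selected empirically.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]
Layer = Tuple[Matrix, Callable[[float], float]]

def relu(z: float) -> float:
    return max(0.0, z)

def init_layer(n_in: int, n_out: int, rng: random.Random) -> Matrix:
    """Random weights; the last column of each row holds the bias."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in + 1)] for _ in range(n_out)]

def dense(x: Sequence[float], weights: Matrix, activation) -> List[float]:
    """Fully connected layer: activation(W x + b)."""
    return [activation(sum(w * v for w, v in zip(row[:-1], x)) + row[-1]) for row in weights]

def build_network(n_inputs: int, tanh_widths, relu_widths, seed: int = 0) -> List[Layer]:
    """First-type (tanh) layers followed by second-type (ReLU) layers."""
    rng = random.Random(seed)
    layers, n_in = [], n_inputs
    for width, act in [(w, math.tanh) for w in tanh_widths] + [(w, relu) for w in relu_widths]:
        layers.append((init_layer(n_in, width, rng), act))
        n_in = width
    return layers

def forward(layers: List[Layer], x: Sequence[float]) -> List[float]:
    for weights, activation in layers:
        x = dense(x, weights, activation)
    return list(x)

# Inputs: latitude, longitude, then the window of normalized case counts.
# Widths below are hypothetical placeholders, not the values from Figs 2 and 3.
world_net = build_network(n_inputs=2 + 12, tanh_widths=[32, 32, 32], relu_widths=[16, 1])
region_net = build_network(n_inputs=2 + 8, tanh_widths=[16, 16, 16], relu_widths=[8, 1])

sample = [0.5, 0.5] + [0.1 * i / 12 for i in range(12)]
prediction = forward(world_net, sample)
```

The weights here are untrained, so the output only demonstrates the data flow; a trained model would replace the random initialization.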

System training
For the training of the neural networks we have used the adaptive moment estimation algorithm called Adam [32]. It is widely used in neural network research because it is fast and computationally undemanding. Adam is based on the first and second moments of the gradients. To introduce the algorithm, it is necessary to provide the basic formulas for the mean and the variance. These coefficients in iteration t are based on the values from the previous iteration, marked as t − 1. The Adam formulas, a combination of first-order momentum and RMSprop, can be described as follows:
$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t,$$
$$v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2,$$
where $\beta_1$ and $\beta_2$ are constant values called hyper-parameters and $g_t$ is the current gradient of the error function used for the neural network training. The values $m_t$ and $v_t$ are used to calculate the corrections marked as $\hat{m}_t$ and $\hat{v}_t$ according to
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}.$$
Using the corrected mean and variance calculated above, the final formula for changing the weights in our neural network can be defined as a change between the current weight $w_t$ and the calculated corrections:
$$w_{t+1} = w_t - \eta\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon},$$
where $\eta$ is the learning rate and $\epsilon$ is a small constant. The whole procedure is presented in Algorithm 2.
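The Adam update can be illustrated with a short standard-library sketch. It follows the standard formulation of Adam (moment estimates, bias correction, weight update) and is not the authors' Algorithm 2; the function name adam_minimize, the default hyper-parameter values, and the quadratic test function are assumptions chosen for the example.

```python
import math
from typing import Callable, List, Sequence

def adam_minimize(grad: Callable[[Sequence[float]], Sequence[float]],
                  w0: Sequence[float],
                  lr: float = 0.001,
                  beta1: float = 0.9,
                  beta2: float = 0.999,
                  eps: float = 1e-8,
                  steps: int = 1000) -> List[float]:
    """Minimize a function given its gradient, using the Adam update rule."""
    w = list(w0)
    m = [0.0] * len(w)   # first moment estimate (mean of gradients)
    v = [0.0] * len(w)   # second moment estimate (uncentered variance)
    for t in range(1, steps + 1):
        g = grad(w)
        for i in range(len(w)):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] ** 2
            m_hat = m[i] / (1.0 - beta1 ** t)   # bias-corrected mean
            v_hat = v[i] / (1.0 - beta2 ** t)   # bias-corrected variance
            w[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Illustrative error function f(w) = (w0 - 3)^2 + (w1 + 1)^2 with its gradient.
def grad_f(w: Sequence[float]) -> List[float]:
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]

w_opt = adam_minimize(grad_f, [0.0, 0.0], lr=0.01, steps=5000)
w_one = adam_minimize(grad_f, [0.0, 0.0], lr=0.01, steps=1)
```

After the first iteration each weight moves by roughly the learning rate in the direction opposite to its gradient, because the bias correction cancels the initial zero moments.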

Results
Let us now discuss the results and efficiency of the proposed prediction model.

Finding the best data amount for our predictors
In our research we searched for the number of days which should be given to the neural network for the best prediction result. Because we started our research at a very early stage of the global epidemic, the dataset was very small and contained a very limited number of days.
Thus we wanted to reduce the number of days needed to predict future values to the minimum. We therefore ran experiments whose results are shown in Fig 4. As we can see, using fewer days on the input resulted in lower stability of the prediction, although the main trend curve was correct. Going over 12 days, however, did not change much, so we decided that 12 days would be the best choice for the global predictor. The same behavior was observed for the regions predictor; however, because it works on a much smaller scale, fewer days were needed and the best value was around 8 days.

Data normalization
Because our network is retrained each day to get the most accurate predictions, we normalize the data in the following way:
• First, we divide the data into 2 sections: geo-localization values and total cases values.
• Next, we divide all total cases values by the maximum value from the dataset to get input values normalized from 0 to 1.
• Finally, we normalize the geo-localization data separately using the min-max algorithm, where min and max are the minimum and maximum latitude and longitude values.
Because we have used ReLU activation functions, our generated predictions can exceed the maximum value from the training dataset. After retraining, the newly generated predictions are therefore "denormalized" by multiplying all values by the maximum value of total cases used as the divisor during normalization.
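The normalization and denormalization steps can be sketched as below. The function names and the sample numbers are hypothetical; the sketch assumes the case counts are held as one list per location.

```python
from typing import List, Sequence, Tuple

def normalize_cases(cases_by_location: Sequence[Sequence[float]]) -> Tuple[List[List[float]], float]:
    """Divide every total-cases value by the dataset maximum (range 0 to 1)."""
    max_cases = max(max(series) for series in cases_by_location)
    if max_cases <= 0:
        raise ValueError("dataset maximum must be positive")
    scaled = [[c / max_cases for c in series] for series in cases_by_location]
    return scaled, max_cases

def min_max(values: Sequence[float]) -> List[float]:
    """Min-max scaling, applied separately to latitudes and to longitudes."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def denormalize(predictions: Sequence[float], max_cases: float) -> List[float]:
    """Map network outputs back to case counts; outputs above 1 are allowed."""
    return [p * max_cases for p in predictions]

# Made-up values for illustration only.
cases = [[10, 20, 40], [5, 50, 100]]
scaled_cases, max_cases = normalize_cases(cases)
scaled_lat = min_max([-30.0, 0.0, 60.0])
restored = denormalize([0.4, 1.2], max_cases)   # 1.2 exceeds the training maximum
```

Keeping max_cases alongside the scaled data is what makes the later multiplication possible after each retraining.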

Test/train division
There are hundreds of variables defining the speed of COVID-19 spread, and they vary from country to country. To correctly predict values for all countries we therefore could not divide our dataset into test and training parts in the standard way, where some countries are moved to the test set and the rest to the training set. This would lead to more errors in the final results, and the network would not fit the curves of individual countries well, although it should still perform well on the sum of total cases over all countries.
Therefore we had to find another solution to test our model's performance. To do this we divided our data into test and training parts not by countries but by days. Results from different periods of training are shown in Fig 4. We trained our model on data from 8 days before the newest date and older, and tested it on the newest data. Because of that our network could correctly adapt to every country separately, and we could test whether it generates the trend curve successfully. The only possible drawback of this approach is that our predictor network may slip a little at the beginning of the epidemic period in some countries, but after a few days it adapts and gives valid results.
In our experiment we tried to achieve the lowest possible value of the Mean Squared Error function. We ran experiments to see what lowest error value our proposed neural network can achieve, aiming for an error at the level of about 0.01. At 4000 iterations the network reached the assumed level of the loss function; afterwards the loss continued to decrease, but without any significant changes. Therefore 4000 was experimentally chosen as the golden mean for the number of training iterations for our model.
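The day-based division described above can be sketched for a single location as follows. The function name split_by_days and the sample series are hypothetical, and the sketch assumes the series is ordered from oldest to newest; the 8-day holdout follows the description in the text.

```python
from typing import List, Sequence, Tuple

def split_by_days(series: Sequence[float], holdout_days: int = 8) -> Tuple[List[float], List[float]]:
    """Train on everything older than the last `holdout_days`, test on the newest days."""
    if len(series) <= holdout_days:
        raise ValueError("series too short for the requested holdout")
    return list(series[:-holdout_days]), list(series[-holdout_days:])

def mean_squared_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean Squared Error, the loss function named in the text."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Made-up cumulative series of 20 days for one country.
series = list(range(20))
train, test = split_by_days(series, holdout_days=8)
mse_example = mean_squared_error([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
```

Applying the split to every location keeps each country in both the training and the test part, which is the point of dividing by days rather than by countries.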

Classic statistical approach
A statistical approach has been used to model prediction lines which were compared to our proposed neural network model. For comparison we have used classical measures: Simple Moving Average, Trend Line and Exponential Moving Average. These measures are influenced by many factors, such as the number of tests carried out in a given country or region, restrictions imposed by the government, restrictions imposed by state authorities, and finally the behavior of the population in a given area. Nevertheless, classical statistical measures are frequently used to make predictions. All applied statistical measures are presented in charts for sample countries and regions in Figs 7 and 8.
Simple Moving Average (SMA) is modeled by the equation
$$SMA = \frac{1}{n} \sum_{i=1}^{n} c_i,$$
where $n$ is the number of values taken into account and $c_i$ is the i-th value of the considered set. Exponential Moving Average (EMA) is modeled by the equation
$$EMA_i = \alpha\, Y_i + (1 - \alpha)\, EMA_{i-1},$$
where $EMA_0 = Y_0$, $Y_i$ is the i-th value of the considered set, and $\alpha = \frac{2}{n - 1}$, where $n$ is the number of values of the considered set.
Trend Line (TL) was built by using the equations
$$Y = aX + b, \qquad a = \frac{\sum_{i} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i} (X_i - \bar{X})^2}, \qquad b = \bar{Y} - a\bar{X},$$
where $X_i$ and $\bar{X}$ are the i-th value and the arithmetic mean of the considered set X, $Y_i$ and $\bar{Y}$ are the i-th value and the arithmetic mean of the set Y, $a$ is the slope of the straight line, and $b$ is the intercept.
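The three classical measures can be computed with a few lines of standard-library Python. This is a sketch using the textbook definitions; the function names are hypothetical, and the smoothing factor alpha is passed in explicitly so that the sketch does not commit to a particular formula for it.

```python
from typing import List, Sequence, Tuple

def sma(values: Sequence[float], n: int) -> float:
    """Simple Moving Average over the last n values."""
    if n <= 0 or n > len(values):
        raise ValueError("n must be between 1 and len(values)")
    return sum(values[-n:]) / n

def ema(values: Sequence[float], alpha: float) -> List[float]:
    """Exponential Moving Average with EMA_0 = Y_0."""
    result = [float(values[0])]
    for y in values[1:]:
        result.append(alpha * y + (1.0 - alpha) * result[-1])
    return result

def trend_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line Y = a X + b; returns (slope a, intercept b)."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    denom = sum((x - x_mean) ** 2 for x in xs)
    if denom == 0:
        raise ValueError("all X values are identical")
    a = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / denom
    return a, y_mean - a * x_mean

# Made-up values for illustration only.
sma_value = sma([1, 2, 3, 4], n=2)
ema_values = ema([0, 2], alpha=0.5)
slope, intercept = trend_line([0, 1, 2, 3], [1, 3, 5, 7])
```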

Numerical results
In Fig 5 we can see the predictions of our model for countries of the world. The spots represent predicted trends of cases for world countries. Red indicates an increase, while green indicates a decrease of cases. The bigger the spot, the higher the growth of cases our system predicts. Fig 6 presents the trend in the total number of cases. The results shown in the image were predicted using data from past weeks and show the prediction for the next week. Our model was verified on past days: since the predictions were correct and the trend of COVID-19 was correctly predicted by our neural network model, we confirmed the efficiency of the solution. The presented results show that the applied metrics may differ considerably in prediction when compared to official data. Our proposed neural network predictor can address this problem and reliably predict future trends.
In Figs 9 and 10 we can see some example prediction charts of our model compared to real numbers provided by governmental services worldwide. When we analyze these results we can see that the model presented in Fig 2 works well. Predicted lines closely meet the real numbers for various countries. A similar situation is visible for the model presented in Fig 3. In Table 2 we can see the final accuracy of our prediction model for countries, and in Table 3 for some regions in the USA, Australia, Canada and China.
In our research we have used a measure of error margin to evaluate the accuracy of our system. Predicted cases count is categorized as matching: where a is match. Because of that our margin is relative to the cases count, which allows us to better validate the overall accuracy of our proposed system. Tables 2 and 3 present the final level of accuracy achieved. As we can see, the result depends on the level of error margin accepted for the research experiment. The higher the margin, the more accurate the system is. The daily changes of accuracy of our prediction model for selected countries and regions in the world show that the highest decrease in accuracy of the world model occurred when COVID-19 was spreading among continents and authorities were introducing periodic lockdowns. However, our model regained accuracy very quickly and adjusted to the rapidly changing situation.
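A margin-based accuracy of this kind can be sketched as follows. The matching criterion used here, that a prediction matches when its absolute error does not exceed the margin multiplied by the real count, is an assumption made for illustration, since the exact formula is not reproduced in the text; the function name and the sample numbers are also hypothetical.

```python
from typing import Sequence

def accuracy_within_margin(predicted: Sequence[float],
                           actual: Sequence[float],
                           margin: float) -> float:
    """Fraction of predictions within a relative error margin of the real count.

    Assumed criterion (not taken from the paper): |p - r| <= margin * r.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for p, r in zip(predicted, actual) if abs(p - r) <= margin * r)
    return matches / len(actual)

# Made-up numbers for illustration only.
predicted = [95.0, 120.0, 1000.0]
actual = [100.0, 100.0, 1000.0]
acc_10 = accuracy_within_margin(predicted, actual, margin=0.10)
acc_25 = accuracy_within_margin(predicted, actual, margin=0.25)
```

With these sample values a wider margin accepts more predictions as matching, which mirrors the observation that accuracy rises with the accepted margin.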

Discussion
The results of our prediction model show many strong points of the proposed approach. The structure of the developed system gives several advantages. The two proposed types of neural architectures are devoted to prediction from small COVID-19 datasets, so that the results for both countries and regions correspond well to the real numbers. The two applied types of activation functions gave the neural network the ability to fit closely to the normalized data from various locations. The architecture is trained by the Adam algorithm, so that the error rate is low. The situation is changing fast, so we can also see changes in the predicted numbers reflected in our system accuracy. On the other hand, the system regains efficiency very quickly and learns the new data with good precision. The statistics of the system show that with new incoming data the proposed model becomes better trained to predict the situation in each country or region.

PLOS ONE
The proposed neural network predictor has important advantages over other approaches. The first of them is good adjustment to new information. Once the neural network is trained, it easily adapts to new data and gives correct predictions. It is not necessary to analytically model the trend line, which is the key factor for purely mathematical predictors. On the other hand, to train a neural network well, a reasonable amount of input data is necessary to achieve good accuracy. However, as we have shown in our model, we can compensate for this through the architecture of the neural network and data preprocessing. Due to the nature of the incoming information in this case, we have developed two architectures, for countries and for regions, each of which is fed with a different number of values from previous days. Our results are interpreted as the trend of the situation (growing or decreasing) and the potential number of new cases. For both of them our neural network predictor works well.

Conclusions and future works
We have presented a system developed for the prediction of COVID-19 spread. The applied neural network architectures give good results and help to show the growth trend for countries and regions. The situation of any disease or epidemic changes very fast, so it is hard for any mathematical model to fit the real numbers perfectly. On the other hand, prediction models based on Artificial Intelligence offer the flexibility to match new incoming data in a changing situation while maintaining prediction accuracy. Therefore such models can be used to reliably estimate changes in uncertain environments by using predicted trends of growth. The hardest thing in the case of COVID-19 prediction is that the project started with a very limited amount of information about cases, and additionally the situation was changing rapidly from day to day. In such conditions it is very hard to model and train a neural network to achieve reliable results. Despite such conditions our model proved to work well. The applied data selection on the input, the normalization and the proposed division made it possible for our proposed architectures to train well for predictions in countries and regions.
Our future work will be oriented on introducing a technique for automatic adjustment of the neural network to newly incoming data. We think that transfer learning may help in that case, i.e. when similar regions or countries have similar values of recorded numbers, so that transfer learning procedures may help in the development of the prediction system. Such a procedure may be especially important when only a small amount of information is available at the beginning of a project.

Acknowledgments
Authors would like to acknowledge contribution to this project from the Rector of the Silesian University of Technology under program "Initiative of Excellence-Research University" grant no. 08/IDUB/2019/84.
We also express our thanks to the anonymous reviewers, whose comments and suggestions have contributed significantly to the quality of our manuscript.