On mobility trends analysis of COVID–19 dissemination in Mexico City

  • Kernel Prieto,

    Contributed equally to this work with: Kernel Prieto, M. Victoria Chávez–Hernández, Jhoana P. Romero–Leiton

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Writing – original draft, Writing – review & editing

    kernel@ciencias.unam.mx

    Affiliation Instituto de Matemáticas, Universidad Nacional Autónoma de México, Mexico, México

  • M. Victoria Chávez–Hernández,

    Contributed equally to this work with: Kernel Prieto, M. Victoria Chávez–Hernández, Jhoana P. Romero–Leiton

    Roles Conceptualization, Formal analysis, Methodology, Software

    Affiliation Facultad de Ingeniería Mecánica y Eléctrica, Universidad Autónoma de Nuevo León, San Nicolás de los Garza, Mexico, México

  • Jhoana P. Romero–Leiton

    Contributed equally to this work with: Kernel Prieto, M. Victoria Chávez–Hernández, Jhoana P. Romero–Leiton

    Roles Investigation, Methodology, Writing – review & editing

    Affiliation Facultad de Ingeniería, Universidad Cesmag, Pasto, Colombia

Abstract

This work presents a tool for forecasting the spread of the new coronavirus in Mexico City. The tool is based on a mathematical model with a metapopulation structure that uses Bayesian statistics and is inspired by a data-driven approach. The daily mobility of people in Mexico City is mathematically represented by an origin-destination matrix built from the open mobility data from Google and the Transportation Mexican Survey. This matrix is incorporated into a compartmental model. We calibrate the model against borough-level incidence data collected between 27 February 2020 and 27 October 2020, using Bayesian inference to estimate critical epidemiological characteristics associated with the coronavirus spread. Because metapopulation models are computationally expensive, and parameter estimation for these models may consume a large amount of RAM, we perform a clustering analysis based on mobility trends so that clusters of boroughs can be handled separately instead of taking all of the boroughs together at once. This clustering analysis can be implemented at smaller or larger scales in different parts of the world. In addition, the clustering analysis is divided into the phases that the government of Mexico City has set up to restrict individual movement in the city. We also calculate the reproductive number in Mexico City using the next generation operator method and the inferred model parameters, obtaining that this threshold lies in the interval (1.2713, 1.3054). Our analysis of mobility trends can be helpful when making public health decisions.

Introduction

The coronavirus disease 2019 (COVID-19) is caused by a novel coronavirus. The coronaviruses are a family of viruses that cause infection in humans and animals. The diseases that are caused by a coronavirus are zoonotic [1]. In particular, the coronaviruses that affect humans (HCoV), such as the Severe Acute Respiratory Syndrome (SARS) virus and the Middle East Respiratory Syndrome virus (MERS-CoV), can produce clinical symptoms [2]. COVID-19 was first identified amid an outbreak of respiratory illness cases in Wuhan City, Hubei Province, China. This disease was initially reported to the WHO on 31 December 2019. On 11 March 2020, the WHO declared COVID-19 to be a global pandemic [3]. From the beginning of the epidemic to 21 January 2021, 97,890,676 cases and 2,094,459 deaths had been reported globally.

The first case of COVID-19 in South America was registered in Brazil on 26 February 2020. The first death from this infection in this region was announced in Argentina on 7 March 2020. The virus then arrived in Mexico, where by 21 January 2021 there had been 1,688,944 confirmed cases and 144,371 deaths.

To date, many researchers around the world have focused on understanding the transmission dynamics of COVID-19 using mathematical and statistical models and methods; see, for example, [4–12]. In this work, we focus on those models that incorporate information on human movement. The relationship between human mobility and the transmission of coronavirus disease in the United States has been studied in [13, 14]. Metapopulation models are not only among the simplest spatial models but are also the most applicable to modelling many human diseases [15]. The metapopulation concept is to subdivide the entire population into distinct sub-populations, each of which has independent dynamics together with limited interaction between the sub-populations. This approach has been used to great effect within the ecological literature [16] and it has recently been used to model the spread of COVID-19; see, for example, [17–21].

In this work, we calibrate the metapopulation model proposed by Li et al. in [21], similar to [22], using incidence data reported in [23]. We first describe the mathematical model that we have used; then, we compute the number of trips that are produced and attracted in each borough of Mexico City using data about these trips in 2017 [24], which we then combine with the rates of reduction or increase in mobility during the pandemic reported by Google [25] and the government of Mexico City [26]. Later, by using Bayesian inference, we solve the associated inverse problem to predict the dynamics of the spread of cases, similar to [27–33]. Our conclusions are presented in the last section.

Computation of the mobility matrices

To incorporate mobility in the transmission model, the produced and attracted trips in the boroughs of Mexico City are considered (see Fig 1 and Table 1). Mexico City is the capital of Mexico. It has around 9 million inhabitants and a floating population of over 22 million, composed of daily commuters and international visitors. Mexico City is among the top 10 most crowded cities in the world [34]. It also has a large number of corporate headquarters and a large transport network, which is composed of 20 different modes of transport.

Mobility between the zones in Table 1 is represented in a two-dimensional array, known as the origin-destination matrix (O-D matrix) M = {Mij}, i, j = 1, …, 16, where Mij represents the number of trips from zone i to zone j. Origin-destination matrices are usually obtained every 10 years from surveys. In Mexico City, the last survey was carried out in 2017 [24]. The available information identifies, among other things, whether the trip was made on a weekday or during the weekend, the transport mode used, the purpose and the time. It is important to note that, due to the complexity of mobility in Mexico City, the O-D matrix does not have to be symmetric, nor does the sum of row i have to equal the sum of column i. This is because the O-D matrix captures chains of trips that may begin one day and end within the next few days: for instance, trips of people who leave home to go to zone i to work and then go to zone j to see a movie before returning home, or people who work in zone i for a few consecutive days (see [35]). In this paper, we consider all of the trips, identified by the mode of transport that was used in the area of interest (see Table 2).
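
The row and column conventions of an O-D matrix can be illustrated with a minimal Python sketch. The 3-zone matrix and the function names below are hypothetical and chosen only for illustration; they are not values from the 2017 survey.

```python
def produced_trips(od):
    """Trips generated by each zone: the sum of row i."""
    return [sum(row) for row in od]


def attracted_trips(od):
    """Trips attracted by each zone: the sum of column j."""
    return [sum(col) for col in zip(*od)]


def is_symmetric(od):
    """True if od[i][j] == od[j][i] for every pair of zones."""
    n = len(od)
    return all(od[i][j] == od[j][i] for i in range(n) for j in range(n))


# Hypothetical 3-zone O-D matrix: od_example[i][j] = trips from zone i to zone j.
od_example = [
    [500, 120, 30],
    [100, 800, 60],
    [45, 70, 300],
]

produced = produced_trips(od_example)
attracted = attracted_trips(od_example)
print(produced, attracted, is_symmetric(od_example))
```

In this toy matrix the row sum and the column sum of a given zone differ, which mirrors the remark that produced and attracted trips need not coincide.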

There are several methodologies in the literature for updating the O-D matrix. Most of them combine known information with currently observed data, such as the number of trips in some segments of the transit network [36]. There are also some approaches that project the trips to/from each zone based on the projected economic growth in those areas [37]. Nevertheless, given the current pandemic situation and the availability of current data about the increase or decrease in mobility for some modes of transport, transit stations and parking lots, we take the 2017 O-D matrix as a reference matrix and update it to a 2020 scenario using the daily mobility reports provided by Google [25] and the government of Mexico City [26]. According to [24], Tables 3 to 6 represent the number of trips between these zones; for instance, the mean number of trips whose origin is Coyoacán (id = 2) and destination is Iztapalapa (id = 6) during a weekday is 228,272 (see Table 3), and the mean number of trips whose origin is Tláhuac (id = 10) and destination is Cuauhtémoc (id = 14) during a weekend day is 21,881 (see Table 6).

Table 3. Mean number of trips from zones 1-16 to zones 1-9 during a week day.

https://doi.org/10.1371/journal.pone.0263367.t003

Table 4. Mean number of trips from zones 1-16 to zones 10-16 during a week day.

https://doi.org/10.1371/journal.pone.0263367.t004

Table 5. Mean number of trips from zones 1-16 to zones 1-9 during a day of the weekend.

https://doi.org/10.1371/journal.pone.0263367.t005

Table 6. Mean number of trips from zones 1-16 to zones 10-16 during a day of the weekend.

https://doi.org/10.1371/journal.pone.0263367.t006

In Tables 3–6, the last row represents the total number of trips attracted by each zone, and the last column in Tables 4 and 6 represents the total number of trips generated by each zone. Thus, we can see in Table 4 that during a weekday 849,911 trips are attracted to zone 10 and 540,274 trips are generated from zone 7.

To compute the new number of trips made using the subway, RTP/M1, trolleybus, light rail or suburban, we used the corresponding rates given by the government of Mexico City. To compute the number of trips made using collective/micro or buses, we used the rates of transit stations given by Google and for personal transportation we used the workplaces rate. To compute the number of trips made by bicycle, we used the rates for Ecobici and for metrobus/mexibus we used an average of both the rates of metrobus and mexibus in Mexico City. The other transport modes remain the same as in 2017.
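
The mode-by-mode update described above can be sketched as follows. This is a minimal illustration, assuming that each reported rate is a fractional change relative to the 2017 baseline (for example, -0.5 for 50% fewer trips) and that the reference trips are available as one matrix per transport mode; the mode names, numbers and the function `update_od_by_mode` are hypothetical.

```python
def update_od_by_mode(od_by_mode, rate_by_mode):
    """Scale each mode-specific O-D matrix by (1 + rate) and add the modes up.

    Modes without a reported rate keep their reference (2017) values.
    """
    first = next(iter(od_by_mode.values()))
    n = len(first)
    total = [[0.0] * n for _ in range(n)]
    for mode, od in od_by_mode.items():
        factor = 1.0 + rate_by_mode.get(mode, 0.0)
        for i in range(n):
            for j in range(n):
                total[i][j] += factor * od[i][j]
    return total


# Hypothetical reference trips for two zones and two modes.
reference = {
    "subway": [[100, 20], [30, 200]],
    "taxi": [[10, 5], [5, 10]],
}
# Assumed fractional change for the subway only; taxi has no reported rate.
rates = {"subway": -0.5}

updated = update_od_by_mode(reference, rates)
print(updated)
```

Keeping one matrix per mode makes it straightforward to apply a different daily rate to each mode before summing them into a single daily O-D matrix.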

Fig 2 shows the variations in the mobility indices from 27 February to 30 November, for each mode of transport that was modified.

Fig 2. Variations in the mobility indices from 27 February, 2020 to 30 November, 2020.

https://doi.org/10.1371/journal.pone.0263367.g002

The population of each borough is also considered in our mathematical model. According to [38], these populations in 2020 are given in Table 7.

Table 7. Population in 2020 for each borough of Mexico City.

https://doi.org/10.1371/journal.pone.0263367.t007

Note that both the mobility rates from Google and the government of Mexico City and the reference O-D matrix (the one from 2017) are given per day. Hence, the population Ni can be considered constant only within a single day. Because this work studies a phenomenon that lasts for months, Ni is treated as a variable over the entire estimation period.

Furthermore, although in all boroughs the largest number of trips are carried out within the same borough, the distribution of trips to/from the other boroughs does not follow the same pattern in all cases. For example, the Coyoacán borough attracts the highest number of trips from the Tlalpan borough and the fewest trips from Cuajimalpa; meanwhile, the borough of Iztapalapa attracts the highest number of trips from the Cuauhtémoc borough and the fewest trips from La Magdalena Contreras. This phenomenon can be explained by the prevailing economic activity in each borough and the transportation connectivity with the others.

In order to obtain a stable algorithm that correctly models the migration of people, we use the Fratar method to balance all of the origin-destination matrices [39].
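
As an illustration of what balancing an O-D matrix involves, the sketch below uses a simplified iterative proportional fitting scheme (often called Furness-style balancing), which is closely related to growth-factor methods such as Fratar but is not necessarily the exact variant used by the authors. The target totals, the matrix and the function name `balance_od` are hypothetical.

```python
def balance_od(od, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Alternately rescale rows and columns until both sets of totals match.

    Assumes sum(row_targets) == sum(col_targets) and strictly positive entries.
    """
    n, m = len(od), len(od[0])
    t = [[float(v) for v in row] for row in od]
    for _ in range(max_iter):
        # Rescale every row so that its sum equals the row target.
        for i in range(n):
            s = sum(t[i])
            if s > 0:
                t[i] = [v * row_targets[i] / s for v in t[i]]
        # Rescale every column so that its sum equals the column target.
        for j in range(m):
            s = sum(t[i][j] for i in range(n))
            if s > 0:
                for i in range(n):
                    t[i][j] *= col_targets[j] / s
        # Columns are exact after the last step, so only rows need checking.
        row_err = max(abs(sum(t[i]) - row_targets[i]) for i in range(n))
        if row_err < tol:
            break
    return t


# Hypothetical 2-zone example with assumed target totals.
seed_matrix = [[50, 30], [20, 100]]
balanced = balance_od(seed_matrix, row_targets=[100, 100], col_targets=[80, 120])
print(balanced)
```

The balanced matrix keeps the relative pattern of the seed matrix while matching the prescribed produced and attracted totals.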

Clusters

In this section, we describe the clustering analysis that we implemented for Mexico City based on mobility data. This analysis is presented not only to look for possible socioeconomic relations between boroughs, but also because creating clusters avoids some computational challenges. Firstly, solving model (1) for the whole of Mexico City uses around 50 GB of RAM during the compilation process with the Stan package. Secondly, the computation time for estimating the parameters is around 3 days. We used a computer with Ubuntu 20.04, 64 GB of RAM and 12 cores. Therefore, in this section we propose a way to avoid both of these computational challenges: solving model (1) for each cluster separately would reduce both the amount of RAM used and the computation time. Moreover, the clusters could be processed simultaneously using, for example, the t-walk package, since t-walk only uses one core per run.

During the pandemic, the Mexican government scheduled four phases depending on the level of contagion risk. These phases correspond to the following periods: phase 1, from 27 February to 22 March; phase 2, from 23 March to 19 April; phase 3, from 20 April to 28 June; and phase 4, from 29 June to 27 October. The mobility network was analysed using the Louvain community detection module of the igraph R package [40]. For more details about the igraph R package, see [41]. Using the Louvain community detection algorithm, we identify that Mexico City’s network has a modular structure with three communities, as shown in Figs 3 and 4. From Figs 3 and 4, we observe that the communities in the first and fourth phases of the pandemic are the same, and that the communities in the second and third phases are the same. In the first phase of the pandemic, community 1 is composed of boroughs 1, 4, 14, 15 and 16; community 2 is composed of boroughs 2, 3, 7, 8, 9, 11, 12 and 13; and community 3 is composed of boroughs 5, 6 and 10. In the second phase of the pandemic, community 1 is composed of boroughs 1, 4, 14, 15 and 16; community 2 is composed of boroughs 2, 5, 6, 8, 10, 11 and 12; and community 3 is composed of boroughs 3, 7, 9 and 13.
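
The Louvain algorithm searches for a partition that maximises modularity. The sketch below does not implement Louvain itself; it only computes the modularity of a given partition for a symmetric weighted network, which is the quantity being optimised. It assumes the mobility matrix has already been symmetrised, and the small graph and the function name `modularity` are hypothetical.

```python
def modularity(weights, membership):
    """Newman modularity Q of a partition of an undirected weighted graph.

    weights: symmetric matrix of edge weights (no self-loops in this example).
    membership: membership[i] is the community label of node i.
    """
    n = len(weights)
    two_m = sum(sum(row) for row in weights)  # total weight counted twice
    strength = [sum(row) for row in weights]  # weighted degree of each node
    q = 0.0
    for i in range(n):
        for j in range(n):
            if membership[i] == membership[j]:
                q += weights[i][j] - strength[i] * strength[j] / two_m
    return q / two_m


# Hypothetical graph: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
graph = [[0] * 6 for _ in range(6)]
for a, b in edges:
    graph[a][b] = graph[b][a] = 1

q_two_groups = modularity(graph, [0, 0, 0, 1, 1, 1])
q_one_group = modularity(graph, [0, 0, 0, 0, 0, 0])
print(q_two_groups, q_one_group)
```

Splitting the toy graph into its two triangles gives a clearly higher modularity than placing all nodes in a single community, which is the kind of comparison a community detection algorithm makes when grouping boroughs.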

Fig 3. Borough clusters of Mexico City.

(A): Borough clusters for the first period of the pandemic. (B): Borough clusters for the second period of the pandemic.

https://doi.org/10.1371/journal.pone.0263367.g003

Fig 4. Borough clusters of Mexico City.

(A): Borough clusters for the third period of the pandemic. (B): Borough clusters for the fourth period of the pandemic.

https://doi.org/10.1371/journal.pone.0263367.g004

Mathematical model

As mentioned in the introduction, the transmission model incorporates information on human movement within the following Susceptible, Exposed, Infected, Recovered (SEIR) metapopulation structure [21]: (1) where Si, Ei, Ai, Ii and Ni are the susceptible, exposed, undocumented infected, documented infected and total populations in borough i at time t, respectively, and the fixed population in borough i is given in Table 7. Spatial coupling within the model is represented by the daily number of people traveling from borough j to borough i (Mij) and a multiplicative scale factor θ, which reflects the under-reporting of human movement. It is also assumed that documented infected individuals (Ii) do not move between boroughs, although these individuals can move between boroughs during the latency period. The total population Ni in each borough is reset each new day as the sum of the fixed population and the inflow term θ Σj Mij, minus the outflow term θ Σj Mji. We note that a distinction between daytime and nighttime in the transmission model (1) is implemented in [35]. A complete description of the parameters involved in model (1), the respective ranges of values proposed in [21] and their measurement units can be found in Table 8. We have set a minimum value of 10³ for the denominators Nj − Ij and Ni − Ii in order to avoid instabilities.
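
The daily population reset described in this paragraph can be sketched in a few lines of Python. The sketch follows the convention of this section, in which the entry M[i][j] counts people traveling from borough j to borough i; the populations, trips, value of θ and the function name `reset_population` are hypothetical.

```python
def reset_population(fixed_pop, mobility, theta):
    """Daily reset: N_i = fixed_i + theta * sum_j M_ij - theta * sum_j M_ji.

    mobility[i][j] is the number of people traveling from borough j to
    borough i, so row sums are inflows and column sums are outflows.
    """
    n = len(fixed_pop)
    new_pop = []
    for i in range(n):
        inflow = sum(mobility[i][j] for j in range(n))
        outflow = sum(mobility[j][i] for j in range(n))
        new_pop.append(fixed_pop[i] + theta * inflow - theta * outflow)
    return new_pop


# Hypothetical two-borough example.
fixed = [1000.0, 2000.0]
trips = [[0, 100], [40, 0]]  # 100 people go from borough 1 to 0, 40 from 0 to 1
daily_pop = reset_population(fixed, trips, theta=0.5)
print(daily_pop)
```

Because every traveler leaves one borough and enters another, the reset redistributes people between boroughs without changing the total population.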

Table 8. Parameter description and values proposed in [21] of the state equations given on (1).

https://doi.org/10.1371/journal.pone.0263367.t008

Parameter estimation

For parameter estimation, we use the daily reported dataset [23]. We use Bayesian inference to solve the inverse problem associated with the system of Ordinary Differential Equations (ODEs) given in (1), similarly to [33]. Some references using this method of parameter estimation can be found in [42–53].

Let us denote the vector of state variables in zone i as x = (Si, Ei, Ai, Ii) ∈ (L²[0, T])^n, where n = 4 denotes the number of state variables, and the vector of parameters in zone i as θ, where m = 10 denotes the number of parameters to estimate. Thus, we can write the model (1) as the following Cauchy problem (2)

Problem (2) defines a mapping Φ(θ) = x from parameters θ to state variables x, where Φ: ℝ₊^m → (L²[0, T])^n and ℝ₊ denotes the non-negative real numbers. We assume that Φ has a Fréchet derivative. Usually, not all states of the system can be directly measured; that is, the data consist of measurements of some state variables at a discrete set of points t1, …, tk. In epidemiology, for example, these data consist of the number of confirmed infected people. This defines a linear observation mapping Ψ from state variables to data, where sn is the number of observed variables and k is the number of sample points. Let us define F(θ) = Ψ(Φ(θ)), called the forward problem. Thus, the inverse problem is formulated as a standard optimization problem (3) such that x = Φ(θ) holds, where yobs is the observable data, which has measurement errors of size η.

Problem (3) may be solved using numerical tools for non-linear least-squares problems [54–58]. In this work, we implement Bayesian inference to solve the inverse problem given in (3). From the Bayesian perspective, all of the state variables x and parameters θ are considered as random variables and the data yobs is fixed. For the random variables x and θ, the joint probability density of the data x and the parameters θ, denoted by π(θ, x), is given by π(θ, x) = π(x|θ)π(θ), where π(x|θ) is the conditional probability distribution, which is also called the likelihood function, and π(θ) is the prior distribution, which encodes the prior information on the parameters θ. Given x = yobs, the conditional probability distribution π(θ|yobs), which is called the posterior distribution of θ, is given by Bayes’ theorem: π(θ|yobs) = π(yobs|θ)π(θ) / π(yobs). (4)

If additive noise is assumed, yobs = F(θ) + η, where η is the noise due to discretisation, the model error and the measurement error, and if the noise probability distribution πH(η) is known and θ and η are independent, then π(yobs|θ) = πH(yobs − F(θ)).

All of the available information regarding the unknown parameter θ is codified into a prior distribution π(θ), which specifies our belief in a parameter before observing the data. All of the available information regarding how we obtained the measured data is codified into the likelihood distribution π(yobs|θ). This likelihood can be seen as an objective or cost function because it punishes deviations of the model from the data. To solve the associated inverse problem (4), one may use the maximum a posteriori (MAP) estimate, θMAP = arg maxθ π(θ|yobs).

We used the dataset in zone i, which corresponds to the susceptible, exposed, documented infected and undocumented infected in zone i, respectively. A Poisson distribution with respect to time is typically used to account for the discrete nature of these counts. However, the variance of each component of the dataset yobs is larger than its mean, which indicates that the data are over-dispersed. Thus, a more appropriate likelihood distribution is the Negative Binomial (NB), because it has an additional parameter that allows the variance to exceed the mean [50, 51, 59]. The NB is a mixture of Poisson and Gamma distributions, where the rate parameter of the Poisson distribution itself follows a Gamma distribution [59, 60]. We note that although there are different mathematical expressions for the NB depending on the author or source, they are equivalent. Because of these multiple representations of the NB in the literature, one must make sure to use the NB distribution in accordance with the source. Here, we have used the following expression for the NB distribution: NB(y | μ, ϕ) = [Γ(y + ϕ) / (y! Γ(ϕ))] (μ / (μ + ϕ))^y (ϕ / (μ + ϕ))^ϕ, (5) where μ is the mean of the random variable and ϕ is the over-dispersion parameter; that is, E[y] = μ and Var[y] = μ + μ²/ϕ.

We recall that the Poisson distribution has mean and variance equal to μ, so μ²/ϕ > 0 is the additional variance of the NB with respect to the Poisson distribution. The inverse of the parameter ϕ controls the over-dispersion. Thus, it is important to select its support adequately for parameter estimation. In addition, there are alternative forms of the NB distribution. We have used the first option neg_bin of the NB distribution of Stan [61]. We acknowledge that some scientists have had success with the second alternative representation of the NB distribution [47]. We assume independent NB distributed noise η (i.e., all dependency in the data is codified into the contact tracing model). In other words, the positive definite noise covariance matrix of η is assumed to be diagonal. Therefore, using the Bayes formula, the likelihood is where i denotes the borough index. As mentioned earlier, we approximate the likelihood probability distribution corresponding to diagnosed cases with a NB distribution (6) where the index j denotes the day, i denotes the borough, and ϕi are the over-dispersion parameters of the NB distribution (5) for each borough.
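
To make the mean and over-dispersion description concrete, the following Python sketch evaluates a Negative Binomial log-probability written in terms of a mean μ and an over-dispersion parameter ϕ, with variance μ + μ²/ϕ, and checks the moments numerically. This is one of the equivalent parametrisations mentioned above and is an assumption of the sketch; the function names and the numbers are hypothetical, and the code is not the authors' Stan program.

```python
import math


def nb_logpmf(y, mu, phi):
    """Log-probability of count y for an NB with mean mu > 0 and dispersion phi > 0."""
    return (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + y * math.log(mu / (mu + phi))
        + phi * math.log(phi / (mu + phi))
    )


def nb_loglik(cases, means, phi):
    """Log-likelihood of independent counts, each with its own mean."""
    return sum(nb_logpmf(y, mu, phi) for y, mu in zip(cases, means))


def nb_moments(mu, phi, y_max=2000):
    """Total probability, mean and variance from a truncated sum over counts."""
    probs = [math.exp(nb_logpmf(y, mu, phi)) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var


total, mean, var = nb_moments(mu=20.0, phi=5.0)
print(total, mean, var)
```

With μ = 20 and ϕ = 5 the extra variance term μ²/ϕ equals 80, so the variance is five times the mean, a degree of over-dispersion that a Poisson likelihood cannot represent.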

For independent observations, the likelihood distribution π(y|θ) is given by the product of the individual probability densities of the observations, where the mean μ of the NB distribution is given by the solution Ii(t) of the model (1) at time t = tj. For the prior distribution, we select the LogNormal distribution for the βs and βa parameters, Gamma distributions for the α and γ parameters, and Uniform distributions for the other parameters to estimate: ρ, θ and the initial conditions. The hyperparameters and supports of all of these distributions are given by the ranges in Table 8. (7)

The posterior distribution π(θ|yobs) given by (4) does not have an analytical closed form, since the likelihood function, which depends on the solution of the non-linear model given in (1), does not have an explicit solution. Therefore, we explore the posterior distribution using two methods: first, the Automatic Differentiation Variational Inference (ADVI) method of the Stan statistics package [61]; and second, the general purpose Markov Chain Monte Carlo Metropolis-Hastings (MCMC-MH) algorithm t-walk [62]. Both algorithms generate samples from the posterior distribution π(θ|yobs) that can then be used to estimate marginal posterior densities, means, credible intervals, percentiles, variances, and other quantities. We refer the reader to [63] for a more detailed description of the MCMC-MH algorithms.
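
The idea of drawing samples from a posterior that is only known up to a constant can be illustrated with a plain random-walk Metropolis sampler. This is a generic textbook sketch, not the t-walk algorithm and not ADVI; the one-dimensional toy log-posterior (a Gaussian with mean 2 and standard deviation 0.5), the step size and all names are hypothetical.

```python
import math
import random


def random_walk_metropolis(log_post, x0, n_iter, step, seed=0):
    """Sample from exp(log_post) using Gaussian random-walk proposals."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    chain = []
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step)
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, lp_new - lp))
        if rng.random() < accept_prob:
            x, lp = proposal, lp_new
        chain.append(x)
    return chain


def toy_log_posterior(x):
    """Unnormalised log-density of a Gaussian with mean 2 and sd 0.5."""
    return -0.5 * ((x - 2.0) / 0.5) ** 2


chain = random_walk_metropolis(toy_log_posterior, x0=0.0, n_iter=20000, step=1.0)
kept = chain[5000:]  # discard the first 5,000 draws as burn-in
post_mean = sum(kept) / len(kept)
post_sd = math.sqrt(sum((v - post_mean) ** 2 for v in kept) / len(kept))
print(post_mean, post_sd)
```

Summaries such as the posterior mean, quantiles or credible intervals are then computed directly from the retained samples, in the same spirit as the posterior summaries reported for model (1).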

Fig 5 shows the credible intervals of the parameters of model (1) within 95% Highest-Posterior Density (HPD) using the ADVI-Fullrank method of the Stan package [61]. Table 9 shows the posterior mean and quantiles of all the estimated parameters of model (1) using the ADVI-Fullrank method of the Stan package. Table 10 shows the posterior mean and quantiles of all over-dispersion parameters ϕi of the Negative Binomial distribution (6). Fig 6 shows the joint probability density distributions of the estimated parameters of model (1) within 95% HPD. The blue lines represent the medians. Figs 7–9 show the fit of confirmed COVID-19 cases of all of the boroughs of Mexico City using Stan [61]. Fig 10 shows the credible intervals of the parameters of model (1) within 95% HPD using the t-walk package [62]. Note that the results obtained with the t-walk package are preliminary because we only performed 60,000 iterations, with 30,000 of them as burn-in, and they were obtained without balancing the Origin-Destination matrices. We performed this limited number of iterations because the computation time for each 1,000 iterations is significantly large. However, we will perform more iterations in the near future.

Using both packages, we fitted the first 245 days of the pandemic in Mexico City, starting on 27 February, and we made predictions for days 245–275, corresponding to 28 October to 27 November, which we compared with the true cases in that period. We assumed that the mobility from 28 October to 27 November is the same as from 28 September to 27 October, i.e., we assumed the same mobility cluster for the projection period. We set a minimum borough fraction equal to 0.6 to keep the boroughs from falling below that fraction of their population size. Our future work will analyse the identifiability of the parameters of model (1), as suggested in [49, 64, 65], specifically that of the ρ parameter, because it is multiplied by the period of incubation of the disease, α. Thus, estimating both parameters simultaneously may lead to non-identifiability difficulties. We have uploaded all the codes and source data used in this paper to the following Github link for a detailed review.

Fig 5. Credible intervals of parameters of the model (1) within 95% Highest-Posterior Density (HPD) using the Stan package [61].

https://doi.org/10.1371/journal.pone.0263367.g005

Fig 6. Joint probability density distributions of the estimated parameters of model (1) within 95% (HPD).

The blue lines represent the medians.

https://doi.org/10.1371/journal.pone.0263367.g006

Fig 7. Fit of confirmed COVID-19 cases of the boroughs 1 to 6 using the Stan package [61].

Top row, from left to right: the fit for the confirmed cases of the Districts Azcapotzalco and Coyoacan. The tomato colour bars represent the confirmed cases, the blue and purple solid lines represent the median and the mode, respectively, and the shaded area represents the 95% probability bands for the expected value of the Documented Infected state variable. Middle row, from left to right: the fit for the diagnosed cases of the Districts Cuajimalpa de Morelos and Gustavo A. Madero. Bottom row, from left to right: the fit for the diagnosed cases of the Districts Iztacalco and Iztapalapa.

https://doi.org/10.1371/journal.pone.0263367.g007

Fig 8. Fit of confirmed COVID-19 cases of the boroughs 7 to 12 using the Stan package [61].

Top row, from left to right: the fit for the confirmed COVID-19 cases of the Districts La Magdalena Contreras and Milpa Alta. The tomato colour bars represent the confirmed COVID-19 cases, the blue solid line represents the median and the shaded area represents the 95% probability bands for the expected value of the Documented Infected state variable. Middle row, from left to right: the fit for the diagnosed cases of the Districts Alvaro Obregon and Tlahuac. Bottom row, from left to right: the fit for the diagnosed cases of the Districts Tlalpan and Xochimilco.

https://doi.org/10.1371/journal.pone.0263367.g008

Fig 9. Fit of confirmed COVID-19 cases of the boroughs 13 to 16 using the Stan package [61].

Top row, from left to right: the fit for the confirmed COVID-19 cases of the Districts Benito Juarez and Cuauhtemoc. The tomato colour bars represent the confirmed COVID-19 cases, the blue solid line represents the median and the shaded area represents the 95% probability bands for the expected value of the Documented Infected state variable. Bottom row, from left to right: the fit for the diagnosed cases of the Districts Miguel Hidalgo and Venustiano Carranza.

https://doi.org/10.1371/journal.pone.0263367.g009

Fig 10. Credible intervals of parameters of model (1) within 95% Highest-Posterior Density (HPD) using the t-walk package [62].

https://doi.org/10.1371/journal.pone.0263367.g010

Table 9. Parameter estimation of βs, βa, ρ, α, γ, θ and initial conditions of the model (1).

https://doi.org/10.1371/journal.pone.0263367.t009

Table 10. Over-dispersion parameters estimation of the Negative Binomial distribution (6).

https://doi.org/10.1371/journal.pone.0263367.t010

The basic reproduction number estimation

The basic reproduction number, which is commonly denoted by R0, is the average number of secondary infections generated by a single infective individual during the course of the infection in a wholly susceptible population. We calculate the reproductive number in Mexico City using the inferred parameters. Defining X = (E, A, I) and using the next generation operator method [66] on system (1), we obtain the Jacobian matrices F and V of system (1), evaluated at the disease free equilibrium (DFE) of system (1), X0 = (0, 0, 0, N, 0)T. The next-generation matrix is then K = FV⁻¹, from which Re is computed as the leading eigenvalue of matrix K; that is, (8)
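
The next-generation recipe (build F and V, form K = FV⁻¹ and take its leading eigenvalue) can be illustrated on a textbook single-population SEIR model, for which the result is known to be β/γ. The sketch below is not the computation for model (1): the matrices, the parameter values and the helper names are hypothetical and serve only to show the mechanics.

```python
import math


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def spectral_radius_2x2(m):
    """Largest eigenvalue modulus of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4.0 * det
    if disc >= 0:
        root = math.sqrt(disc)
        return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))
    return math.sqrt(det)  # complex-conjugate pair with modulus sqrt(det)


# Textbook SEIR with infected compartments (E, I); values are assumptions.
beta, alpha, gamma = 0.3, 0.2, 0.2
F = [[0.0, beta], [0.0, 0.0]]        # new infections enter E at rate beta * I
V = [[alpha, 0.0], [-alpha, gamma]]  # transfers out of E and out of I

K = matmul_2x2(F, inverse_2x2(V))
r0 = spectral_radius_2x2(K)
print(r0)
```

For this toy model the leading eigenvalue reduces to β/γ, which gives a quick sanity check of the procedure before applying it to a larger system.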

Table 8 shows the range of values for the parameters involved in expression (8) obtained using the Stan package [61]. With those values, we obtain the 95% credible interval (1.2713, 1.3054) for R0.

Discussion

In this work, we analyse a networked dynamic metapopulation model of the coronavirus dissemination in Mexico City using ODEs and Bayesian statistics. We explain how to estimate the daily mobility between the boroughs that make up Mexico City, both on weekdays and on weekends, by combining the available information from the origin-destination survey carried out in 2017 with the current mobility indices that Google and the government of Mexico City report, depending on the mode of transport used to make each trip (e.g., bus, subway, car, etc.). We then use the Fratar method to balance the daily origin-destination matrices. We also present a clustering analysis of the boroughs that make up Mexico City based on mobility data from Google and the Transportation Mexican Survey. From Figs 3 and 4, we can identify three different clusters during each phase of the pandemic. We point out that the same cluster analysis done for Mexico City could be implemented for a broader area, the metropolitan area named Valle de Mexico, which is rather important for the whole country. We consider that this clustering analysis, which is based on individual movement, may be crucial to efficiently model a human pandemic on the same scale as presented here, or at a smaller scale.

From Fig 5, the transmission rate of symptomatic was 0.19 within 95% Credible Interval (CI) [0.06, 0.42], and the transmission rate of asymptomatic was 0.27 within 95% CI [0.14, 0.40], which is in concordance with the value estimated of 0.25 in [67], in that study the transmission rate was not separated in symptomatics and asyntomatics. The fraction of undocumented infections, ρ, was 0.027 within 95% CI [0.02, 0.04]. The estimated latency period, 1/α, is 5.96 days within 95% CI [3.60, 8.93] days, which is in concordance with the value used of 5.99, 6.0, 5.1 and 5.0 in [50, 52, 68, 69], and the estimated recovery period, 1/γ, is 4.86 days within 95% CI [3.15, 9.33] days, which is lower in comparison with the ones used of 5.97 and 10.81 (asymptomatic and symptomatic class, respectively) in [69], 10.0 in [52], 7.0 in [50] and 5.0, 10.8 and 14.0 (reported infectious, symptomatic and asymptomatic class, respectively) in [68]. The inter-borough scale factor θ was 0.77 within 95% CI [0.61, 0.89], this value indicates that the mean number of trips made by a person is between three and four in one day, which makes sense with complete trips to get out of home, do some activities (e.g., work, shopping, or services), and return home. The results of the inferred parameters of model (1) and the population size of the boroughs (e.g., Iztapalapa and Gustavo A. Madero) help to explain the fast dispersion of COVID-19 and indicate the challenge of finding strategies to contain it. We have compared the parameter values inferred with respect to those used for Mexico City. As mentioned in Section, we will analyse the identifiability of the parameters of model (1) (i.e., the ρ parameter) because this parameter is multiplied by the period of incubation of the disease, α. Thus, estimating both parameters simultaneously may lead to a non-identifiability difficulty. 
We may observe this non-identifiability in Figs 5 and 10; that is, different combinations of the model parameters lead to the same “energy” value of system (1). In particular, we can observe that different combinations of the estimated parameter values obtained with the ADVI-Fullrank and ADVI-Meanfield methods give very similar fitted curves for diagnosed cases. We also observe that the recovery period is more in accordance with the values used in [50, 52, 68, 69], but the latency period is lower than the ones used there. We note that the parameters βs, βa of model (1) are considered global; that is, they are assumed to be the same for all the boroughs of Mexico City and all the transportation modes. In the near future, we will explore a more robust model with local transmission parameters instead of global ones (i.e., a pair of transmission rates βs, βa for each borough). Furthermore, we will consider those transmission rates, βs, βa, to be dependent on time, as in [52, 70]. In addition, we will consider the interstate and international mobility from and to Mexico City, taking into account cases imported from the other 31 states of Mexico as well as cases imported from overseas by airplane passengers. We will also carry out a global and local sensitivity analysis of model (1). Finally, we will investigate a spatio-temporal model based on a diffusion partial differential equation model combined with individual movement trends.
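
The latency and recovery periods quoted above are the reciprocals of the rates α and γ that appear in model (1). The following minimal sketch converts the reported periods and their credible intervals into daily rates. The helper name `period_to_rate` is hypothetical, and the sketch assumes that the interval bounds are posterior quantiles: quantiles map exactly under the reciprocal (with the bounds swapped), whereas a posterior mean does not, so the point values should be read as approximate.

```python
def period_to_rate(point_days, ci_days):
    """Convert a mean period (days) and its interval into a rate (1/day).

    The reciprocal is decreasing, so the upper bound of the period becomes
    the lower bound of the rate and vice versa.
    """
    low_days, high_days = ci_days
    return 1.0 / point_days, (1.0 / high_days, 1.0 / low_days)


# Values reported in the text for the latency period 1/alpha
# and the recovery period 1/gamma.
alpha, alpha_ci = period_to_rate(5.96, (3.60, 8.93))
gamma, gamma_ci = period_to_rate(4.86, (3.15, 9.33))

print(f"alpha ~ {alpha:.3f} per day, interval ({alpha_ci[0]:.3f}, {alpha_ci[1]:.3f})")
print(f"gamma ~ {gamma:.3f} per day, interval ({gamma_ci[0]:.3f}, {gamma_ci[1]:.3f})")
```

Working with rates makes it easier to compare these estimates with studies that report α and γ directly rather than the corresponding periods.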

References

  1. World Health Organization. Novel Coronavirus (2019-nCoV) Situation Reports; 2020. https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation.
  2. Paules CI, Marston HD, Fauci AS. Coronavirus infections more than just the common cold. JAMA. 2020;323(8):707–708. pmid:31971553
  3. Izquierdo LD, et al. Informe técnico nuevo coronavirus 2019-nCoV. Instituto de Salud Carlos III; 2020. Available from: https://www.mscbs.gob.es/va/profesionales/saludPublica/ccayes/alertasActual/nCov/documentos/20200210_ITCoronavirus.pdf.
  4. Chen TM, Rui J, Wang QP, Zhao ZY, Cui JA, Yin L. A mathematical model for simulating the phase-based transmissibility of a novel coronavirus. Infectious Diseases of Poverty. 2020;9(1):1–8. pmid:32111262
  5. Choi S, Ki M. Estimating the reproductive number and the outbreak size of novel coronavirus disease (COVID-19) using mathematical model in Republic of Korea. Epidemiology and Health. 2020; p. 123–145.
  6. Yang C, Wang J. A mathematical model for the novel coronavirus epidemic in Wuhan, China. Mathematical Biosciences and Engineering. 2020;17(3):2708–2724. pmid:32233562
  7. Savi PV, Savi MA, Borges B. A Mathematical Description of the Dynamics of Coronavirus Disease (COVID-19): A Case Study of Brazil. arXiv preprint arXiv:200403495. 2020.
  8. Shaikh AS, Shaikh IN, Nisar KS. A Mathematical Model of COVID-19 Using Fractional Derivative: Outbreak in India with Dynamics of Transmission and Control. Preprints. 2020.
  9. Yang W, Zhang D, Peng L, Zhuge C, Hong L. Rational evaluation of various epidemic models based on the COVID-19 data of China. arXiv preprint arXiv:200305666. 2020.
  10. Nesteruk I. Statistics-based predictions of coronavirus epidemic spreading in mainland China. medRxiv. 2020.
  11. Zhong L, Mu L, Li J, Wang J, Yin Z, Liu D. Early prediction of the 2019 novel coronavirus outbreak in the mainland China based on simple mathematical model. IEEE Access. 2020;8:51761–51769. pmid:32391240
  12. Zhao Z, Zhu YZ, Xu JW, Hu QQ, Lei Z, Rui J, et al. A mathematical model for estimating the age-specific transmissibility of a novel coronavirus. medRxiv. 2020.
  13. Miller A, Foti N, Lewnard J, Jewell N, Guestrin C, Fox E. Mobility trends provide a leading indicator of changes in SARS-CoV-2 transmission. medRxiv. 2020.
  14. Pei S, Shaman J. Initial simulation of SARS-CoV2 spread and intervention effects in the continental US. medRxiv. 2020.
  15. Keeling MJ, Rohani P. Modeling infectious diseases in humans and animals. Princeton University Press; 2011.
  16. Keeling M, Rohani P. Estimating spatial coupling in epidemiological systems: a mechanistic approach. Ecology Letters. 2002;5:20–29.
  17. Ma Z. Spatiotemporal fluctuation scaling law and metapopulation modeling of the novel coronavirus (COVID-19) and SARS outbreaks. arXiv preprint arXiv:200303714. 2020.
  18. Calvetti D, Hoover A, Rose J, Somersalo E. Metapopulation network models for understanding, predicting and managing the coronavirus disease COVID-19. arXiv preprint arXiv:200506137. 2020.
  19. Wells K, Lurgi M. COVID-19 containment policies through time may cost more lives at metapopulation level. medRxiv. 2020.
  20. Coletti P, Libin P, Petrof O, Willem L, Steven A, Herzog SA, et al. A data-driven metapopulation model for the Belgian COVID-19 epidemic: assessing the impact of lockdown and exit strategies. medRxiv. 2020.
  21. Li R, Pei S, Chen B, Song Y, Zhang T, Yang W, et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science. 2020;368(6490):489–493. pmid:32179701
  22. Zhou C. Evaluating new evidence in the early dynamics of the novel coronavirus COVID-19 outbreak in Wuhan, China with real time domestic traffic and potential asymptomatic transmissions. medRxiv. 2020.
  23. de Mexico G. Covid-19 Mexico; 2020. https://coronavirus.gob.mx/datos/.
  24. INEGI. Encuesta Origen Destino en Hogares de la Zona Metropolitana del Valle de México (EOD) 2017; 2017. https://www.inegi.org.mx/programas/eod/2017/.
  25. Google. Informes de movilidad local sobre el COVID-19; 2020. https://www.google.com/covid19/mobility/.
  26. de la Ciudad de México G. Afluencia preliminar en transporte público; 2020. https://datos.cdmx.gob.mx/dataset/afluencia-preliminar-en-transporte-publico.
  27. McBryde E, Gibson G, Pettitt A, Zhang Y, Zhao B, McElwain D. Bayesian modelling of an epidemic of severe acute respiratory syndrome. Bulletin of Mathematical Biology. 2006;68(4):889–917. pmid:16802088
  28. Nour M, Cömert Z, Polat K. A novel medical diagnosis model for COVID-19 infection detection based on deep features and Bayesian optimization. Applied Soft Computing. 2020; p. 106580. pmid:32837453
  29. Şenel K, Özdinç M, Öztürkcan DS, Akgül A. Instantaneous R for COVID-19 in Turkey: estimation by Bayesian statistical inference [Türkiye’de COVID-19 için anlık R hesaplaması: Bayesyen istatistiksel çıkarım ile tahmin]. Turkiye Klinikleri Journal of Medical Sciences. 2020;40(2):127–131.
  30. Bagal DK, Rath A, Barua A, Patnaik D. Estimating the parameters of susceptible-infected-recovered model of COVID-19 cases in India during lockdown periods. Chaos, Solitons & Fractals. 2020;140:110154. pmid:32834642
  31. Liu F, Li X, Zhu G. Using the contact network model and Metropolis-Hastings sampling to reconstruct the COVID-19 spread on the “Diamond Princess”. Science Bulletin. 2020. pmid:32373394
  32. Batistela CM, Correa DP, Buenoc ÁM, Piqueira JR. Compartmental model with loss of immunity: analysis and parameters estimation for Covid-19. arXiv preprint arXiv:200701295. 2020.
  33. Prieto K. Current forecast of COVID-19: Bayesian and Machine Learning approaches. medRxiv. 2020.
  34. Worldometer. Worldometer Coronavirus Updates; 2020. https://www.worldometers.info/.
  35. Pei S, Kandula S, Yang W, Shaman J. Forecasting the spatial transmission of influenza in the United States. Proceedings of the National Academy of Sciences. 2018;115(11):2752–2757. pmid:29483256
  36. Chávez Hernández MV, Juárez Valencia LH, Ríos Solís YA. Penalization and augmented Lagrangian for O-D Demand Matrix Estimation from Transit Segment Counts. Transportmetrica A: Transport Science. 2019;15(2):915–943.
  37. Spiess H. A maximum likelihood model for estimating origin-destination matrices. Transportation Research Part B: Methodological. 1987;21(5):395–412.
  38. INEGI. Instituto Nacional de Estadística y Geografía; 2020. https://www.inegi.org.mx/temas/estructura/default.html#Publicaciones/.
  39. Chávez Hernández MV. Modelos matemáticos para análisis de demanda en transporte. Mathematics Department, UAM-Iztapalapa; 2014.
  40. de Anda-Jáuregui G. COVID-19 in Mexico: A network of epidemics. arXiv e-prints. 2020.
  41. Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal. 2006;Complex Systems:1695.
  42. Stojanović O, Leugering J, Pipa G, Ghozzi S, Ullrich A. A Bayesian Monte Carlo approach for predicting the spread of infectious diseases. PLoS ONE. 2019;14(12). pmid:31851680
  43. Luzyanina T, Bocharov G. Markov chain Monte Carlo parameter estimation of the ODE compartmental cell growth model. Mathematical Biology and Bioinformatics. 2018;13:376–391.
  44. Brown G, Porter A, Oleson J, Hinman J. Approximate Bayesian computation for spatial SEIR(S) epidemic models. Spatial and Spatio-temporal Epidemiology. 2018;24(10):2685–2697. pmid:29413712
  45. Bettencourt L, Ribeiro R. Real Time Bayesian Estimation of the Epidemic Potential of Emerging Infectious Diseases. PLoS ONE. 2008;3(5):e2185. pmid:18478118
  46. Boersch-Supan P, Ryan S, Johnson L. deBInfer: Bayesian inference for dynamical models of biological systems in R. Methods in Ecology and Evolution. 2017;8:511–518.
  47. Grinsztajn L, Semenova E, Margossian C, Riou J. Bayesian workflow for disease transmission modeling in Stan. arXiv e-prints. 2020.
  48. Bliznashki S. A Bayesian Logistic Growth Model for the Spread of COVID-19 in New York. medRxiv. 2020;14(12).
  49. Chowell G. Fitting dynamic models to epidemic outbreak with quantified uncertainty: A primer for parameter uncertainty, identifiability, and forecasts. Infectious Disease Modelling. 2017;2:379–398. pmid:29250607
  50. Capistrán M, Capella A, Christen A. Forecasting hospital demand during COVID-19 pandemic outbreaks. arXiv e-prints. 2020.
  51. Argüedas Y, Santana-Cibrian M, Velasco-Hernández J. Transmission dynamics of acute respiratory diseases in a population structured by age. Mathematical Biosciences and Engineering. 2019;16(6):7477–7493. pmid:31698624
  52. Acuña Zegarra M, Comas-García A, Hernández-Vargas E, Santana-Cibrian M, Velasco-Hernández J. The SARS-CoV-2 epidemic outbreak: a review of plausible scenarios of containment and mitigation for Mexico. medRxiv. 2020.
  53. Chatzilena A, Leeuwen E, Ratmann O, Baguelin M, Demiris N. Contemporary statistical inference for infectious disease models using Stan. Epidemics. 2019;29. pmid:31591003
  54. Prieto K, Dorn O. Sparsity and level set regularization for diffuse optical tomography using a transport model in 2D. Inverse Problems. 2016;33(1):014001.
  55. Prieto K, Ibarguen-Mondragon E. Parameter estimation, sensitivity and control strategies analysis in the spread of influenza in Mexico. Journal of Physics: Conference Series. 2019;1408(1):012020.
  56. Smirnova A, DeCamp L, Liu H. In: Inverse problems and ebola virus disease using an age of infection model. Springer, Cham; 2016. p. 103–121.
  57. Alavez-Ramirez J. Estimacion de parámetros en ecuaciones diferenciales ordinarias: identificabilidad y aplicaciones a medicina. Revista electrónica de contenido matemático. 2007;21.
  58. Capistrán M, Moreles M, Lara B. Parameter estimation of some epidemic models: The case of recurrent epidemics caused by respiratory syncytial virus. Bulletin of Mathematical Biology. 2009;71(8):1890–1901. pmid:19568727
  59. Nayens T, Faes C, Molenberghs G. A generalized Poisson-gamma model for spatially overdispersed data. Spatial and Spatio-temporal Epidemiology. 2012;3:185–194.
  60. Coly S, Yao A, Abrial D, Garrido M. Distributions to model overdispersed count data. Journal de la Societe Francaise de Statistique. 2016;157(2):39–63.
  61. Carpenter B, Gelman A, Hoffman D, Goodrich B, Betancourt M, Brubaker M, et al. Stan: A probabilistic programming language. Journal of Statistical Software. 2017;76(1):1–32.
  62. Christen J, Fox C. A general purpose sampling algorithm for continuous distributions (the t-walk). Bayesian Anal. 2010;5:263–282.
  63. House T, Ford A, Lan S, Bilson S, Buckingham-Jeffery E, Girolami M. Bayesian uncertainty quantification for transmissibility of influenza, norovirus and Ebola using information geometry. Journal of the Royal Society Interface. 2016;13. pmid:27558850
  64. Roosa K, Chowell G. Assessing parameter identifiability in compartmental dynamic models using a computational approach: application to infectious disease transmission models. Theoretical Biology and Medical Modelling. 2019;16(1). pmid:30642334
  65. Magal P, Webb G. The parameter identification problem for SIR epidemic models: Identifying unreported cases. Journal of Mathematical Biology. 2018;77:1629–1648. pmid:29330615
  66. Diekmann O, Heesterbeek JAP, Metz JA. On the definition and the computation of the basic reproduction ratio R0 in models for infectious diseases in heterogeneous populations. Journal of Mathematical Biology. 1990;28(4):365–382. pmid:2117040
  67. Dbouk T, Drikakis D. Fluid dynamics and epidemiology: Seasonality and transmission dynamics. Physics of Fluids. 2021;33(2):021901. pmid:33746486
  68. Saldaña F, Velasco-Hernández J. The trade-off between mobility and vaccination for COVID-19 control: a metapopulation modeling approach. medRxiv. 2020.
  69. Acuña Zegarra M, Santana-Cibrian M, Velasco-Hernández J. Modeling behavioral change and COVID-19 containment in Mexico: A trade-off between lockdown and compliance. Mathematical Biosciences. 2020;325. pmid:32387384
  70. Piccolomini E, Zama F. Monitoring Italian COVID-19 spread by a forced SEIRD model. PLoS ONE. 2020;15(8):e0237417.