Using mobile phone data to estimate dynamic population changes and improve the understanding of a pandemic: A case study in Andorra

Compartmental models are often used to understand and predict the progression of an infectious disease such as COVID-19. The most basic of these models consider the total population of a region to be closed. Many incorporate human mobility into their transmission dynamics, usually based on static and aggregated data. However, mobility can change dramatically during a global pandemic as seen with COVID-19, making static data unsuitable. Recently, large mobility datasets derived from mobile devices have been used, along with COVID-19 infections data, to better understand the relationship between mobility and COVID-19. However, studies to date have relied on data that represent only a fraction of their target populations, and the data from mobile devices have been used for measuring mobility within the study region, without considering changes to the population as people enter and leave the region. This work presents a unique case study in Andorra, with comprehensive datasets that include telecoms data covering 100% of mobile subscribers in the country, and results from a serology testing program that more than 90% of the population voluntarily participated in. We use the telecoms data to both measure mobility within the country and to provide a real-time census of people entering, leaving and remaining in the country. We develop multiple SEIR (compartmental) models parameterized on these metrics and show how dynamic population metrics can improve the models. We find that total daily trips did not have predictive value in the SEIR models while country entrances did. As a secondary contribution of this work, we show how Andorra’s serology testing program was likely impacted by people leaving the country. Overall, this case study suggests how using mobile phone data to measure dynamic population changes could improve studies that rely on more commonly used mobility metrics and the overall understanding of a pandemic.


A.2 Home parish inference
The parish-level populations inferred from the telecoms data for May 2020 were compared to published 2020 population statistics [1]. There is a Pearson correlation coefficient of 0.959 (p<0.001), suggesting that the telecoms data are representative of the true population. A. 3 Comparing entrances and infection rates between Andorra, France, and Spain The SEIR model (iii) in this work incorporates trips and entrances data to model transmission rates. It includes an assumption that the likelihood of new country entrants being infectious tracks with the timeline of infection rates in Andorra. In order to check this assumption, we compare the rates of COVID-19 between the country of Andorra and its bordering neighbors, Spain and France, during our period of study. This is done by comparing the timeline of reported deaths per 1 million residents, where data is smoothed over a 7 day rolling window. Death reports are used instead of case reports as a more stable comparison indicator in this study and others because death reports were considered to be less impacted than case reports by the dynamically changing testing procedures which varied by country [2,3]. The Pearson correlation coefficients between Andorra and France and Andorra and Spain are 0.916 (p < 0.001) and 0.928 (p < 0.001), respectively. We note there was an error in the Spain data with negative deaths values in late May. We changed the negative values to 0 for this comparison. Without the change, the correlation was still statistically significant with Pearson correlation coefficient 0.918 (p < 0.001). The timeline of infections is shown in Fig A.2. During this same period, telecoms data showed that 86% of entrances by foreign SIMs were either Spanish or French, and 68% of all entrances were by Spanish or French SIMs when accounting for entrances by Andorran SIMs. The timeline of entrances by SIM nationality is shown in Fig A.3.

A.4 Serology tests and country departures
Note that the sum of survey participation varies due to missing values regarding the participants' parish of residence or temporary worker status. Two cross sectional serological surveys were conducted in Andorra from May 4-28, 2020, using a rapid serological test (nCOV IgG/IgM) [4]. For each participant, the test data include the dates of their participation in surveys 1 and/or 2, the positive versus negative results for IgG/IgM antibodies for each round of testing, the participant's parish of residence, whether the participant is a temporary worker, and other demographic information. Table A.3 shows the number of participants and seroprevalence from survey 1 as well how many participants from survey 1 also participated in survey 2. Data for temporary workers is highlighted. Seroprevalence data is from previously reported results and was calculated based on the number of individuals who had a positive result of IgG and/or IgM [4]. The testing was voluntary; an issue with the testing was that many people who participated in the first survey did not participate in the second survey. Seroprevalence was higher among temporary workers. At the same time, temporary workers who participated in survey 1 were less likely to participate in survey 2 versus the general population. See Table A We counted the number of mobile subscribers, by inferred home parish, who were in the country during the first and second survey periods (May 4-14 and May 18-28, 2020). Subscribers were counted as present during a survey period if they had at least one "stay" within the period. We estimated the rate at which subscribers left the country after the first test by counting how many subscribers were present during only the first period versus both periods.
These numbers were compared to the parish-level serology test participant populations. Namely, the portion of serology survey participants who did survey 1 but not survey 2 was compared to the estimated portion of mobile subscribers who left the country between survey periods. There is a statistically significant Pearson correlation coefficient of 0.937 (p=0.0019).
To further validate that the decline in test participation was related to people leaving the country, we repeated these tests using 2019 telecoms data. In this case, there is a Pearson correlation coefficient of 0.4928 (p=0.2612), which is not statistically significant. If the May 2020 subscribers had left the country for reasons not related to the pandemic, we would expect the correlation to be similar for the 2019 and 2020 data.
Optimal parameters were estimated by minimizing the negative log-likelihood function using the L-BFGS-B method with the Python SciPy library [5,6]. This step searches for optimal parameters by taking initial parameters which are then modified towards improved values, with specified bounds. The (minimum, maximum) bounds for γ were set to (1/10, 1/2). The bounds used for each of the parameters related to β -b0, b1, b2, b3, b4 -were (0, 2), with the observation that the parameters for the best fit models did not come up against these bounds. The (minimum, maximum) bounds for both E(0) and I(0) were set to (40, 4000), where t=0 corresponds to March 14, the first day of the training period. The values of (40, 4000) were conservatively set where the minimum was based on reported active cases, and the maximum based on reported cumulative cases. In addition, the training routine discarded models where E(0) and I(0) values differed by more than 3000.
Before describing how initial parameters were handled, first note that the optimizing function is not convex. To avoid the optimization function terminating at local minima, a grid search was used for the initial parameters. The same grid search method was used for each of the models.
In addition to the grid search routine, another step was taken to find optimal model fits: The models using the trips and entrances data were initially fit using spline approximations of these metrics, where the splines were estimated from the true metrics using knots spaced by 7 days. Fig A.4 shows the comparison of the true metrics versus their spline approximations. This step was taken to further smooth the data and ease the computational complexity of the model fitting routine. Without this improvement, the model training was slow and rarely resulted in successful outcomes, and the estimated parameters rarely varied from their initial values. After the models were fit using the spline approximations of the data, the models were finally fit again using the true data, where the parameters found via the fitting routine with the spline approximations of the data were used as the initial parameters in the L-BFGS-B method. Table A. 5 shows the values of the parameters for the best fit models that resulted from model training. Fig A.5 shows the corresponding time series values representing each of the best fit models. The values include R 0 , the compartment populations, and the predicted reported cases.

A.6 Fit model values and parameters
The best fit models were determined as those with the best log-likelihood score when fit over the training data, where the training data period was March 14 -May 31, 2020. See section A.8 for values from the robustness check.
Models were trained based on the standard SEIR model where: and is cumulative case reports and accounts for reporting delay, d, and reporting rate, r.    [7], γ −1 is estimated via model training, and d is set to 7 based on related work [2,8]. Reporting delays, d, can be due to the time it takes to seek a test, for the test to be processed, and then officially reported. Note that at the start of the pandemic in Andorra, tests were sent for processing to Spain, potentially adding extra time to reporting delays. A study in Singapore from March 2020 estimated an average reporting delay of 6.4 days (95% CI 5.8, 6.9) [8]. A reporting delay of d = 7 was implicitly assumed by Arroyo-Marioli et al. [2]. They estimated time series values for the effective reproduction number for 124 countries across the world and validated their work by correlating their estimates of R t to mobility data from the "COVID-19 Community Mobility Reports" collected by Google, where the lag between R t and mobility was 14 days (2 weeks). They assumed an SIR model rather than an SEIR model, with time from exposure to removed, γ −1 , of 7 days, implying a reporting delay of 7 days (14 − 7). We note that Arroyo-Marioli et al. produced time series estimates for R t in Andorra. However their estimates are not comparable to the R 0 estimates in this work because (a) they did not correct for the reporting error that caused an influx of 78 additional cases June 1-10 [9], as was done in this work, and (b) their estimates were for the effective reproduction number versus the basic reproduction number. We also note that Google's "COVID-19 Community Mobility Reports" are not available for Andorra.
The best fit for the model estimated γ −1 ∼ 5.5 days (γ = 0.18). Combined with σ −1 = 5.2, d = 7 results in a total estimated average delay from exposure to case report ∼17.7 days. This is consistent with previous work over a similar study period in Andorra that studied the correlation between mobility metrics and transmission rates over various lags and found the best correlations were with mobility metrics lagged by 18 days [10].

A.8 Robustness check
As a robustness check, we trained and tested all models over an additional set of training and testing periods that ended slightly earlier than those used for the main results. The training period for the robustness check was March 14 -May 14, 2020. The models were then tested on the period that directly followed this training period.