The transmission of SARS-CoV-2 is likely comodulated by temperature and by relative humidity

Inferring the impact of climate upon the transmission of SARS-CoV-2 has been confounded by variability in testing, unknown disease introduction rates, and changing weather. Here we present a data model that accounts for dynamic testing rates and variations in disease introduction rates. We apply this model to data from Colombia, whose varied and seasonless climate, central port of entry, and swift, centralized response to the COVID-19 pandemic present an opportune environment for assessing the impact of climate factors on the spread of COVID-19. We observe strong attenuation of transmission in climates with sustained daily temperatures above 30 degrees Celsius and simultaneous mean relative humidity below 78%, with outbreaks occurring at high humidity even where the temperature is high. We hypothesize that temperature and relative humidity comodulate the infectivity of SARS-CoV-2 within respiratory droplets.

The detection rate describes the total probability of detecting a case of SARS-CoV-2 in a given population. This rate undoubtedly increased over the period covered by our analysis since the testing rate increased by over two orders of magnitude during that time (Fig 1). In section detection rate scaling, we show that the detection rate introduces an additive factor into the exponent of the expected number of confirmed cases.

In section drip rate scaling, we show that the drip rate, or the rate at which infected travelers arrive into a city, introduces a multiplicative constant into the total number of infectious hosts in a given city which translates to a multiplicative constant in the expected number of confirmed cases.

In section testing is driven by perception, we demonstrate that the testing rate is not equivalent to the detection rate, since the testing rate is driven by human decisions about going to the hospital, ordering tests, and testing protocols, whereas the detection rate describes the final probability of detecting a case of COVID-19. As an example, we show that panic can drive a wave of non-infected people to receive tests, driving up the testing rate, without changing the overall detection rate (subsection panic in a disease free population).

We make a distinction between the testing rate and the detection rate, because we use the observed exponential growth in the testing rate as motivation for including the detection rate in our data model, but do not introduce the testing data into our data analysis as a constraint, because our algorithm deduces the detection rate from the data without the need for constraints. Furthermore, our analysis shows that we should not expect the testing rate and the detection rate to have more than qualitative agreement.
In section count log-velocity, we show that the log-velocity, or the time-derivative of the log of the expected number of confirmed cases, is a useful measure of disease spread because it is independent of the drip rate and the overall detection rate. Since the log-velocity still depends on the detection rate dynamics, we include a logistic model for detection (section drip model).
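As a numerical illustration of why detection dynamics matter, the sketch below compares two scenarios in the early stage of testing ramp-up: a constant number of infections with logistic growth in detection, and exponentially growing infections with constant detection. It assumes the standard three-parameter logistic form (final rate `p_f`, growth rate `k`, midpoint day `h`); the function name and all parameter values are illustrative choices, not fitted values from this analysis.

```python
import numpy as np

def logistic_detection(t, p_f, k, h):
    """Assumed logistic detection rate: tends to p_f, grows at rate k, equals p_f / 2 on day h."""
    return p_f / (1.0 + np.exp(-k * (t - h)))

# Illustrative (not fitted) parameter values.
p_f, k, h = 0.3, 0.25, 60.0
t = np.arange(0, 30, dtype=float)   # early stage: t well below h

# Scenario A: constant number of infections, logistic growth in detection.
n_constant = 1000.0
log_counts_a = np.log(n_constant * logistic_detection(t, p_f, k, h))

# Scenario B: infections growing exponentially at rate r = k, constant detection p_f.
n_initial = n_constant * np.exp(-k * h)
log_counts_b = np.log(n_initial * np.exp(k * t) * p_f)

# Both log-count curves are (nearly) straight lines with slope k.
slope_a = np.polyfit(t, log_counts_a, 1)[0]
slope_b = np.polyfit(t, log_counts_b, 1)[0]
max_gap = np.max(np.abs(log_counts_a - log_counts_b))
print(slope_a, slope_b, max_gap)
```

With these values the two log-count curves share the same slope and differ by far less than any realistic counting noise, which is the sense in which the two scenarios cannot be told apart from early case counts alone.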
In the logistic model, $p(t) = p_f / (1 + \exp\{-k(t-h)\})$, the detection rate converges to a final rate $p_f$ with rate $k$ and reaches half the final rate on day $h$. In the early stage of the pandemic, while the detection capacity is a small fraction of its later capacity, we have $t < h$ and $\exp\{-k(t-h)\} \gg 1$. Then, to first order:

$$p(t) \approx p_f \exp\{k(t-h)\}$$

With $N$ the number of infections, the log of the expected number of cases on day $t$ is then:

$$\log(N p(t)) \approx \log(N p_f) + k(t-h) \qquad \text{(Eq 10)}$$

Eq 10 shows that the log of the expected number of counts increases linearly with rate $k$. Thus, at the beginning of testing ramp-up, constant infections with logistic growth in the detection rate are indistinguishable from exponential growth in the number of infections with a constant detection rate. To see this, simply substitute $N e^{rt}$ for the number of infections and $p(t) = p_f$ for the detection rate. Then

$$\log(N e^{rt} p_f) = \log(N p_f) + rt$$

Since $p_f$ is initially unknown, the observed case dynamics will appear equivalent when the number of infections is constant and the detection rate grows logistically versus when the number of infections grows exponentially and the detection rate is constant.

drip rate scaling

In the previous section we illustrated that the local detection rate introduces an additive factor into the apparent transmission rate of the disease. In this subsection, we demonstrate that the drip rate introduces a multiplicative factor into the total number of infections in a given location under very general assumptions. We only assume that in the early stage of the disease, the disease hosts are weakly-interacting.

Under the weakly-interacting disease host assumption, the infected travelers who arrive in a given city create new pockets of infection. Since we assume that the disease hosts are weakly-interacting, a city with twice the drip rate will have twice the number of pockets in which the disease grows. As long as the weakly-interacting assumption is true, these pockets grow independently at the same rate, with no overlap. The total number of infections will then be the sum of infections over each of these disease pockets. In this way, the drip rate introduces a multiplicative factor into the expected number of infections regardless of the details of how the disease spreads.

We quantify this intuition in the following way. Let each person in a given location (i.e. city) be assigned a unique number $1, 2, \ldots, N$ for $N$ total people. Let

and so on, each of these sets becoming larger and larger as the pool of potential infectees becomes larger and larger. We refer to each stage of disease transmission as a generation. That is, the people in the set $\Lambda_{\tau_{jt}}$, each infected by traveler $\tau_{jt}$ between day t and t, are the first generation of infectees. Then the people that these infectees infect are the second generation, and so on.

For the early stages of the disease, we need not consider very many generations. The generation timescale is $1/r$ and so the time for the disease to progress to the $k$-th generation is $1/r^k$. For $r \ll 1$, this time quickly becomes very large.

infections that the average traveler spreads per day. Then the total number of infections on day $t$ is roughly:

The expected value of this quantity is easily seen to be:

Here we see that a multiplicative constant has been introduced. Note that we have not introduced a specific model for how the disease spreads. We have only assumed that the disease spreads independently.
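The scaling argument can be illustrated with a small sketch. It assumes only that each arriving traveler seeds a pocket that grows independently of all others, so the expected total is a sum over pockets; the helper `expected_total_infections` and both single-pocket growth laws are hypothetical choices made for illustration.

```python
import numpy as np

def expected_total_infections(drip_rate, pocket_size, n_days):
    """Expected infections on day n_days when drip_rate travelers arrive per day
    and every traveler seeds an independent pocket of size pocket_size(age)."""
    ages = np.arange(1, n_days + 1)          # age in days of the pockets seeded on each arrival day
    return drip_rate * np.sum(pocket_size(ages))

# Two arbitrary single-pocket growth laws (hypothetical, for illustration only).
def exponential_pocket(age):
    return np.exp(0.1 * age)

def quadratic_pocket(age):
    return 1.0 + 0.05 * age**2

ratios = []
for pocket_size in (exponential_pocket, quadratic_pocket):
    base = expected_total_infections(2.0, pocket_size, n_days=40)
    doubled = expected_total_infections(4.0, pocket_size, n_days=40)
    ratios.append(doubled / base)
print(ratios)
```

Doubling the drip rate doubles the expected total for either growth law, because the independence assumption makes the total linear in the number of pockets regardless of how a single pocket grows.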

From this we conclude that the total number of cases in a given city does not necessarily inform us about the propensity of the disease to spread in the environment of the city, especially for short timescales and in cities where the transmission rate is low. As we will show in the next section, the dynamics of the count numbers tell us more about the transmission rate than the absolute scale of the number of infections at any time.

count log-velocity

So far we have only considered $k$, the rate at which the disease detection probability increases, and the drip rate $I$, the rate at which the disease is introduced into a city. We have not yet considered the total probability of detecting a given case of COVID-19 at a given moment in time.

Consider two cities, city 1 and city 2, that are alike in all respects except for the overall detection rate $p_f$. Also assume that the disease transmits in both cities at the same rate ($r = r_1 = r_2$), the disease is introduced into the two cities at the same rate ($I = I_1 = I_2$), and the disease detection infrastructure grows at the same rate ($k = k_1 = k_2$).

It is not necessary to specify the model of the disease dynamics, as long as the assumption that the disease spreads independently from each infected person is true.

There may be other parameters of the model that we do not specify here. Whatever these may be, they are assumed to be equal for the two cities. Likewise, we do not specify a model of the detection rate increase in the two cities; we only assume that it is some function $f(t, k)$ such that

There are no restrictions on $f$ other than that it depends upon some rate parameter $k$ that governs the rate of change of $p[t]$. The function $f(t, k)$ could depend upon other parameters which we have not specified but which are assumed equal between the two cities.
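To illustrate why the overall detection rate and the drip rate drop out of the log-velocity, the sketch below assumes that expected counts factor as $p_f \cdot f(t, k) \cdot N(t)$ with $N(t)$ proportional to the drip rate. The logistic choice of $f$, the exponential growth law, and every parameter value are assumptions made for illustration only.

```python
import numpy as np

def expected_counts(t, p_f, drip, r, k, h):
    """Assumed form: E[c](t) = p_f * f(t, k) * N(t), with N(t) proportional to the drip rate."""
    f = 1.0 / (1.0 + np.exp(-k * (t - h)))   # one possible detection ramp-up f(t, k)
    n_infected = drip * np.exp(r * t)        # illustrative growth law
    return p_f * f * n_infected

t = np.arange(1, 60, dtype=float)
# Same r, k, h in both cities; different overall detection rate and drip rate.
city_1 = expected_counts(t, p_f=0.05, drip=1.0, r=0.08, k=0.2, h=40.0)
city_2 = expected_counts(t, p_f=0.40, drip=7.0, r=0.08, k=0.2, h=40.0)

# Log-velocity: time derivative of the log of the expected counts.
log_velocity_1 = np.gradient(np.log(city_1), t)
log_velocity_2 = np.gradient(np.log(city_2), t)
print(np.max(np.abs(log_velocity_1 - log_velocity_2)))
```

The two cities differ in absolute counts by a constant factor, yet their log-velocities agree to floating-point precision, since constant multiplicative factors become additive constants under the log and vanish under the derivative.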

The probability of detecting $c$ cases of COVID-19 in either city is binomial and thus the expected number of counts is (from Eq 15):

These three steps are not independent. For example, a severely symptomatic host is more likely to go to the hospital and request attention than a weakly symptomatic host or an asymptomatic host. Likewise, a severely symptomatic host is more likely to have a doctor order a test and more likely to test positive for COVID-19 than a moderately symptomatic patient.

times the probability of being tested. Since being tested requires going to the hospital, we separate the probabilities by the product rule:

This division is essential because it divides the dynamics into two distinct

Note that the number of tests conducted on a given day is not, in general, equal to the number of unique tests conducted on a given day. This is because most countries implement protocols that call for duplicate testing. For now, we will not consider duplicate testing since we are only interested in the basic formulation and scaling of the testing dynamics.

Here $p(H \mid t)$ denotes the probability of going to the hospital with COVID-related symptoms, which, for now, are assumed to be the only requirements for a symptomatic person to be tested. We expand the probability of going to the hospital in terms of those who are infected ($i$) and those who are not infected ($\bar{i}$) with SARS-CoV-2. Those who are not infected include those who have other infections, e.g. influenza.

S1 Table. Symptoms grades for testing model.

symptoms grade   description    symptoms
0                asymptomatic   none
1                mild           common-cold like symptoms
2                severe         pneumonia-like symptoms, respiratory distress
Note that many people who are not infected with SARS-CoV-2 will have symptoms consistent with COVID-19. These symptoms may result from (1) other respiratory viruses, (2) other health problems, and (3) mass psychogenic illness (S1 Table). It is well documented in psychology literature that the suggestion of symptoms provokes symptoms in a large percentage of the population, particularly when group consensus is involved [1].

We further expand each of the hospital terms in Eq 18 in terms of the severity of the symptoms. We use a three-grade ranking of symptoms: asymptomatic (0), mildly symptomatic (1) and severely symptomatic (2) (see S1 Table). We assume that only symptomatic patients show up at the hospital as COVID-19 testing candidates, and that the general population dynamics is driven by self-assessed symptoms (such as difficulty breathing and fatigue).
The terms $h_1[t] = p(H \mid 1, t)$ and $h_2[t] = p(H \mid 2, t)$ denote the probability of going to the hospital given grade one or grade two symptoms, respectively. Here $p(H \mid 1, i, t) = p(H \mid 1, t)$. This is because the probability of going to the hospital cannot depend on the true disease status as the decision to go to the hospital is subjective: the individual is not certain of his/her actual disease status.
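The expansion described above can be sketched as follows. This is a reconstruction under stated assumptions, not necessarily the exact published form: it applies the law of total probability over infection status, assumes that asymptomatic people do not go to the hospital, writes $h_g[t] = p(H \mid g, t)$ for grades $g = 1, 2$ independent of infection status, and introduces the shorthand $\Delta p_g[t]$ for the difference in symptom rates between infected and uninfected people.

```latex
\begin{align}
p(H \mid t) &= p(H \mid i, t)\, p_i[t] + p(H \mid \bar{i}, t)\,\bigl(1 - p_i[t]\bigr) \\
p(H \mid i, t) &= h_1[t]\, p(1 \mid i, t) + h_2[t]\, p(2 \mid i, t) \\
p(H \mid \bar{i}, t) &= h_1[t]\, p(1 \mid \bar{i}, t) + h_2[t]\, p(2 \mid \bar{i}, t) \\
p(H \mid t) &= \sum_{g=1}^{2} h_g[t] \Bigl( p(g \mid \bar{i}, t) + p_i[t]\, \Delta p_g[t] \Bigr),
\qquad \Delta p_g[t] = p(g \mid i, t) - p(g \mid \bar{i}, t)
\end{align}
```

In this form, setting $\Delta p_g[t] = 0$ removes all dependence on the prevalence $p_i[t]$, and setting $p_i[t] = 0$ leaves only the uninfected baseline $h_1[t]\, p(1 \mid \bar{i}, t) + h_2[t]\, p(2 \mid \bar{i}, t)$.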
Combining and re-arranging terms:

Eq 21 shows that the general population dynamics is driven by two terms: an infected sub-population term and an uninfected sub-population term. Interestingly, the infected sub-population term is driven by the fraction of people infected in the total population, $p_i[t]$, and the difference in symptom rate between those who are infected ($i$) and those who are not infected ($\bar{i}$). That is, if the rate of moderate and severe symptoms between those who are infected and those who are not infected were the same, the number of people who show up at the hospital would not change as the disease spread.

Whereas patient perception drives general population dynamics, physician perception drives hospital dynamics. We now expand the probability of being tested in terms of the severity of symptoms under the assumption that the physician does not administer tests to asymptomatic patients and always administers tests to severely symptomatic patients:

$$p(T \mid H, t) = f_2[t] + T_1[t]\, f_1[t]$$

where $f_2[t] = p(2 \mid H, t)$, $f_1[t] = p(1 \mid H, t)$ and $T_1[t] = p(T \mid 1, H, t)$. In other words, $f_2$ is the fraction of people at the hospital that the physician diagnoses as having grade two symptoms, $f_1$ is the fraction of people at the hospital that the physician diagnoses as having grade one symptoms and $T_1[t]$ is the probability of administering a test to a grade one symptom patient (as diagnosed by the physician).

Note that we are suppressing an important condition here for the sake of compact notation. Here $p(2 \mid H, t)$, for example, denotes the fraction of people at the hospital with grade two symptoms according to the physician's perception. We might denote this with a $P$ for emphasis, e.g. $p(2 \mid H, t) = p(2 \mid H, t, P)$. We emphasize this distinction because it is tempting to apply Bayes' theorem to simplify these expressions, but doing so requires care and keeping track of the conditions that we have largely suppressed in this brief presentation.

We note two interesting cases.

panic in a disease free population

When there is no disease within the population, $p_i[t] = 0$ for all times considered. In that case, the probability of going to the hospital among the general population is (from Eq 21):

A rough upper-bound order-of-magnitude estimate for the probabilities $p(1 \mid \bar{i}, t)$ and $p(2 \mid \bar{i}, t)$ is about 1% and 0.1% respectively. Consider that during a typical flu season, over 50 million Americans contract the flu. For a population of 350 million, that is one in seven. Given that peak flu season spans about three months, on the order of 0.1% of the population is infected per day (new infections) during the flu season. If we make the approximation that all infections are symptomatic and the symptoms last for five days, then about 0.5% of the population will have symptoms only from influenza consistent with COVID-19 on a given day of peak flu season. Further note that roughly 10% of the population goes to the hospital per year, or about 0.02% per day assuming a uniform distribution. Thus, if we have $h_1 \sim 0.01$ and $h_2 \sim 0.1$ we obtain the right order of magnitude for people going to the hospital.
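The arithmetic behind this estimate can be checked directly. The sketch below uses only the round figures quoted above; the 90-day season length is an assumed reading of "about three months", and the disease-free hospital probability is assumed to take the form $h_1\, p(1 \mid \bar{i}) + h_2\, p(2 \mid \bar{i})$.

```python
# Order-of-magnitude check of the flu-season estimate (round figures from the text).
flu_cases = 50e6
population = 350e6
season_days = 90       # assumption: "about three months"
symptom_days = 5

seasonal_attack = flu_cases / population          # about one in seven
new_per_day = seasonal_attack / season_days       # fraction newly infected per day
symptomatic_now = new_per_day * symptom_days      # fraction symptomatic on a given day

hospital_per_day = 0.10 / 365                     # 10% of the population per year, spread uniformly

# Disease-free hospital probability, assumed form h1 * p(1|not infected) + h2 * p(2|not infected).
h1, h2 = 0.01, 0.1
p1, p2 = 0.01, 0.001                              # upper bounds of 1% and 0.1%
p_hospital = h1 * p1 + h2 * p2

print(f"new infections per day: {new_per_day:.2%}")
print(f"symptomatic on a given day: {symptomatic_now:.2%}")
print(f"hospital visits per day: {hospital_per_day:.3%}")
print(f"model hospital probability: {p_hospital:.3%}")
```

The unrounded intermediate values come out slightly larger than the rounded ones quoted in the text, but each stays within the same order of magnitude, and the modeled hospital probability lands at the same order as the observed daily hospital-visit rate.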

The small order of these numbers is important because a jump in $h_1[t]$ or $h_2[t]$ or in the probability of perceiving symptoms ($p(1 \mid \bar{i}, t)$ and $p(2 \mid \bar{i}, t)$) can provoke a hospital rush. Recall that $h_1[t]$ is the probability of going to the hospital with moderate symptoms. As a crude but simplifying assumption, we make the approximation that in a panic, all people with moderate or severe symptoms will go to the hospital: $h_1$ and $h_2$ will both be of $O(1)$. Thus, in the panic caused by the introduction of a new disease, the fraction of people in the general population who experience moderate symptoms will make a large jump. In numerous studies in the literature, mass psychogenic illness (MPI) has been shown to arise and give rise to real, measurable symptoms, even when there is no underlying disease. This phenomenon is more prevalent in women than men (particularly young women). There were several instances of young women spreading panic on social media in the USA when they did not have the disease (see, for example, [2]).

Here $p_G \sim 1/N \ll p_0$ since the psychogenic symptoms usually begin with a single case. This model is consistent with numerous case studies that show a rapid spread of symptoms among peer-connected groups. In some cases, an entire factory or an entire military base is crippled overnight by the spread of spurious symptoms. Indeed, the only factors slowing down the spread of psychogenic symptoms are (1) the delay in the transmission of information among susceptibles and (2) the number of people in social groups, as proximity to a symptomatic person appears to be a key factor in triggering psychogenic symptoms.

For simplicity, we are limiting psychogenic symptoms to moderate symptoms. We assume that for all times considered, including $t = 0$, $p(1 \mid \bar{i}, t) \gg p(2 \mid \bar{i}, t)$, or that cases with moderate symptoms greatly outnumber cases with severe symptoms. Then to first order, the probability of going to the hospital is:

The total number of unique tests on day $t$ is then:

Recall that $T_1[t]$ is the probability of a test being administered to a person with moderate symptoms. We assume that under a panic, $T_1[t] \to 1$. In such a situation, the testing rate becomes

where $F[t]$ is the total fraction of the hospital patients that the attending physicians perceive as having symptoms (moderate or severe) consistent with COVID-19. In the early stages of a panic, $F[t]$ must be rapidly increasing, or perhaps constant if the medical staff is particularly stoic. In any event, $F[t]$ is non-decreasing. Thus, MPI can cause an exponential rise in the testing rate, even when there is no disease in the population. In such a situation, for any significant false positive rate, the number of positive test results will also exhibit exponential growth.

ignorance among a diseased population

Now consider the opposite case: a disease spreads rapidly throughout a population that fails to recognize the outbreak. Assume exponential growth of the disease prevalence in the population, $p_i[t] = p_0 e^{rt}$, and assume that people with moderate symptoms do not go to the hospital, $h_1[t] = 0$. Since there is no awareness of disease spread, $h_2[t] = h_2$ is a constant. Then the probability of going to the hospital is:

We assume that $p_0 \ll p(2 \mid \bar{i})$ so that at $t = 0$ the vast majority of people with severe symptoms do not have the disease. Then we can consider two times, $t \ll T$ and $t \gg T$ with $T = r^{-1} \ln(p(2 \mid \bar{i})/p_0)$. In the first case, $t \ll T$, the fraction of people going to the hospital is constant at $\sim h_2\, p(2 \mid \bar{i})$. Then for times $t \gg T$, the fraction of people going to the hospital grows exponentially $\sim h_2\, p_0\, \Delta p_2[t]\, e^{r(t-T)}$.
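The two regimes can be made concrete with a short sketch. It assumes the hospital probability is a constant uninfected baseline plus a term proportional to prevalence, takes $\Delta p_2$ as constant in time, and uses hypothetical parameter values throughout; only the expression for the crossover time $T$ is taken from the text.

```python
import numpy as np

# Hypothetical, illustrative values.
r = 0.1            # growth rate of prevalence per day
p0 = 1e-7          # initial infected fraction
p2_uninf = 1e-3    # p(2 | not infected): severe symptoms among the uninfected
p2_inf = 0.2       # p(2 | infected)
h2 = 0.1           # probability of going to the hospital with severe symptoms

T = np.log(p2_uninf / p0) / r      # crossover time T = ln(p(2|not infected) / p0) / r

def p_hospital(t):
    """Assumed form: constant baseline plus an infected term proportional to prevalence."""
    delta_p2 = p2_inf - p2_uninf
    return h2 * (p2_uninf + p0 * np.exp(r * t) * delta_p2)

early_ratio = p_hospital(10.0) / (h2 * p2_uninf)            # t << T: close to 1
late_growth = p_hospital(T + 51.0) / p_hospital(T + 50.0)   # t >> T: close to exp(r)
print(T, early_ratio, late_growth, np.exp(r))
```

Well before $T$ the hospital fraction is indistinguishable from its disease-free baseline, and well after $T$ it grows by nearly a factor of $e^{r}$ per day, so an unrecognized outbreak leaves hospital attendance flat for an extended period before the exponential regime sets in.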

Since there is no change in disease awareness, the fraction of the people showing up at the hospital getting tested is a constant. That is, $p(T \mid H, t) = F_0$. Then for times $t \ll T$ the fraction of the population tested is:

Here $p[0] = p[1] = 0$ since we only count infections that are infectious for at least one full day and $T_M$ is the maximum infection duration. Note also that we are assuming that the travelers are recently infected. From Eq 32, we see that the fraction $c$ on day $t$ is defined as the ratio of the number of people removed on day $t$ to the total number of infections on day $t - 1$. That is

From Eq 35 we observe the dependence of $c$ upon the transmission rate $r$, the time $t$, and the distribution of removal times $p[t']$.

We estimate the initial order of $c$ as follows. First, we approximate the distribution $p$ in Eq 34 by a delta function. That is, we suppose that $p[T] = 1$ (i.e. all infections are resolved on day $T$). Then on day $t = T$, the first removals leave the infectious pool. The fraction of people that leave can be approximated as the ratio of the expected number

This equation is only valid at $t = T$. If $c \ll r$, then this estimate will remain valid for future times. Using estimates on the transmission characteristics of SARS-CoV-2 by one of our coauthors, we obtain an estimate of $T = 15.5$ days [3]. With this long infectious period, the impact of including recovery dynamics on the early spread within Colombia is not significant (S5 Fig).

S1 Fig. COVID-19 and testing in Bogotá D.C. In the right panel, the national testing rate (dashed red, log base 10) and inferred detection capacity for Bogotá D.C. only (black). Note we have plotted the smoothed national testing rate in dashed red; this plot is on a log10 scale and only serves to show qualitative agreement between the inferred detection rate and the daily testing rate. The national testing rate includes testing for all cities in Colombia.
S2 Fig. Python code for conducting the fits. The first line grabs the number of days from the length of the input. The second line allocates memory for the error matrix based upon the number of parameters for the transmission rate (r) and the detection parameters (k,h). The next three lines compute the ranges of the parameters considered based upon input. The subsequent loops iterate over all model parameters in the following order: r, then k, then h. After each r, we compute the drip model for the infection log velocity via the function "analytic". From this we obtain the detection log velocity (or the empirical detection curve) by subtraction since the infection log velocity and the detection log velocity are additive in the count log velocity. In the detection loop, we compute the logistic curve for each (k,h) pair and subtract this from the detection log velocity to obtain the model noise. We compute the error as the log of the square of the noise. The log is required because the curve asymptotes to values near zero.
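The procedure described in this caption can be sketched as a brute-force grid search. The code below is not the authors' code. The body of `analytic` is a hypothetical stand-in (constant drip with exponential growth at rate r in each pocket), the detection term is assumed to be the time derivative of the log of a logistic curve, the noise is summed in squares over days, and a small constant is added inside the log to avoid taking the log of zero.

```python
import numpy as np

def analytic(r, t):
    """Hypothetical stand-in for the drip-model infection log velocity."""
    return r / (1.0 - np.exp(-r * t))

def detection_log_velocity(k, h, t):
    """Assumed detection term: d/dt of the log of a logistic curve with rate k and midpoint h."""
    return k / (1.0 + np.exp(k * (t - h)))

def fit(count_log_velocity, r_grid, k_grid, h_grid):
    n_days = len(count_log_velocity)                               # number of days from the input
    errors = np.empty((len(r_grid), len(k_grid), len(h_grid)))     # error matrix over (r, k, h)
    t = np.arange(1, n_days + 1, dtype=float)
    for a, r in enumerate(r_grid):
        # Infection and detection log velocities add, so subtract to isolate detection.
        empirical_detection = count_log_velocity - analytic(r, t)
        for b, k in enumerate(k_grid):
            for c, h in enumerate(h_grid):
                noise = empirical_detection - detection_log_velocity(k, h, t)
                errors[a, b, c] = np.log(np.sum(noise**2) + 1e-30)  # log of the squared noise
    a, b, c = np.unravel_index(np.argmin(errors), errors.shape)
    return r_grid[a], k_grid[b], h_grid[c], errors

# Usage on synthetic, noise-free data generated from known grid values.
r_grid = np.linspace(0.02, 0.2, 10)
k_grid = np.linspace(0.05, 0.5, 10)
h_grid = np.linspace(10.0, 100.0, 10)
t = np.arange(1, 121, dtype=float)
truth = (r_grid[3], k_grid[4], h_grid[5])
data = analytic(truth[0], t) + detection_log_velocity(truth[1], truth[2], t)
r_hat, k_hat, h_hat, errors = fit(data, r_grid, k_grid, h_grid)
print(truth, (r_hat, k_hat, h_hat))
```

On this synthetic input the search returns the parameters used to generate the data. With real counts the error surface would be noisier, and the grid ranges and resolution would need to be chosen with the data in mind.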