Predictors of incident viral symptoms ascertained in the era of COVID-19

  • Gregory M. Marcus,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Project administration, Writing – original draft, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • Jeffrey E. Olgin,

    Roles Data curation, Funding acquisition, Methodology, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • Noah D. Peyser,

    Roles Data curation, Methodology, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • Eric Vittinghoff,

    Roles Formal analysis, Methodology, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • Vivian Yang,

    Roles Data curation, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • Sean Joyce,

    Roles Data curation, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • Robert Avram,

    Roles Data curation, Formal analysis, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • Geoffrey H. Tison,

    Roles Data curation, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • David Wen,

    Roles Software, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • Xochitl Butcher,

    Roles Data curation, Methodology, Software, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • Helena Eitel,

    Roles Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America

  • Mark J. Pletcher

    Roles Data curation, Funding acquisition, Methodology, Writing – review & editing

    Affiliation Division of Cardiology, Department of Medicine, University of California, San Francisco, California, United States of America



In the absence of universal testing, effective therapies, or vaccines, identifying risk factors for viral infection, particularly readily modifiable exposures and behaviors, is required to identify effective strategies against viral infection and transmission.


We conducted a worldwide mobile application-based prospective cohort study available to English-speaking adults with a smartphone. We collected self-reported characteristics, exposures, and behaviors, as well as smartphone-based geolocation data. Our main outcome was incident symptoms of viral infection, defined as fever or chills plus at least one other symptom previously shown to occur with SARS-CoV-2 infection, determined by daily surveys.


Among 14,335 participants residing in all 50 US states and 93 different countries followed for a median 21 days (IQR 10–26 days), 424 (3%) developed incident viral symptoms. In pooled multivariable logistic regression models, female biological sex (odds ratio [OR] 1.75, 95% CI 1.39–2.20, p<0.001), anemia (OR 1.45, 95% CI 1.16–1.81, p = 0.001), hypertension (OR 1.35, 95% CI 1.08–1.68, p = 0.007), cigarette smoking in the last 30 days (OR 1.86, 95% CI 1.35–2.55, p<0.001), any viral symptoms among household members 6–12 days prior (OR 2.06, 95% CI 1.67–2.55, p<0.001), and the maximum number of individuals the participant interacted with within 6 feet in the past 6–12 days (OR 1.15, 95% CI 1.06–1.25, p<0.001) were each associated with a higher risk of developing viral symptoms. Conversely, a higher subjective social status (OR 0.87, 95% CI 0.83–0.93, p<0.001), at least weekly exercise (OR 0.57, 95% CI 0.47–0.70, p<0.001), and sanitizing one’s phone (OR 0.79, 95% CI 0.63–0.99, p = 0.037) were each associated with a lower risk of developing viral symptoms.


While several immutable characteristics were associated with the risk of developing viral symptoms, multiple immediately modifiable exposures and habits that influence risk were also observed, potentially identifying readily accessible strategies to mitigate risk in the COVID-19 era.


The global SARS-CoV-2 pandemic has affected communities on every habitable continent and in every state in the US. Given what is generally known about respiratory viruses, strategies to mitigate transmission have included government orders to practice regular hand hygiene, physical distancing, including the closures of public locations commonly associated with community gatherings, and more recently to wear masks [1–3]. Studies thus far have largely focused on comparisons among those seeking medical care for the disease [4–7] or evaluations of large administrative datasets [8]. It can be difficult to track individual-level characteristics and behaviors, particularly as they are dynamic and changing over time, as they relate to incident disease. Members of the public may benefit by understanding strategies under their direct control that may influence their own risk of infection and viral transmission.

Tracking viral infection is hindered by the absence of universal and repeated testing. In the absence of such testing, recent evidence suggests that symptoms themselves may be useful markers of SARS-CoV-2 infection [9]. While the virus may be asymptomatic, a variety of symptom clusters associated with the disease have been identified, often including fever, but ranging from typical respiratory symptoms to gastrointestinal afflictions to somewhat idiosyncratic findings such as anosmia/ageusia and conjunctivitis [10–18]. In the past, ascertainment of viral symptoms has relied on assessments of those seeking medical care or retrospective surveys that may be prone to recall bias. Given the current near-ubiquity of smartphones and use of related mobile apps, technology is now available to regularly and repeatedly query large numbers of individuals over time, providing access to symptom development as it arises. Although monitoring for viral symptoms may be neither sufficiently sensitive nor specific for SARS-CoV-2 infection, these outcomes are, by their nature, inherently experienced by the individual, potentially providing valuable information that may be best leveraged by modern mobile technology.

We sought to use prospectively collected information about exposures and modifiable behaviors, along with daily symptom reporting, to identify risk factors for incident viral symptoms using a globally-available, smartphone mobile application-based study, the COVID-19 Citizen Science Study.


We launched the COVID-19 Citizen Science Study, a mobile application-based study compatible with Android or iOS operating systems, on March 26, 2020. The mobile application was built by investigators and developers at the University of California, San Francisco, using the NIH-supported Eureka digital and mobile research platform. Enrollment is open to any adult with a smartphone, and study information has been broadcast via press release, social media, and to participants in the Eureka-based Health eHeart Study. Participants were encouraged to recruit additional individuals. Updated study information, including number of participants, maps of symptom clusters, and location of participants around the world, is available online. All participation is remote, without geographic restriction. Verification of cell phone numbers via text was required before proceeding from study registration to remote-based study consent and subsequent study participation. Retention strategies include daily notifications, data visualizations, and intermittent study update blog posts. Our Citizen Scientist participants contribute study question ideas that are then incorporated into the study and reported to participants as participant-generated.

Participants complete surveys written in lay language meeting Flesch-Kincaid criteria for an 8th grade reading level. At baseline, surveys collected information about demographics, education, occupation, SARS-CoV-2 status (referred to in surveys as “the novel coronavirus, the virus that causes COVID-19”), behaviors, living conditions, attitudes regarding the SARS-CoV-2 pandemic, local government restrictions related to the disease, medical conditions, and medications (the surveys are included in the S1 File). Questionnaires included pre-tested survey instruments employed in multiple Health eHeart Study and Eureka Digital Research Platform studies for demographics and past medical conditions; however, as the study was launched during the beginning of the SARS-CoV-2 pandemic, questionnaires specific to related risk factors and viral symptoms were new. Perceived socioeconomic status was assessed using the MacArthur subjective social status ladder [19,20]. All participants received an optional invitation to share their smartphone-based geolocation data.

Participants receive a daily survey via mobile application-based push notification, timed to arrive at the same local time of day at which they engaged with the first baseline survey. The daily survey includes queries about current viral symptoms, updated according to new information, using “check all that apply” including: “A scratchy throat”; “A cough (worse than usual if you have a baseline cough)”; “A painful sore throat”; “A temperature greater than 100.4°F or 38.0°C”; “A runny nose”; “Symptoms of fever or chills”; “Muscle aches (worse than usual if you have baseline muscle aches)”; “Shortness of breath”; “Nausea, vomiting or diarrhea” (added March 30, 2020); “Unable to taste or smell” (added March 31, 2020); “Red or painful eyes” (added April 13, 2020); or “none of the above.” The daily survey then includes questions regarding current symptoms among household contacts and the number of individuals outside the household the participant interacted with within six feet (about 1.83 meters) in the previous 24 hours.

Participants received weekly surveys to update information regarding sleep, exercise, hand hygiene, social and physical distancing behaviors, habits such as alcohol consumption, and SARS-CoV-2 infection status. All surveys remained open for 24 hours.

For those who consented to geolocation tracking, smartphone-based geolocation using a combination of the Global Positioning System and cell phone tower triangulation was collected every 5 minutes for Android phones and whenever the phone accelerometer exhibited movement (in order to minimize battery drain) for iOS smartphones. Geolocation latitude and longitude coordinates were clustered within an individual for every day of the study using the HDBSCAN clustering algorithm [21,22]. The most prevalent cluster of geolocation coordinates for a user was defined as “home.” Time spent at a cluster was calculated as the time difference between the current location cluster and any future cluster change. Daily time spent at home was calculated as the time spent at the cluster identified as “home” divided by the time between the first and last coordinate collected daily. In addition, distance travelled was calculated as the sum of the successive distances between all consecutive coordinates collected within a user on a daily basis. Long-distance travel was defined as movement of at least 1,000 kilometers within 24 hours.
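The two daily geolocation measures described above can be illustrated with a minimal sketch. It assumes that cluster labels have already been assigned to each coordinate and that successive distances are great-circle (haversine) distances; the text does not state which distance formula was used. The function names, the tuple layout, and the sample coordinates are hypothetical.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers (assumed metric, not stated in the text)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def daily_distance_km(points):
    """Sum of successive distances between consecutive points of one day.

    Each point is a hypothetical tuple (timestamp, latitude, longitude, cluster_label).
    """
    return sum(haversine_km(a[1], a[2], b[1], b[2]) for a, b in zip(points, points[1:]))


def fraction_of_day_at_home(points, home_cluster):
    """Time at the home cluster divided by the time between first and last point.

    The interval between two consecutive points is credited to the cluster of
    the earlier point, i.e. a cluster keeps accruing time until it changes.
    """
    total = (points[-1][0] - points[0][0]).total_seconds()
    if total <= 0:
        return None  # fewer than two distinct timestamps: undefined
    at_home = sum(
        (b[0] - a[0]).total_seconds()
        for a, b in zip(points, points[1:])
        if a[3] == home_cluster
    )
    return at_home / total


# Hypothetical day for one participant; cluster 0 is "home".
day_points = [
    (datetime(2020, 4, 1, 8, 0), 37.7631, -122.4586, 0),
    (datetime(2020, 4, 1, 12, 0), 37.7631, -122.4586, 0),
    (datetime(2020, 4, 1, 13, 0), 37.7749, -122.4194, 1),
    (datetime(2020, 4, 1, 18, 0), 37.7631, -122.4586, 0),
    (datetime(2020, 4, 1, 20, 0), 37.7631, -122.4586, 0),
]
home_fraction = fraction_of_day_at_home(day_points, home_cluster=0)  # 7 of 12 hours
distance_km = daily_distance_km(day_points)  # one round trip between the two locations
print(round(home_fraction, 3), round(distance_km, 2))
```

In this sketch the hour spent travelling away from home is credited to the home cluster, because a cluster's time runs until the next recorded cluster change.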

Occupation was dichotomized into healthcare workers versus not; sleep was determined as the average number of hours per day over each week; exercise, defined as physical activity for at least 20 minutes that resulted in breathing heavily or to “break a sweat,” was dichotomized into more or less than once weekly; alcohol was assessed as average daily standard drinks; cigarette, e-cigarette, and marijuana use were dichotomized into any use in the last 30 days versus not; household symptoms were dichotomized into any versus none in the previous 6–12 days; and the maximum number of contacts within six feet (about 1.83 meters) reported in the previous 6–12 days was derived from the daily survey responses. The lag time of 6–12 days was employed to allow for the incubation period of SARS-CoV-2 [23] and other common respiratory viruses [24] and in order to accommodate days without responses and the expectation that viral symptoms would last several days.
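The 6–12 day lag window can be sketched as follows. This is an illustration only: the record layout (a dictionary keyed by survey date with `contacts` and `household_symptoms` fields) and the function names are assumptions, and days without a completed survey are simply skipped.

```python
from datetime import date, timedelta


def lagged_window(daily, day, min_lag=6, max_lag=12):
    """Daily-survey records dated 6 to 12 days (inclusive) before `day`."""
    window = []
    for lag in range(min_lag, max_lag + 1):
        survey_day = day - timedelta(days=lag)
        if survey_day in daily:  # days without a response contribute nothing
            window.append(daily[survey_day])
    return window


def lagged_exposures(daily, day):
    """Maximum close contacts and any household symptoms within the lag window."""
    window = lagged_window(daily, day)
    if not window:
        return None  # no survey data available in the window
    return {
        "max_contacts": max(record["contacts"] for record in window),
        "household_symptoms": any(record["household_symptoms"] for record in window),
    }


# Hypothetical daily-survey responses for one participant.
daily_surveys = {
    date(2020, 4, 7): {"contacts": 9, "household_symptoms": False},   # 13 days before: outside window
    date(2020, 4, 9): {"contacts": 2, "household_symptoms": False},   # 11 days before
    date(2020, 4, 12): {"contacts": 5, "household_symptoms": True},   # 8 days before
    date(2020, 4, 16): {"contacts": 8, "household_symptoms": False},  # 4 days before: outside window
}
exposures = lagged_exposures(daily_surveys, date(2020, 4, 20))
print(exposures)  # {'max_contacts': 5, 'household_symptoms': True}
```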

For the current analyses, all participants reporting a previous positive test for SARS-CoV-2 and those with any symptoms upon entry to the study were excluded. Those with baseline medical conditions that might themselves contribute to the symptoms of interest, including atrial fibrillation, coronary artery disease, congestive heart failure, chronic obstructive pulmonary disease, and asthma, were excluded. Two sensitivity analyses were conducted: one also excluded all participants reporting anemia; a second included participants with atrial fibrillation, coronary artery disease, congestive heart failure, chronic obstructive pulmonary disease, and asthma. Based on the daily surveys, incident viral symptoms were defined as the first report of a combination of fever or chills plus at least one other symptom on the same day. The absence of completion of a daily survey was assumed to represent an absence of symptoms in statistical analyses. Given the protean manifestations of COVID-19 [5,10–18,25], we allowed for any of these viral symptoms, requiring fever given that each of those symptoms could possibly occur in the absence of a viral infection. Follow-up for the current study ended May 3, 2020. The study was approved by the University of California, San Francisco Institutional Review Board. All participants provided informed electronic consent.
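The outcome definition can be expressed as a short sketch. The symptom keys are hypothetical labels for the survey items, and treating the measured-temperature item as satisfying the fever requirement is an assumption; the text specifies only “fever or chills plus at least one other symptom on the same day.”

```python
# Assumption: both the "fever or chills" item and the measured-temperature item
# count toward the fever requirement; the text does not spell this out.
FEVER_ITEMS = {"fever_or_chills", "temperature_over_38C"}


def first_incident_day(reports):
    """First study day with a fever-type item plus at least one other symptom.

    `reports` maps study day -> set of symptom keys checked that day.
    Days with no completed survey are absent and treated as symptom-free.
    """
    for day in sorted(reports):
        symptoms = reports[day]
        has_fever = bool(symptoms & FEVER_ITEMS)
        has_other = bool(symptoms - FEVER_ITEMS)
        if has_fever and has_other:
            return day
    return None


# Hypothetical daily reports for one participant.
reports = {
    3: {"cough"},                           # no fever: does not qualify
    5: {"fever_or_chills"},                 # fever alone: does not qualify
    8: {"fever_or_chills", "muscle_aches"}, # qualifies
    9: {"fever_or_chills", "cough"},
}
print(first_incident_day(reports))  # 8
```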

Statistical analyses

Normally distributed continuous variables are presented as means ± SD and compared using t-tests, whereas continuous variables with skewed distributions are presented as medians with interquartile ranges (IQR) and compared using Wilcoxon rank sum tests. Categorical variables were compared using chi-squared tests. Pooled logistic regression models were used to identify factors associated with incident symptoms, potentially including baseline characteristics (demographics, medical conditions, habits, and behaviors related to viral infection risk such as hand hygiene), and time-updated information from daily and weekly surveys. Exposures expected to influence the risk of viral infection that would then manifest as future symptoms several days later were evaluated using survey data for 6–12 days earlier. Consequently, only participants with at least one daily survey at least 6 days after the first were included in the pooled logistic regression models. Beginning with the subset of variables associated with incident symptoms at p < 0.1 in pooled logistic regression models adjusting only for age, sex, race, and calendar date (with linear and non-linear components), backward deletion was used to select multivariable models retaining covariates with p < 0.05. Statistical analyses were performed using Stata, version 16 (College Station, TX). Two-tailed p-values < 0.05 were considered statistically significant.
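Pooled logistic regression treats each participant-day at risk as a separate observation. The data layout this implies can be sketched as below, under assumptions not stated in the text: study days are integers, a participant becomes eligible 6 days after the first survey, and follow-up stops on the day of the first event. The function name and record fields are hypothetical, and the sketch does not reproduce the study's actual Stata code.

```python
def person_day_rows(first_day, last_day, event_day=None, min_lag=6):
    """Expand one participant into person-day records for a pooled logistic model.

    A row is emitted for each day at risk, from `first_day + min_lag` through the
    event day (event = 1 on that day only) or through `last_day` if no event
    occurred (event = 0 on every row).
    """
    end = event_day if event_day is not None else last_day
    return [
        {"day": day, "event": int(day == event_day)}
        for day in range(first_day + min_lag, end + 1)
    ]


# Hypothetical participants: one with an event on study day 9, one without.
rows_with_event = person_day_rows(first_day=0, last_day=20, event_day=9)
rows_without_event = person_day_rows(first_day=0, last_day=10)
print(len(rows_with_event), len(rows_without_event))  # 4 5
```

Stacking these rows across participants, with lagged exposures attached to each row, yields a dataset to which an ordinary logistic regression routine can be applied.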


After exclusions were applied, 14,335 participants were available and contributed to the incident analyses. Differences between these participants and those that entered the study reporting at least one viral symptom are shown in Table 1. Participants resided in all 50 states and in 93 countries outside the US. While a mean 42% ± 12% of all participants completed the daily survey each day, 95%-100% of all participants completed at least one daily survey per week throughout the study period, and weekly surveys were completed 66 ± 26% of the time (S1 Table).

Table 1. Baseline characteristics of participants with and without prevalent viral symptoms.

Over a median follow-up of 21 days (IQR 10–26 days), 424 (3%) participants developed incident viral symptoms. S2 Table shows the specific symptoms reported. Fig 1 illustrates the locations of participants with and without symptoms. Fig 2 provides a sample summary of enrollment, survey completion, symptom development, and follow-up over time.

Fig 1. Location of study participants.

Blue shading represents gradations of the number of participant-days within the US by county (left) and in the world by nation (right). Red shading depicts the number of symptomatic participants by location. Created with software provided by Tableau (San Francisco, CA) and published with their permission under the Creative Commons Attribution License (CC BY 4.0).

Fig 2. Heat map of symptomatic and sample of asymptomatic patients displaying time of enrollment, survey completion, time of symptom development, and follow-up.

The left plot depicts participants that developed symptoms. The right plot depicts participants who did not develop symptoms matched in a one-to-one fashion with each symptomatic case by time of enrollment. Each row represents a unique study participant (n = 424 for each plot). The X-axis represents days of the current study. Blue = weekly survey completed (the first blue represents the enrollment visit). Green = daily survey completed (the daily survey contents are included in the weekly survey). Red = symptoms developed. Black = after development of symptoms. White = no data entry prior to or in the absence of symptoms.

In minimally adjusted logistic models adjusting only for age, sex, race, ethnicity, and date, a higher level of education and subjective social scale, exercising at least once weekly, a longer average sleep duration, and sanitizing one’s phone were each associated with a lower risk while a history of anemia, hypertension or some immunodeficiency, cigarette smoking, e-cigarette use, marijuana use, having pets at home, having household members with viral symptoms, and the number of individuals with whom the participant interacted within six feet (about 1.83 meters) each predicted a higher risk of incident viral symptoms (Table 2). Pertinent characteristics that failed to exhibit statistically significant relationships included HIV status, hand washing practices, reported government restrictions, and, per geolocation measurements, amount of time at home and daily distance traveled. In the backwards stepwise logistic model, the following were retained: a higher level of subjective social status, exercise, and sanitizing one’s phone were each associated with a significantly lower risk of developing viral symptoms, whereas female sex, a history of anemia, hypertension, recent cigarette smoking, recent household contacts with viral symptoms, and the maximum number of individuals recently in contact with the participant within six feet (about 1.83 meters) were each associated with a significantly heightened risk of developing viral symptoms (Table 3).

In sensitivity analyses, excluding all participants with anemia did not meaningfully change the results (S3 Table). After including participants with atrial fibrillation, coronary artery disease, congestive heart failure, chronic obstructive pulmonary disease, and asthma, those with congestive heart failure and anemia exhibited a higher risk for incident symptoms, and all of the previously observed statistically significant relationships remained (S4 Table).


Among an international cohort involving collection of prospective and time-updated data, female sex, anemia, hypertension, recent cigarette smoking, living with someone with viral symptoms, and the maximum number of recent contacts within six feet (about 1.83 meters) outside the home each predicted a higher risk of developing viral symptoms. Conversely, a higher self-perceived social status, regular exercise, and sanitizing one’s phone were each associated with a lower risk of subsequently reporting viral symptoms.

As of July 25, 2020, there were more than 15 million confirmed cases of SARS-CoV-2 infections and more than 640,000 COVID-19-related deaths around the world [26]. In response, tremendous investment and efforts are being dedicated to enhance the availability of testing [27], identify effective therapies [28], and ultimately to develop a vaccine [29]. Although these remedies are being pursued at an unprecedented pace, the number of infections and deaths continues to grow, and, even after these new technologies, drugs, and vaccines are developed, additional time will be required to disseminate them. While studies of hospitalized patients are valuable, ultimately the characteristics, behaviors, and exposures of individuals in the general community associated with the development of viral symptoms can be helpful in several ways: to identify those at highest risk of developing these symptoms, which may help prioritize protecting the most vulnerable; to provide novel insights regarding the biology of current viral disease; and ideally to identify low-risk and modifiable behaviors that individuals might practice or avoid to reduce their own individual-level risk.

Clearly, self-reported symptoms of a viral infection do not equal SARS-CoV-2 infection. However, the specific viral symptoms repeatedly queried in our population were developed based on available evidence regarding the nature of SARS-CoV-2 infection [10–18], and similar viral diseases, including the common cold and influenza for example, very likely share common properties related to an individual’s susceptibility to infection [30]. In addition, independent of commonalities across easily transmissible viral diseases, there are shared phenomena related to the human immune system’s general vulnerability to viral infection [31]. Finally, due to the lack of universal testing, there is some evidence that surveillance for viral symptoms may itself have several advantages, often reflecting, either directly or indirectly, underlying COVID-19 disease [9,32,33].

The higher incidence of viral symptoms among women in our cohort runs counter to the prevailing evidence that men are at higher risk of both SARS-CoV-2 infection and related morbidity and mortality [34,35]. Indeed, population-based studies examining seropositivity for antibodies against SARS-CoV-2 have generally failed to demonstrate any differences by biological sex [36,37]. While this may suggest the viral symptoms detected by the current study fail to capture patterns most relevant to SARS-CoV-2, these differences may have occurred because our models adjusted for related mediators (such as smoking) [38] or because women are either more likely to experience or report more mild symptoms. Previous community-based studies often adjust or weight for sex distributions based on the population, which can often hinder a direct assessment of biological sex as a predictor itself [33,39]. Anemia and hypertension as risk factors are consistent with the general notion that other systemic comorbidities enhance the susceptibility to viral infection. While anemia may be a marker of general, non-specific disease, hypertension, not generally considered a risk factor for infectious diseases, has emerged as a consistent predictor of SARS-CoV-2 infection and associated complications [4–7]. The reasons for this are unclear, although angiotensin-converting-enzyme-2 (ACE2)-dependent cellular entry of the virus has been posited as a biologically plausible mechanism, assuming some connection with ACE2-dysregulation that is also associated with hypertension [40,41]. While optimizing blood pressure control reduces overall morbidity and mortality [42], the consequences on incident viral infection have not yet been fully elucidated.

Three at least theoretically readily modifiable exposures, each bolstered by biological plausibility and previous evidence, arose as risk factors for incident viral infection: smoking, household contacts, and the maximum number of recent interactions with other individuals within six feet (about 1.83 meters). The impact of smoking on the risk of SARS-CoV-2 has been difficult to study, as reliance on hospitalization data fails to provide a foundational study base to make comparisons. Conflicting data on the subject exist, with some evidence that smokers may have a lower risk of SARS-CoV-2 infection and other evidence they may experience a higher risk [43]; the great majority of these studies rely on data among those already infected rather than providing information from cohort studies that include a geographically heterogeneous cohort such as ours. Smoking may reduce the effectiveness of the immune response and may also upregulate ACE2, rendering individuals more prone to infection [44]. Or this observation may be related to an overall greater propensity to the symptoms of interest in the current study, rather than SARS-CoV-2 infection per se. The observation that sick household contacts predicted incident symptoms may provide evidence that these symptoms were in fact often due to a transmissible disease. For example, although the current analyses excluded those with prevalent symptoms (which reduces the chance symptoms arose from some chronic, ever-present, problem), shared symptoms within a household may have represented some common exposure or predisposition, such as an allergy; however, household symptoms preceding participant symptoms arose as a statistically significant predictor of incident symptoms, supporting viral infections as a culprit. Although physical distancing as a method to mitigate spread of infection is supported by the general understanding of the nature of infectious diseases, particularly respiratory viruses, our observation from prospective, repeatedly updated, individual-level data that the number of human to human physical interactions predicted viral symptoms may provide useful evidence in support of physical distancing.

Protective factors included a higher subjective social status, at least weekly exercise, and sanitizing one’s phone. We utilized the MacArthur subjective social status ladder as a validated single-item question to capture socioeconomic status [19,20]. A higher self-perceived social status may influence viral infection risk in several ways: more education may translate into a better understanding of disease risks and healthy behaviors, and employment among those of a higher socioeconomic status may be more flexible and less often involve high-risk environments. Conversely, stress and the allostatic load related to social determinants of health among those with a lower subjective social status may adversely affect the immune response to infection [45]. Regular exercise is an established means to improve immune function and the response to viral infection [46], now with evidence for beneficial effects specifically in the COVID-19 era. We recognize that sanitizing one’s phone as a protective factor may simply serve as a marker of more fastidious behaviors to minimize risk in general, but the ubiquity and frequent use of the smartphone, likely while shopping, while at work, and while interacting with others, would seem to make it a potentially potent fomite that could result in repeated exposure throughout the day and into the home.

Although we examined multiple predictors of incident viral symptoms, we do not believe that adjustment for multiple hypothesis testing would be appropriate for several reasons. First, all of our predictors had biological plausibility. Second, all covariates were adjusted for one another in our multivariable models. Third, beyond such mutual adjustment, we employed a backward stepwise elimination of covariates to only retain those achieving our prespecified statistical significance. Of note, the majority of statistically significant findings did exhibit particularly small p values that would have withstood even the most conservative multiple hypothesis testing adjustment (such as Bonferroni), but, for the reasons described above, we do not believe this would be appropriate as it would risk reclassifying true positives as false positives. These are also standard and well-accepted approaches to this sort of analysis [24].

Our study has several important limitations. The outcome of interest was viral symptoms, which relied on self-report. These findings therefore do not directly reflect any particular disease, including infection with SARS-CoV-2. Indeed, while we selected fever as a required symptom in hopes of capturing infection and at least one other symptom known to be associated with SARS-CoV-2 to help with specificity, the symptoms described are more applicable to respiratory viruses in general and could also occur outside the realm of infection. Importantly, we were able to leverage the prospective nature of our study with repeated assessments and exclude those with prevalent symptoms at baseline, which should help mitigate against contamination by those with chronic and non-infectious conditions. Some investigators have proposed and validated the creation of symptom-based scores for the accurate inference of COVID-19 [47], and future similar efforts may be able to harness the relative accessibility of patient or research participant self-report without reliance on biological assays. Multiple studies have sought to identify particular symptoms or combinations of symptoms most indicative of SARS-CoV-2 infection [14,25,32,33,47]; while the symptoms we employed generally match those associated with the disease in these studies, variability in populations and study design currently preclude the identification of an optimal approach to strike the intended balance regarding sensitivity versus specificity in such longitudinal cohort studies. Indeed, there is evidence the evolving nature of COVID-19 may produce variable manifestations over time, potentially hindering attempts to accurately characterize a true diagnosis by specific symptoms alone [48]. It is important to emphasize that we incorporated information into the study as it arose from the medical literature, such as (as described in the methods) gastrointestinal symptoms, loss of taste and smell, and eye symptoms, soon after they were reported in reputable sources; therefore, not every symptom was assessed at baseline, which could have led to under-ascertainment of SARS-CoV-2-related symptoms in some. However, those updates were incorporated into both baseline and daily surveys for all participants as they arose. As the study required smartphone use, it is possible our population represents a more technically savvy and perhaps more highly educated and affluent group than the general population. However, this would primarily limit generalizability and should not serve as a threat to internal validity. We assumed that the absence of a completed daily survey represented an absence of symptoms, which may have resulted in a loss of power to detect some relationships but addressed missing data in a fashion that did not risk creating spurious false positive associations. In addition, the participants were fairly geographically diverse, representing every state in the US and multiple countries. Although less than 80% of the study participants were non-Hispanic white, African American representation was relatively poor. Finally, although the data were collected prospectively and in a time-updated fashion, the study was observational, prone to residual and unmeasured confounding that should temper assumptions of causal effects.

In conclusion, female sex, anemia, hypertension, recent cigarette smoking, living with someone with recent viral symptoms, and the maximum number of recent contacts within six feet (about 1.83 meters) outside the home each predicted a higher risk of developing viral symptoms during the current COVID-19 pandemic. At the same time, a higher subjective social status, regular exercise, and sanitizing one’s phone each predicted a lower risk of developing viral symptoms.

Supporting information

S1 Table. Proportion of participants completing at least one daily survey per week and the proportion completing weekly surveys.


S2 Table. Specific symptoms reported in addition to fevers and chills.


S3 Table. Sensitivity analysis of independent predictors of incident symptoms excluding anemia.

Derived by backwards stepwise elimination of covariates (see Methods). * overall heterogeneity. † heterogeneity of non-reference levels. # linear trend.


S4 Table. Sensitivity analysis of independent predictors of incident symptoms including participants with atrial fibrillation, coronary artery disease, congestive heart failure, chronic obstructive pulmonary disease, and asthma.

Derived by backwards stepwise elimination of covariates (see Methods). * overall heterogeneity. † heterogeneity of non-reference levels. # linear trend.



  1. McAnulty JM, Ward K. Suppressing the Epidemic in New South Wales. N Engl J Med 2020;382:e74. pmid:32383832
  2. Wang J, Pan L, Tang S, Ji JS, Shi X. Mask use during COVID-19: A risk adjusted strategy. Environ Pollut 2020;266:115099. pmid:32623270
  3. Islam N, Sharp SJ, Chowell G, et al. Physical distancing interventions and incidence of coronavirus disease 2019: natural experiment in 149 countries. BMJ 2020;370:m2743. pmid:32669358
  4. Price-Haywood EG, Burton J, Fort D, Seoane L. Hospitalization and Mortality among Black Patients and White Patients with COVID-19. N Engl J Med 2020;382:2534–43. pmid:32459916
  5. Goyal P, Choi JJ, Pinheiro LC, et al. Clinical Characteristics of COVID-19 in New York City. N Engl J Med 2020;382:2372–4. pmid:32302078
  6. Bhatraju PK, Ghassemieh BJ, Nichols M, et al. COVID-19 in Critically Ill Patients in the Seattle Region—Case Series. N Engl J Med 2020;382:2012–22. pmid:32227758
  7. Richardson S, Hirsch JS, Narasimhan M, et al. Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area. JAMA 2020.
  8. Williamson EJ, Walker AJ, Bhaskaran K, et al. OpenSAFELY: factors associated with COVID-19 death in 17 million patients. Nature 2020.
  9. Silverman JD, Hupert N, Washburne AD. Using influenza surveillance networks to estimate state-specific prevalence of SARS-CoV-2 in the United States. Sci Transl Med 2020. pmid:32571980
  10. Martín-Sánchez FJ, Del Toro E, Cardassay E, et al. Clinical presentation and outcome across age categories among patients with COVID-19 admitted to a Spanish Emergency Department. Eur Geriatr Med 2020. pmid:32671732
  11. Huang C, Wang Y, Li X, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020;395:497–506. pmid:31986264
  12. Liu J, Cui M, Yang T, Yao P. Correlation between gastrointestinal symptoms and disease severity in patients with COVID-19: a systematic review and meta-analysis. BMJ Open Gastroenterol 2020;7. pmid:32665397
  13. Adorni F, Prinelli F, Bianchi F, et al. Self-reported symptoms of SARS-CoV-2 infection in a non-hospitalized population: results from the large Italian web-based EPICOVID19 cross-sectional survey. JMIR Public Health Surveill 2020.
  14. Lan FY, Filler R, Mathew S, et al. COVID-19 symptoms predictive of healthcare workers’ SARS-CoV-2 PCR results. PLoS One 2020;15:e0235460. pmid:32589687
  15. Gangaputra SS, Patel SN. Ocular Symptoms among Nonhospitalized Patients Who Underwent COVID-19 Testing. Ophthalmology 2020. pmid:32585259
  16. Grant MC, Geoghegan L, Arbyn M, et al. The prevalence of symptoms in 24,410 adults infected by the novel coronavirus (SARS-CoV-2; COVID-19): A systematic review and meta-analysis of 148 studies from 9 countries. PLoS One 2020;15:e0234765. pmid:32574165
  17. Dawson P, Rabold EM, Laws RL, et al. Loss of Taste and Smell as Distinguishing Symptoms of COVID-19. Clin Infect Dis 2020.
  18. Scalinci SZ, Trovato Battagliola E. Conjunctivitis can be the only presenting sign and symptom of COVID-19. IDCases 2020;20:e00774. pmid:32373467
  19. Goodman E, Adler NE, Kawachi I, Frazier AL, Huang B, Colditz GA. Adolescents’ perceptions of social status: development and evaluation of a new indicator. Pediatrics 2001;108:E31. pmid:11483841
  20. Cooper DC, Milic MS, Mills PJ, Bardwell WA, Ziegler MG, Dimsdale JE. Endothelial function: the impact of objective and subjective socioeconomic status on flow-mediated dilation. Ann Behav Med 2010;39:222–31. pmid:20376585
  21. Melvin RL, Xiao J, Godwin RC, Berenhaut KS, Salsbury FR Jr. Visualizing correlated motion with HDBSCAN clustering. Protein Sci 2018;27:62–75. pmid:28799290
  22. Castro Gertrudes J, Zimek A, Sander J, Campello R. A unified view of density-based methods for semi-supervised clustering and classification. Data Min Knowl Discov 2019;33:1894–952. pmid:32831623
  23. Alene M, Yismaw L, Assemie MA, Ketema DB, Gietaneh W, Birhan TY. Serial interval and incubation period of COVID-19: a systematic review and meta-analysis. BMC Infect Dis 2021;21:257. pmid:33706702
  24. Lessler J, Brookmeyer R, Reich NG, Nelson KE, Cummings DA, Perl TM. Identifying the probable timing and setting of respiratory virus infections. Infect Control Hosp Epidemiol 2010;31:809–15. pmid:20569117
  25. Tostmann A, Bradley J, Bousema T, et al. Strong associations and moderate predictive value of early symptoms for SARS-CoV-2 test positivity among healthcare workers, the Netherlands, March 2020. Euro Surveill 2020;25. pmid:32347200
  26. Center for Systems Science and Engineering at Johns Hopkins University. COVID-19 dashboard. ( 2020.
  27. Tromberg BJ, Schwetz TA, Pérez-Stable EJ, et al. Rapid Scaling Up of COVID-19 Diagnostic Testing in the United States—The NIH RADx Initiative. N Engl J Med 2020. pmid:32706958
  28. Wang D, Li Z, Liu Y. An overview of the safety, clinical application and antiviral research of the COVID-19 therapeutics. J Infect Public Health 2020.
  29. COVID-19 therapies and vaccine landscape. Nature materials 2020;19:809. pmid:32704138
  30. Kutter JS, Spronken MI, Fraaij PL, Fouchier RA, Herfst S. Transmission routes of respiratory viruses among humans. Curr Opin Virol 2018;28:142–51. pmid:29452994
  31. Rouse BT, Sehrawat S. Immunity and immunopathology to viruses: what decides the outcome? Nat Rev Immunol 2010;10:514–26. pmid:20577268
  32. Wynants L, Van Calster B, Collins GS, et al. Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal. BMJ 2020;369:m1328. pmid:32265220
  33. Menni C, Valdes AM, Freidin MB, et al. Real-time tracking of self-reported symptoms to predict potential COVID-19. Nat Med 2020;26:1037–40. pmid:32393804
  34. Williamson EJ, Walker AJ, Bhaskaran K, et al. Factors associated with COVID-19-related death using OpenSAFELY. Nature 2020;584:430–6. pmid:32640463
  35. Peckham H, de Gruijter NM, Raine C, et al. Male sex identified by global COVID-19 meta-analysis as a risk factor for death and ITU admission. Nat Commun 2020;11:6317. pmid:33298944
  36. Pagani G, Conti F, Giacomelli A, et al. Seroprevalence of SARS-CoV-2 significantly varies with age: Preliminary results from a mass population screening. J Infect 2020;81:e10–e2. pmid:32961253
  37. Pollan M, Perez-Gomez B, Pastor-Barriuso R, et al. Prevalence of SARS-CoV-2 in Spain (ENE-COVID): a nationwide, population-based seroepidemiological study. Lancet 2020;396:535–44. pmid:32645347
  38. Cai H. Sex difference and smoking predisposition in patients with COVID-19. The Lancet Respiratory medicine 2020;8:e20. pmid:32171067
  39. Menachemi N, Yiannoutsos CT, Dixon BE, et al. Population Point Prevalence of SARS-CoV-2 Infection Based on a Statewide Random Sample—Indiana, April 25–29, 2020. MMWR Morb Mortal Wkly Rep 2020;69:960–4. pmid:32701938
  40. Gold MS, Sehayek D, Gabrielli S, Zhang X, McCusker C, Ben-Shoshan M. COVID-19 and comorbidities: a systematic review and meta-analysis. Postgrad Med 2020:1–7. pmid:32573311
  41. Liu PP, Blet A, Smyth D, Li H. The Science Underlying COVID-19: Implications for the Cardiovascular System. Circulation 2020;142:68–78. pmid:32293910
  42. Benjamin EJ, Blaha MJ, Chiuve SE, et al. Heart Disease and Stroke Statistics-2017 Update: A Report From the American Heart Association. Circulation 2017;135:e146–e603. pmid:28122885
  43. Haddad C, Bou Malhab S, Sacre H, Salameh P. Smoking and COVID-19: A Scoping Review. Tob Use Insights 2021;14:1179173X21994612. pmid:33642886
  44. Farsalinos K, Barbouni A, Poulas K, Polosa R, Caponnetto P, Niaura R. Current smoking, former smoking, and adverse outcome among hospitalized COVID-19 patients: a systematic review and meta-analysis. Ther Adv Chronic Dis 2020;11:2040622320935765. pmid:32637059
  45. Juster RP, Sindi S, Marin MF, et al. A clinical allostatic load index is associated with burnout symptoms and hypocortisolemic profiles in healthy workers. Psychoneuroendocrinology 2011;36:797–805. pmid:21129851
  46. Simpson RJ, Kunz H, Agha N, Graff R. Exercise and the Regulation of Immune Functions. Prog Mol Biol Transl Sci 2015;135:355–80. pmid:26477922
  47. Bastiani L, Fortunato L, Pieroni S, et al. Rapid COVID-19 Screening Based on Self-Reported Symptoms: Psychometric Assessment and Validation of the EPICOVID19 Short Diagnostic Scale. J Med Internet Res 2021;23:e23897. pmid:33320825
  48. Lan FY, Filler R, Mathew S, et al. Evolving virulence? Decreasing COVID-19 complications among Massachusetts healthcare workers: a cohort study. Pathog Glob Health 2021;115:4–6. pmid:33191880