Patterns of COVID-19 testing and mortality by race and ethnicity among United States veterans: A nationwide cohort study

Background
There is growing concern that racial and ethnic minority communities around the world are experiencing a disproportionate burden of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and coronavirus disease 2019 (COVID-19). We investigated racial and ethnic disparities in patterns of COVID-19 testing (i.e., who received testing and who tested positive) and subsequent mortality in the largest integrated healthcare system in the United States.

Methods and findings
This retrospective cohort study included 5,834,543 individuals receiving care in the US Department of Veterans Affairs; most (91%) were men, 74% were non-Hispanic White (White), 19% were non-Hispanic Black (Black), and 7% were Hispanic. We evaluated associations between race/ethnicity and receipt of COVID-19 testing, a positive test result, and 30-day mortality, with multivariable adjustment for a wide range of demographic and clinical characteristics including comorbid conditions, health behaviors, medication history, site of care, and urban versus rural residence. Between February 8 and July 22, 2020, 254,595 individuals were tested for COVID-19, of whom 16,317 tested positive and 1,057 died. Black individuals were more likely to be tested (rate per 1,000 individuals: 60.0, 95% CI 59.6–60.5) than Hispanic (52.7, 95% CI 52.1–53.4) and White individuals (38.6, 95% CI 38.4–38.7). While individuals from minority backgrounds were more likely to test positive (Black versus White: odds ratio [OR] 1.93, 95% CI 1.85–2.01, p < 0.001; Hispanic versus White: OR 1.84, 95% CI 1.74–1.94, p < 0.001), 30-day mortality did not differ by race/ethnicity (Black versus White: OR 0.97, 95% CI 0.80–1.17, p = 0.74; Hispanic versus White: OR 0.99, 95% CI 0.73–1.34, p = 0.94). The disparity between Black and White individuals in testing positive for COVID-19 was stronger in the Midwest (OR 2.66, 95% CI 2.41–2.95, p < 0.001) than the West (OR 1.24, 95% CI 1.11–1.39, p < 0.001). The disparity in testing positive for COVID-19 between Hispanic and White individuals was consistent across region, calendar time, and outbreak pattern. Study limitations include underrepresentation of women and a lack of detailed information on social determinants of health.

Conclusions
In this nationwide study, we found that Black and Hispanic individuals are experiencing an excess burden of SARS-CoV-2 infection not entirely explained by underlying medical conditions or where they live or receive care. There is an urgent need to proactively tailor strategies to contain and prevent further outbreaks in racial and ethnic minority communities.


Why was this study done?
• There is growing concern that racial and ethnic minority communities around the world are experiencing a disproportionate burden of morbidity and mortality from symptomatic severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection or coronavirus disease 2019 (COVID-19).
• Most studies investigating racial and ethnic disparities to date have focused on those who tested positive for SARS-CoV-2 or hospitalized patients.
• No single study to our knowledge has yet investigated racial and ethnic disparities in testing patterns (i.e., who received testing and who tested positive) as well as COVID-19 outcomes in a nationwide cohort with adequate adjustment for potential confounders.
What did the researchers do and find?
• We used electronic health records from the largest integrated healthcare system in the US to investigate racial and ethnic disparities in testing and subsequent COVID-19 mortality.

Introduction
The United States has the highest number of reported symptomatic severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections and related deaths in the world, accounting for one-fourth of global totals as of July 22, 2020 [1]. There is growing concern that racial and ethnic minority communities are experiencing a disproportionate burden of morbidity and mortality from symptomatic SARS-CoV-2 infection or coronavirus disease 2019 (COVID-19) [2][3][4][5][6][7][8]. One statewide study investigating racial disparities followed 3,481 COVID-19 cases in Louisiana and found that non-Hispanic Black individuals represented 77% of hospitalizations and 71% of deaths despite only making up 31% of the total source population [9]. Thus, the potential for racial and ethnic disparities in COVID-19 has been deemed an urgent public health research priority [10]. However, most studies investigating racial and ethnic disparities have focused on hospitalized patients or have not characterized who received testing or tested positive for COVID-19 [9][11][12][13][14][15]. Given that COVID-19 testing was not performed at random, particularly in the early phases of the pandemic, evaluating underlying testing patterns and changes over time may provide important context for interpreting findings from models of COVID-19 outcomes. In addition, it is not yet known whether disparities in COVID-19 infection or severe outcomes are explained, at least in part, by differences in underlying health conditions, smoking and alcohol use, geographic location, or urban versus rural residence, which is essential information if we are to design effective interventions. The electronic health record database of the Department of Veterans Affairs (VA) offers the single largest nationwide data resource available with the necessary information on systemwide testing and detailed medical histories to examine racial and ethnic disparities in the US.
We evaluated associations between race/ethnicity and receipt of COVID-19 testing, a positive test result, and 30-day mortality, conditioning each analysis on the previous outcome and accounting for a wide range of demographic and clinical characteristics through July 22, 2020.

Data source
The VA is the largest integrated healthcare system in the US and comprises over 1,200 points of care (i.e., sites) nationwide including hospitals, medical centers, and community outpatient clinics. All care is recorded in an electronic health record with daily uploads into the VA Corporate Data Warehouse. Available data include demographics, outpatient and inpatient encounters, diagnoses, smoking and alcohol health behaviors, and pharmacy dispensing records.
This study was approved by the institutional review boards of VA Connecticut Healthcare System and Yale University. It has been granted a waiver of informed consent and is Health Insurance Portability and Accountability Act compliant. The analyses herein were not prespecified in a formal protocol but were instead informed by hypotheses drawn from prior work [16]. This study is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines (S1 STROBE Checklist).

Sample, follow-up, and outcomes
All individuals in clinical care (defined as having at least 1 clinical encounter between January 1, 2018, and December 31, 2019, and alive as of January 1, 2020) were included in this analysis. We identified individuals tested for COVID-19 from the date of the VA's first recorded test, on February 8, 2020, through July 22, 2020, by using text searching of laboratory results containing terms consistent with SARS-CoV-2 or COVID-19. Nearly all tests utilized nasopharyngeal swabs; 1% were from other sources. Testing was performed in VA, state public health, and commercial reference laboratories using FDA Emergency Use Authorization-approved SARS-CoV-2 assays. We did not include antibody tests in this analysis.
If an individual had more than 1 test and all were negative, we selected the date of the first negative test; otherwise we used the date of the first positive test. Baseline for individuals tested for COVID-19 was defined as the date of specimen collection unless testing occurred during hospitalization, in which case baseline was defined as the date of admission. If the admission began more than 14 days prior to testing, which may indicate hospital-acquired infection, we set baseline to 14 days prior to testing to better capture health status prior to SARS-CoV-2 infection. We examined 3 outcomes: (1) receipt of COVID-19 testing among all in care, (2) receipt of a positive test result among individuals tested for COVID-19, and (3) 30-day mortality among COVID-19 cases. Deaths were ascertained using inpatient records and VA death registry data to capture deaths outside of hospitalization. The choice of 30-day mortality as the outcome was guided by the distribution of mortality events by time since testing positive for COVID-19 (50th, 75th, 90th percentile time to death: 12, 20, 30 days) and to allow for sufficient follow-up within the study period. While there were some deaths beyond 30 days after testing positive for COVID-19, we were less certain that these deaths could be attributed to COVID-19. Given the low number of deaths after 30 days, 30-day mortality may be a reasonable proxy for case fatality rate. However, until longer follow-up has accrued, it remains to be seen whether those who develop symptomatic COVID-19 experience longer term excess mortality.
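The index-test and baseline rules stated at the start of this section can be sketched in a few lines. This is an illustrative sketch only, not the study's code (the analyses were run in SAS); the `LabResult` record and the function names `index_test` and `baseline_date` are hypothetical, and the handling of an admission that began exactly 14 days before testing is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class LabResult:
    """Hypothetical record for one SARS-CoV-2 test."""
    collected: date                  # specimen collection date
    positive: bool
    admitted: Optional[date] = None  # admission date if tested during a hospitalization


def index_test(results: List[LabResult]) -> LabResult:
    """Return the first positive test if any exists, otherwise the first negative test."""
    positives = [r for r in results if r.positive]
    pool = positives if positives else results
    return min(pool, key=lambda r: r.collected)


def baseline_date(result: LabResult) -> date:
    """Collection date for outpatients; admission date for inpatients,
    but never earlier than 14 days before specimen collection."""
    if result.admitted is None:
        return result.collected
    earliest_allowed = result.collected - timedelta(days=14)
    return max(result.admitted, earliest_allowed)
```

For example, a person tested on April 30 during an admission that began on April 1 would be assigned a baseline of April 16 under this sketch, whereas an admission that began on April 25 would give a baseline of April 25.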

Variables
The primary exposure variable was self-reported race/ethnicity (non-Hispanic White [White], non-Hispanic Black [Black], and Hispanic). Analyses of other racial and ethnic backgrounds were underpowered at the time of this analysis, and therefore individuals who self-reported race/ethnicity other than White, Black, or Hispanic were excluded from the study population.
We selected demographic and clinical characteristics that have been evaluated in prior COVID-19 reports and could potentially mediate or explain racial/ethnic disparities in COVID-19 positivity and mortality. Demographics included age at baseline, sex, and rural/urban residence. Rural/urban residence was defined using geographic information system coding based upon established criteria [17]. Clinical characteristics were based on diagnostic codes for asthma, any cancer, chronic obstructive pulmonary disease (COPD), chronic kidney disease, diabetes mellitus, hypertension, liver disease, vascular disease, and alcohol use disorder (definitions provided in S1 Table). Presence of conditions was determined by 1 inpatient or 2 outpatient diagnoses in the 2 years prior to baseline, except for cancer, which was considered present if diagnosed ever prior to baseline. Diagnoses made in the 7 days prior to baseline were not included. We used validated algorithms to capture smoking status [18] and alcohol consumption [19]. We collected pharmacy fills for angiotensin converting enzyme (ACE) inhibitors and angiotensin II receptor blockers (ARBs) and identified individuals with active prescriptions in the 30 days prior to baseline. Missing data for smoking and alcohol consumption affected only 5% of individuals included in multivariable models; thus, complete case analysis was performed.
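The rule for deciding whether a comorbid condition is present (1 inpatient or 2 outpatient diagnoses in the 2 years prior to baseline, excluding the 7 days before baseline) can be sketched as below. This is a minimal illustration, not the study's implementation. Several details are assumptions that the text does not specify: "2 years" is taken as 730 days, the excluded window is taken as the 7 calendar days immediately before baseline plus baseline itself, and for cancer (`ever=True`) only the lookback is widened while the same count rule and 7-day exclusion are kept. The function name `condition_present` and the tuple layout are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable, Tuple

# (diagnosis date, setting), where setting is "inpatient" or "outpatient"
Diagnosis = Tuple[date, str]


def condition_present(diagnoses: Iterable[Diagnosis], baseline: date,
                      ever: bool = False) -> bool:
    """Apply the 1-inpatient-or-2-outpatient rule within the lookback window."""
    # Assumption: diagnoses dated within 7 days before baseline are dropped.
    window_end = baseline - timedelta(days=7)
    # Assumption: a 2-year lookback is 730 days; `ever` removes the lower bound.
    window_start = date.min if ever else baseline - timedelta(days=730)

    eligible = [(d, s) for d, s in diagnoses if window_start <= d < window_end]
    n_inpatient = sum(1 for _, s in eligible if s == "inpatient")
    n_outpatient = sum(1 for _, s in eligible if s == "outpatient")
    return n_inpatient >= 1 or n_outpatient >= 2
```

Under these assumptions, a single outpatient code would not count as a condition, while a single inpatient code recorded 4 days before baseline would be ignored because it falls inside the excluded window.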
We also created variables to assess potential variation in racial/ethnic disparities by calendar time, region, and outbreak pattern. We split the population into 3 groups based on date of COVID-19 test: February 8 to April 21, April 22 to June 21, and June 22 to July 22. States were grouped into 4 US Census regions (i.e., West, South, Midwest, and Northeast) [20]. Outbreak patterns were based on site-level percentage of positive tests per month among sites with at least 100 positive COVID-19 tests: early (≥10% in March or April), late (≥10% in June or July), resurgent (≥10% in March or April and June or July), steady (<10% in all months), and other (sites with <100 positive tests).
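The calendar-period and outbreak-pattern groupings described above amount to a small classification rule, sketched here for illustration. The function names, the dictionary layout (month number mapped to percent positive), and the string labels are hypothetical. The text does not say how a site exceeding 10% only in a month outside March, April, June, and July would be handled, so this sketch returns "unclassified" in that case as an explicit assumption.

```python
from datetime import date
from typing import Dict


def calendar_period(test_date: date) -> str:
    """Assign a test date to one of the 3 study periods (all in 2020)."""
    if date(2020, 2, 8) <= test_date <= date(2020, 4, 21):
        return "Feb 8 to Apr 21"
    if date(2020, 4, 22) <= test_date <= date(2020, 6, 21):
        return "Apr 22 to Jun 21"
    if date(2020, 6, 22) <= test_date <= date(2020, 7, 22):
        return "Jun 22 to Jul 22"
    raise ValueError("test date outside the study window")


def outbreak_pattern(pct_positive_by_month: Dict[int, float],
                     n_positive: int, threshold: float = 10.0) -> str:
    """Classify a site from its monthly percent-positive values."""
    if n_positive < 100:
        return "other"
    early = any(pct_positive_by_month.get(m, 0.0) >= threshold for m in (3, 4))
    late = any(pct_positive_by_month.get(m, 0.0) >= threshold for m in (6, 7))
    if early and late:
        return "resurgent"
    if early:
        return "early"
    if late:
        return "late"
    if all(v < threshold for v in pct_positive_by_month.values()):
        return "steady"
    # Assumption: the source does not define this case.
    return "unclassified"
```

Checking the "other" category first mirrors the definition, which restricts the four named patterns to sites with at least 100 positive tests.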

Statistical analysis
We calculated COVID-19 testing rate per 1,000 individuals in care and Clopper-Pearson 95% confidence intervals (CIs) by race/ethnicity category. Among those tested for COVID-19, we calculated percent testing positive and 95% CIs by race/ethnicity category. Logistic regression models were used to estimate associations between race/ethnicity and COVID-19 positivity and mortality, adjusting for sets of potential mediators of such disparities, moving from more distal to more proximate determinants of health. Age-adjusted models included race/ethnicity and age. Demographic-adjusted models additionally included sex and rural/urban residence, and were conditioned on site of care. Fully adjusted models additionally included all clinical covariates, substance use, and medication history. We report the estimates of each individual adjustment as well as those arising from a fully adjusted model. We repeated this modeling strategy to estimate odds ratios (ORs) and 95% CIs between race/ethnicity and 30-day mortality among those who tested positive for COVID-19 on or prior to June 21, 2020, to allow all individuals 30 days of follow-up. We evaluated variation in racial/ethnic disparities in testing positive for COVID-19 by stratifying the fully adjusted model by calendar time, geographic region, and site-level outbreak pattern. In sensitivity analyses, we restricted ascertainment of 30-day mortality to only include inpatient deaths to test the robustness of the associations found in the primary models. Analyses were performed using SAS version 9.4 (SAS Institute, Cary, NC, US). R version 3.6.3 was used to map COVID-19 cases nationwide.
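To make the first step concrete, the sketch below computes a rate per 1,000 and an exact (Clopper-Pearson) binomial interval. It is an illustration under stated assumptions, not the SAS code used in the study: the interval is found by bisection on the binomial distribution function using only the Python standard library, which is practical for small counts but not for cohort-sized denominators. The function names are hypothetical.

```python
from math import comb


def rate_per_1000(events: int, population: int) -> float:
    """Crude rate per 1,000 individuals."""
    return 1000.0 * events / population


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))


def _solve_decreasing(f, target: float) -> float:
    """Bisection for f(p) = target, where f is decreasing in p on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    # Lower bound p_L satisfies P(X >= k | p_L) = alpha/2,
    # i.e. P(X <= k-1 | p_L) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1.0 - alpha / 2.0)
    # Upper bound p_U satisfies P(X <= k | p_U) = alpha/2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2.0)
    return lower, upper
```

Applied to the totals reported in this paper, `rate_per_1000(254595, 5834543)` reproduces the overall testing rate of 43.6 per 1,000. For the interval itself, a statistical package that evaluates beta-distribution quantiles directly would be the usual choice at this scale.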

Results
There were 5,834,543 individuals in care prior to the COVID-19 pandemic. Most (91%) were men, 74% were White, 19% were Black, and 7% were Hispanic (Table 1). Age distributions were similar by race/ethnicity, and age ranged from 20 to 105 years, with 24% less than 50 years, 35% 50-69 years, 28% 70-79 years, and 13% ≥80 years of age. Of these, 254,595 (43.6 per 1,000 individuals in care) were tested for COVID-19, of whom 65% were White, 26% were Black, and 9% were Hispanic. There were 16,317 (6.4%) individuals who tested positive for COVID-19 between February 8 and July 22, 2020, of whom 44% were White, 40% were Black, and 16% were Hispanic. While 66% of all individuals in care resided in urban areas, 76% of those tested and 87% of those testing positive for COVID-19 resided in urban areas. The geographic distribution of COVID-19 cases in the VA was similar to the pattern of known hotspots in the general population, including in the Northeast, South, and some Midwestern states (Fig 1). Several VA sites with the highest proportion of positive COVID-19 tests also performed a higher volume of tests and had the highest proportion of Black individuals in care, including New York City, New Orleans, and Chicago (Fig 1).

[...] and somewhat attenuated among Hispanic individuals (OR 2.56, 95% CI 2.44-2.69, p < 0.001) (Table 3). These associations further attenuated after additionally accounting for sex, rural/urban residence, and site of care among Black (OR 1.92, 95% CI 1.85-2.00, p < 0.001) and Hispanic individuals (OR 1.96, 95% CI 1.86-2.07, p < 0.001). These estimates were robust to any individual (S2 Table) or combined adjustment for comorbidities, substance use, and medication history. In fully adjusted models, Black (OR 1.93, 95% CI 1.85-2.01, p < 0.001) and Hispanic (OR 1.84, 95% CI 1.74-1.94, p < 0.001) individuals remained at increased odds of testing positive for COVID-19 (Fig 2).
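As a reading aid, a reported OR and its 95% CI can be checked for internal consistency. The sketch below assumes the intervals are Wald-type (symmetric on the log-odds scale), which is an assumption on our part rather than something stated in the text; the function names are hypothetical. Under that assumption, the geometric midpoint of the CI bounds should match the point estimate, and the standard error, z statistic, and two-sided p-value can be approximately recovered from the published bounds.

```python
from math import erfc, log, sqrt

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def geometric_midpoint(ci_low: float, ci_high: float) -> float:
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return sqrt(ci_low * ci_high)


def wald_from_ci(odds_ratio: float, ci_low: float, ci_high: float):
    """Recover (standard error of log OR, z statistic, two-sided p-value)."""
    se = (log(ci_high) - log(ci_low)) / (2.0 * Z_975)
    z = log(odds_ratio) / se
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided normal tail probability
    return se, z, p_value
```

For the fully adjusted Black versus White estimate (OR 1.93, 95% CI 1.85-2.01), the geometric midpoint of the bounds is about 1.93, in agreement with the reported value. Because published bounds are rounded to 2 decimal places, recovered p-values are approximate.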

Discussion
This study examined racial and ethnic disparities in testing and subsequent COVID-19 mortality among approximately 6 million individuals receiving care in the US. We found that Black and Hispanic individuals were more likely to be tested and to test positive for COVID-19 than White individuals, even after comprehensive adjustment for underlying health conditions, other demographics, and geographic location. Among the variables assessed in this study, age, rural/urban residence, and site of care explained more of the racial/ethnic disparity in testing positive for COVID-19 than comorbidities, substance use, or medication history. While the disparity between Black and White individuals decreased over time, the disparity was strongest in the Midwest and at VA sites that experienced an early or resurgent outbreak. There was no variation observed in the disparity between Hispanic and White individuals by calendar time, region, or outbreak pattern. While individuals from minority backgrounds appeared to experience excess burden of COVID-19, among those infected, there was no observed difference in 30-day mortality by race/ethnicity group. The apparent racial/ethnic disparity in mortality in unadjusted data was principally explained by differing age structures between the populations.

Key strengths and limitations
This study elucidated racial and ethnic disparities in testing patterns of COVID-19 independent of underlying health status and other key factors in a nationwide sample. Strengths of this study included that it was based on well-annotated electronic health record data from a team with decades of experience using VA data, enabling a rapid and reliable analysis of COVID-19 outcomes by race and ethnicity. This analysis utilized patients' records from an entire healthcare system, which made it less prone to collider bias (i.e., non-random selection of individuals into a study) than other COVID-19 studies limited to individuals testing positive or admitted to hospital [21]. Unlike other nationwide healthcare systems, linkage to COVID-19 testing data or outcomes was not required as the integrated nature of VA healthcare provided at over 1,200 sites allows all information to be stored in its Corporate Data Warehouse. We used validated algorithms to accurately extract information on and adjust models for a wide range of clinical, behavioral, and geographic factors, with very little missingness in the data. The scale of VA data also allowed us to assess the impact of COVID-19 separately across multiple racial and ethnic minority groups; combining or limiting analyses to a single minority group would have masked important differences between Black and Hispanic individuals. We continue to monitor COVID-19 outcomes for individuals of other minority backgrounds and plan to follow up these analyses when there are sufficient numbers for analysis. While this analysis adds information, its limitations must be kept in mind. First, this study was conducted on veterans currently receiving care in the VA, who are older and have a higher prevalence of chronic health conditions and risk behaviors than the general US population [22][23][24]. 
However, prior research has established that after adjusting for age, sex, race/ethnicity, region, and rural/urban residence, all of which were included in this study, there is no difference in total disease burden between veterans and non-veterans [24]. Our key finding of no observed racial disparity of COVID-19 mortality has also been shown in a smaller non-veteran population [9]; thus, associations reported in this study are likely generalizable to the wider US population. Second, while individuals in VA care represent a diversity of backgrounds, women represented a small proportion of individuals in the sample. Thus, our analysis was not powered to assess interactions between sex and race/ethnicity. Third, beyond adjusting for rural/urban location and site of testing, we were not able to explore likely social determinants of the pronounced differential burden of COVID-19 among minority individuals. More detailed information on nursing home residence and socioeconomic status (e.g., type of employment, income, number of individuals in household) was unavailable or not consistently recorded in VA data, as is the case in most other electronic health record data sources. Fourth, as is true outside the VA, only a small proportion of individuals have been tested (~5%), and rates of testing vary by site and within important subgroups. However, while initial testing was limited, by mid-April the VA began testing all individuals admitted to hospital and before any inpatient or outpatient procedures, even in those not suspected to have COVID-19. Our models for testing positive should be cautiously interpreted as a proxy of odds of infection since those with mild symptoms were unlikely to have received testing, particularly in the early stages of the outbreak.

Findings in context
Our findings of racial and ethnic disparities in COVID-19 provide important distinctions from previous reports in the US and other countries with ethnically diverse populations. To our knowledge, one of the largest studies to date on racial disparities in COVID-19 outcomes in the US followed 3,481 COVID-19 cases in the state of Louisiana and found that non-Hispanic Black individuals represented 77% of hospitalizations and 71% of deaths despite only making up 31% of the total source population [9]. However, this study was based on patients who tested positive for COVID-19 in a statewide healthcare system and was underpowered to investigate ethnic minorities. We were able to expand the scope of this finding nationally and to include Hispanic individuals. In the UK, which was the first country with a broadly ethnically diverse population to experience a COVID-19 outbreak [25], a study of 17 million individuals showed that those from minority backgrounds had a substantially higher risk of mortality from COVID-19, which was not fully explained by underlying health conditions or social deprivation [26]. While our study also found racial and ethnic disparities, we found that these disparities occurred primarily at a stage prior to hospitalization (i.e., testing positive for COVID-19). We found no evidence of racial or ethnic disparities in 30-day mortality once models were restricted to those who tested positive for COVID-19. Our findings may be an underestimate of the US population risk as health disparities in the VA tend to be smaller than in the private sector [27]. Nevertheless, at a population level the substantial excess burden of SARS-CoV-2 infection among Black and Hispanic individuals inevitably translates to excess COVID-19 mortality in these communities.
We demonstrated that Black and Hispanic individuals were more likely to test positive than their White counterparts even after accounting for underlying health conditions, other demographics, rural/urban residence, and site of care. Based on experience with the 1918 Spanish flu and the 2009 H1N1 epidemic, public health experts have warned that racial and ethnic minority populations may be at higher risk during infectious disease outbreaks due to underlying health conditions, lower access to care, and socioeconomic conditions [28,29]. Notably, our analysis found that underlying health conditions did not explain any of the disparity between racial/ethnic groups in the odds of testing positive for COVID-19 or subsequent mortality in models already accounting for demographics, principally age, rural/urban residence, and VA site of care. This is essential information to help guide effective interventions. Prior reports have also highlighted that members of racial and ethnic minorities are more likely to live in densely populated areas or multigenerational households, and minority groups are overrepresented in jails, prisons, and detention centers, all of which lead to reduced capacity to implement physical distancing [30][31][32][33][34]. Similarly, Black and Hispanic workers are more likely than their White counterparts to be workers in essential industries, who continue to work outside the home despite outbreaks in their communities, making them more prone to exposure and therefore infection [34][35][36].
We found substantial variation in the disparity between Black and White individuals in testing positive for COVID-19 by geographic region, with stronger disparity observed in the Midwest than all other regions, and disparity most attenuated in the West. Further breakdown of groups within the Black community (e.g., African American, Afro-Caribbean, African), which could potentially reveal additional variation, is not captured in VA data. The observed disparities may be due to differential social determinants of health between Black and White individuals across regions. A US Census Bureau report showed that while racial residential segregation has diminished over time nationally, communities in the Midwest remained less integrated than in the West [37]. If community-level exposure is driving risk of SARS-CoV-2 infections, then the disparity in testing positive for COVID-19 may be lower in regions with greater integration between White and Black residents, as is the case in the West. We also found that the disparity between Black and White individuals in testing positive slightly decreased over the study period and was highest at VA sites that experienced an early outbreak of COVID-19. This finding may be partially explained by the increased attention on racial disparities in COVID-19 in the media [3][4][5] that may have impacted behaviors like wearing face coverings in public to reduce the spread of infection [38].
Interestingly, the ethnic disparity between Hispanic and White individuals in testing positive for COVID-19 was consistent across time, geographic region, and outbreak pattern, and was observed in every stratum. The lack of variation over time may be explained, in part, by less nationwide media coverage and epidemiological investigations of outbreaks of COVID-19 in Hispanic communities. Importantly, the Hispanic population in the US comprises a wide array of ethnic communities (e.g., Mexican, Puerto Rican, Cuban). However, these distinctions are not captured in VA data. The umbrella grouping may mask any potential variation within such a heterogeneous population. Further research on the impact of COVID-19 in Hispanic and Latinx communities is urgently needed.
Testing rates for COVID-19 in the VA were higher among Black and Hispanic individuals compared to White individuals. Local reporting from metropolitan areas with large minority populations, including New York City [39] and Chicago [40], has highlighted the disproportionate impact of COVID-19 in minority communities. We showed that VA facilities in these cities and others around the country that conducted the highest number of COVID-19 tests also had the highest proportion of Black individuals in care. There were also differences in the rate of COVID-19 testing and the proportion testing positive by age, sex, and type of residence across race/ethnicity groups. These findings demonstrate the need for epidemiological investigations to characterize testing patterns in the underlying population as they provide important context for interpretations of models of COVID-19 outcomes. To our knowledge, the largest medical record study to date analyzed COVID-19-related mortality in a population of 17 million residents in the UK, the vast majority of whom never tested for COVID-19 [26]. While the authors identified ethnic disparities in COVID-19-related mortality, the estimates reported can be interpreted as the overall burden of mortality by ethnicity without accounting for underlying testing patterns. We found a similar disparity in the overall burden of COVID-19-related death in the full source population of approximately 6 million individuals in care at the VA. However, when the model was restricted to individuals testing positive (which inherently accounts for factors related to access to testing, non-random testing, and odds of infection), racial and ethnic disparities in mortality were no longer observed.

Policy implications
These findings underscore the urgent need to proactively tailor strategies to contain and prevent further outbreaks in the US, principally focused on testing and getting individuals into care. Black and Hispanic communities are at increased risk of infection, justifying increased intensity of intervention. Our findings of variation in disparities over time and across geographic regions highlight the important need for community-based interventions at a state and local level to contain further exposure and outbreaks of COVID-19, particularly tailored to minority communities. Other interventions may include clinical decision support tools to prompt educational and testing interventions based upon an individualized risk assessment of testing positive for COVID-19. Outreach products about COVID-19 testing and disparities should also be distributed to patient advocates and groups at all points of care.

Future research
We appeal to other researchers investigating racial and ethnic disparities to perform analyses on the entire population at risk for COVID-19 where data are available, and to compare findings at each stage in the clinical course of COVID-19, from testing to outcomes. In this paper, we focused only on 30-day mortality among COVID-19 cases. We plan to explore other outcomes, including hospitalization, intensive care, and intubation, in subsequent analyses to examine whether racial and ethnic disparities exist in the clinical course of COVID-19 after testing positive and before death. Among other factors, future research should consider the role of other social determinants of health, including employment type, number of individuals in household, nursing home residence, and incarceration. Other racial and ethnic minorities in the US deserve attention, and while we did not have enough statistical power to include other groups in this analysis, we will continue to monitor these numbers for future research.

Table. Individual adjustments for the association between race/ethnicity and COVID-19 positivity and mortality. (DOCX)