Interpreting COVID-19 deaths among nursing home residents in the US: The changing role of facility quality over time

A report published last year by the Centers for Medicare & Medicaid Services (CMS) highlighted that COVID-19 case counts are more likely to be high in lower quality nursing homes than in higher quality ones. Since then, multiple studies have examined this association, with a handful also exploring the role of facility quality in explaining resident deaths from the virus. Despite this wide interest, no previous study has investigated how the relation between quality and COVID-19 mortality among nursing home residents may have changed, if at all, over the course of the pandemic. This understanding is lacking because prior studies are either cross-sectional or limited to one specific state or region of the country. To address this gap, we analyzed changes in nursing home resident deaths across the US between June 1, 2020 and January 31, 2021 (n = 12,415 nursing homes × 8 months) using both descriptive and multivariable statistics. We merged publicly available data from multiple federal agencies, with mortality rate (per 100,000 residents) as the outcome and CMS 5-star quality rating as the primary explanatory variable of interest. Covariates, based on the prior literature, consisted of both facility- and community-level characteristics. Findings from our secondary analysis provide robust evidence that the association between nursing home quality and resident deaths due to the virus diminished over time. In connection, we discuss plausible reasons, especially the duration of staff shortages, that might have played a critical role in driving the quality-mortality convergence across nursing homes in the US.


Introduction
About a year into the COVID-19 pandemic, cases and deaths among nursing home residents in the US show a downward trend [1] but remain high [2]. In February 2021, one out of every hundred nursing home residents died from this infection, a rate that was half the COVID-19 mortality rate (about 2 per 100 residents) in the previous month of January but double the rate (about 0.5 per 100 residents) in September 2020 [2]. Previous findings, among other things, have highlighted lower quality of care in nursing homes as a thorny issue affecting the risk of death among facility residents [3,4]. In line with this, multiple recent studies and reports, including one from the Centers for Medicare & Medicaid Services (CMS) [5], reported the CMS five-star quality rating to be a significant determinant of cases and deaths among residents [6,7]. But, as the pandemic progressed, the question that remained unaddressed is whether nursing home quality mattered more or less over time. Evidence in this regard is lacking, with no prior studies examining how the role played by this factor may (or may not) have changed across the US over time. Such an interpretation, as indicated by our findings, may be essential for nursing home administrators to understand the challenge of managing resources, specifically the workforce, during prolonged public health emergencies. To address the above-mentioned gap, we examined the evolution, if any, in the association between nursing home quality and resident deaths.
While prior analyses are mostly focused on interpreting COVID-19 cases and outbreaks across nursing homes, a smaller set reported findings explaining variations in deaths, with a few [7-9] linking resident deaths to nursing home quality. He et al. [7] examined data for April-June 2020 on skilled nursing facilities in California and found that nursing homes with 5-star ratings were less likely to have COVID-19 resident deaths after adjusting for facility (size and ownership) and resident (racial: percent White) characteristics. In contrast, three studies reported findings of no relation between quality and COVID-19 deaths in nursing homes. Additionally, for the March-November 2020 period, Kosar et al. [10] reported evidence of a decline in the mortality risk of nursing home residents, adjusted for patient characteristics, but did not explore the facility- or community-level factors behind this decrease. The generalizability of findings in these studies is, however, limited because the data analyzed came from a single state [7-9,11] or from a single multifacility provider with facilities mostly in the Northeast [10], or because only descriptive statistics were used [9].
In contrast, we conducted statistical analysis using both descriptive and multivariable methods to examine resident deaths from the virus across nursing homes in the US between June 1, 2020 and January 31, 2021 (n = 12,415 nursing homes × 8 months). We hypothesized that, other things remaining equal, the strength of the relation (association) between nursing home quality and resident COVID-19 deaths would diminish over time. Such a scenario is plausible if, after the initial shock, nursing homes across the board were able to adjust to the pandemic as knowledge (for example, about asymptomatic transmission of the virus) and best practices accumulated over time. Alternatively, with expanding contagion, the challenges faced by higher quality nursing homes may have started to resemble those experienced by facilities with lower star ratings. If so, any gap in COVID-19 mortality outcomes that differentiated higher from lower quality nursing homes may have decreased over time.

Data
We used the publicly available Nursing Home COVID-19 [5] data released on May 26, 2020 and updated weekly by the CMS. After download on March 1, 2021, we retained data for the period between June 1, 2020 and January 31, 2021 and for nursing homes that: i) were located in any state, excluding Guam and Puerto Rico, ii) passed the CMS quality assurance check, iii) had no missing occupancy information, and iv) reported data for all weeks in our study period. Following this initial data cleaning, we created facility-level monthly data (n = 12,473 facilities × 8 months) by aggregating the weekly CMS data. We combined this dataset with Nursing Home Compare [5] data and LTCFocus [12] data that respectively include information on facility (such as 5-star quality rating) and resident characteristics (such as % White residents, % Medicaid). We then merged this facility-month level master dataset with community-level data on urbanity [13], social vulnerability [14], COVID-19 deaths, population [15], and core-based statistical area (CBSA) codes [16]. The final sample size (n = 99,320), after deletion of nonmatched observations from merging of community-level data, consisted of 12,415 facilities over an 8-month period. The unit of observation of this dataset was the facility-month. We selected June 1, 2020 as the start of our study period partly to capture information on the 8 full months since data release by the CMS [5] on May 26, 2020. But more importantly, this start date matched the period for which data reliable for longitudinal analysis were available from the CMS [5].
Competing interests: One of the study coauthors (SCT) is affiliated with the Fors Marsh Group (FMG), Arlington, VA. However, FMG did not play any role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. Neither did FMG provide any financial support in the form of the coauthor's salary and/or research materials. The coauthor affiliated with the FMG undertook work for this study in the role of an independent researcher working specifically and only during weekend hours outside of his employment with FMG. We also confirm that the study coauthor's commercial affiliation with FMG does not alter our adherence to all PLOS ONE policies on sharing data and materials.
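The aggregation and merging steps described in the Data section can be sketched as follows. This is a minimal illustration only: the field names (`facility_id`, `week_ending`, `covid_deaths`, `star_rating`), the records, and the rule of assigning a week to the month of its week-ending date are all assumptions, not the CMS schema or the authors' procedure.

```python
from collections import defaultdict
from datetime import date

# Hypothetical weekly records; field names are illustrative, not the CMS schema.
weekly = [
    {"facility_id": "A", "week_ending": date(2020, 6, 7), "covid_deaths": 1},
    {"facility_id": "A", "week_ending": date(2020, 6, 14), "covid_deaths": 0},
    {"facility_id": "A", "week_ending": date(2020, 7, 5), "covid_deaths": 2},
    {"facility_id": "B", "week_ending": date(2020, 6, 7), "covid_deaths": 0},
]

# Hypothetical facility attributes; facility "B" deliberately has no match.
facility_info = {"A": {"star_rating": 4}}


def aggregate_monthly(records):
    """Sum weekly deaths into facility-month totals keyed by (id, year, month)."""
    totals = defaultdict(int)
    for r in records:
        # Assumption: a week belongs to the month of its week-ending date.
        key = (r["facility_id"], r["week_ending"].year, r["week_ending"].month)
        totals[key] += r["covid_deaths"]
    return dict(totals)


def merge_facility_info(monthly_totals, info):
    """Inner-join monthly totals with facility attributes; unmatched rows are dropped."""
    merged = []
    for (fid, year, month), deaths in sorted(monthly_totals.items()):
        if fid in info:
            merged.append({"facility_id": fid, "year": year, "month": month,
                           "covid_deaths": deaths, **info[fid]})
    return merged


monthly = aggregate_monthly(weekly)
panel = merge_facility_info(monthly, facility_info)
print(panel)
```

Dropping unmatched rows in the join mirrors the deletion of nonmatched observations described above, which is how a sample can shrink between the initial and final facility counts.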

Outcome and primary explanatory variable
The progression of the pandemic across communities varied over time. Hence, for the purpose of comparability, the outcome variable in our descriptive analysis is a rate ratio [17] representing nursing home resident mortality rate relative to CBSA mortality rate [18,19]. The formula for computing this variable is provided in S1 Appendix. In contrast, in our multivariable regression analysis, the outcome variable is nursing home resident COVID-19 death rate (per 100,000 residents) given that this method allows the inclusion of control variables including community conditions related to the pandemic.
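The exact formula for the rate ratio is given in S1 Appendix, which is not reproduced here. The sketch below shows one plausible reading, assuming both rates are expressed per 100,000 and the ratio is the nursing home rate divided by the CBSA rate; the function names and numbers are hypothetical.

```python
def rate_per_100k(deaths, population):
    """Deaths per 100,000 persons; returns None when the population is not positive."""
    if population <= 0:
        return None
    return deaths / population * 100_000


def mortality_rate_ratio(nh_deaths, nh_residents, cbsa_deaths, cbsa_population):
    """Nursing home rate divided by CBSA rate; None if the ratio is undefined."""
    nh_rate = rate_per_100k(nh_deaths, nh_residents)
    cbsa_rate = rate_per_100k(cbsa_deaths, cbsa_population)
    # A missing or zero community rate leaves the ratio undefined.
    if nh_rate is None or not cbsa_rate:
        return None
    return nh_rate / cbsa_rate


# Made-up numbers: 2 deaths among 100 residents; 50 deaths among 500,000 people.
ratio = mortality_rate_ratio(2, 100, 50, 500_000)
print(ratio)  # roughly 200: the facility rate is about 200 times the community rate
```

A facility with no COVID-19 deaths in a month has a rate ratio of zero under this reading, which is consistent with the many zero rate ratios reported in the descriptive analysis.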
The explanatory variable of interest is CMS's overall 5-star quality rating of nursing homes [20,21]. Beginning in December 2008, each nursing home participating in Medicare or Medicaid started receiving an overall quality rating with values ranging between 1 (= lowest performing) and 5 (= highest performing) stars from the CMS. CMS created the 5-star Quality Rating System to guide consumers, their families, and caregivers when comparing nursing homes based on their performance outcomes. The CMS overall star rating that we used in our study is constructed from extensive areas of assessment ranging from nursing home environment to patient care. Specifically, CMS computes the overall star rating from nursing home performance outcomes in three areas: i) health inspections, ii) staffing (staffing ratios in conjunction with patient needs), and iii) other quality measures (comprising 15 different physical and clinical measures) [20,21]. Each of these three domains is weighted differently in the computation of the overall star rating, with the health inspections rating weighted heavily and based on findings from the most recent 3-year state health surveys [20,21]. To avoid the possibility that measured nursing home quality is endogenous to the COVID-19 outbreak, we used the last available (December 2019) pre-pandemic 5-star rating, which enabled us to evaluate how pre-COVID nursing home quality predicted resident COVID-19 mortality outcomes over time.

Covariates
Prior studies reported significant variations in resident COVID-19 deaths across facility ownership status [7], facility size [7], staffing [11], and resident racial and socioeconomic composition [7,11,22,23]. In our multivariable analysis, we therefore controlled for these facility-level characteristics, which included: i) ownership (nominal: government = 0 | not-for-profit = 1 | for-profit = 2), ii) size (ordinal: measured using total number of beds; <50 beds = 0 | 50-99 beds = 1 | 100-199 beds = 2 | 200 beds or more = 3), iii) duration of staff shortage (continuous: measured as the number of weeks in a month that shortages in care personnel, comprising nursing, clinical, aides, and other staff, are reported by a facility), and iv) resident racial and socioeconomic makeup (continuous: measured as % White and % Medicaid residents in a facility).
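The coding scheme for these facility-level covariates can be written out as a short sketch. The category codes and bed-count cut points follow the text above; the function and variable names are hypothetical.

```python
# Ownership codes as listed in the text (nominal variable).
OWNERSHIP_CODES = {"government": 0, "not-for-profit": 1, "for-profit": 2}


def size_category(total_beds):
    """Ordinal size code: <50 beds = 0, 50-99 = 1, 100-199 = 2, 200 or more = 3."""
    if total_beds < 50:
        return 0
    if total_beds < 100:
        return 1
    if total_beds < 200:
        return 2
    return 3


def shortage_weeks(weekly_flags):
    """Number of weeks in a month for which a facility reported any staff shortage."""
    return sum(1 for flag in weekly_flags if flag)


print(size_category(120))                         # 2
print(shortage_weeks([True, False, True, True]))  # 3
```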
We included three additional facility-level variables which were: personal protective equipment (PPE) supply shortages (continuous: measured as the number of weeks in a month for which shortages in any PPE were reported by a facility), resident confirmed COVID-19 cases, and resident death rate (per 100,000 residents) from all causes excluding COVID-19 as an acuity measure for resident health. Given the evidence that acuity of nursing home residents has been increasing in recent years [24,25], we opted to use resident death rate from all causes excluding COVID-19 as the proxy indicator for current resident acuity during our study period. We therefore did not use the acuity (case mix) measure variable from the LTCFocus [12] for which data was collected in 2017.
Lastly, we included urbanity (nominal for location: nonmetro = 0 | metro = 1) [13], social vulnerability index (SVI) (continuous: ranging between 0 = lower and 1 = higher vulnerability) [14], and CBSA COVID-19 case and death rates (per 100,000 population) due to the virus [15] as community-level controls. The SVI is published by the Centers for Disease Control and Prevention [14] and is computed for each county in the US based on 15 social factors, including unemployment, minority status, and disability. We used CBSA codes [16] to identify counties, aggregate corresponding COVID-19 death and population values, and compute death rates for each CBSA. CBSA codes were explicitly defined by the US Office of Management and Budget (OMB) [16] to identify statistical core areas. Each core comprises a population center closely linked to an adjoining group of counties based on commuting patterns. Consequently, the geographic extent of travel, and thereby infection spread, by workforce and visitors to and from a nursing home is more likely to be encompassed within a CBSA than within the facility's county, which is a subset of its CBSA. CBSA codes have been used in prior studies examining infectious disease spread [18] as well as to summarize community COVID-19 profiles [19]. For non-metro nursing homes, we used the death rates of the county.
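The county-to-CBSA aggregation described above can be illustrated with a small sketch. The county identifiers, CBSA codes, and counts below are made up, rates are assumed to be per 100,000 population, and treating a county without a CBSA code as the non-metro fallback is an assumption about how the county-level rule was applied.

```python
from collections import defaultdict

# Hypothetical county records; identifiers and counts are made up.
counties = [
    {"fips": "00001", "cbsa": "10000", "deaths": 30, "population": 200_000},
    {"fips": "00002", "cbsa": "10000", "deaths": 10, "population": 200_000},
    {"fips": "00003", "cbsa": None, "deaths": 10, "population": 50_000},
]


def area_death_rates(county_rows):
    """Return {area_key: deaths per 100,000}, pooling counties that share a CBSA."""
    deaths = defaultdict(int)
    population = defaultdict(int)
    for row in county_rows:
        # Counties without a CBSA code fall back to their own county-level totals.
        key = ("cbsa", row["cbsa"]) if row["cbsa"] else ("county", row["fips"])
        deaths[key] += row["deaths"]
        population[key] += row["population"]
    return {k: deaths[k] / population[k] * 100_000
            for k in deaths if population[k] > 0}


rates = area_death_rates(counties)
print(rates)
```

Summing deaths and population before dividing yields a population-weighted CBSA rate, rather than a simple average of county rates.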

Statistical analysis
We examined the nursing home to CBSA mortality rate ratio using descriptive statistics (median, interquartile range, and trend lines). Our primary analysis involved repeated comparison of nursing homes with different levels of initial quality over multiple months. To account for potential confounders, we applied multivariable regression. Because the outcome in this analysis (nursing home resident COVID-19 deaths per 100,000 residents) has a skewed distribution similar to that of a count variable, which gives rise to influential observations, we first estimated the Poisson specification. To assess the robustness of our baseline results, we also estimated nonzero-inflated (negative binomial) and zero-inflated (zero-inflated Poisson and zero-inflated negative binomial) count model specifications.
The theoretical basis of the zero inflated models is that the data comprise excess zero outcomes representing, in this case, facilities that were unexposed and, in turn, not at risk of a death from the disease [26]. Alternatively, zero outcomes of no deaths may also occur in the presence of the virus, representing true zeros. The zero inflated regression specifications are therefore two-part models. The logistic part estimates the contribution of factors in explaining whether or not a facility was at risk of death. Given the risk of a death, the second part (count part) predicts mortality counts for facilities that are not excess zeros. Additional detailed descriptions of each of the count model specifications, including zero-inflated models, are provided in Karaszia et al. [27] and Hardin and Hilbe [28].
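For reference, the two-part structure described above corresponds to the standard textbook form of the zero-inflated negative binomial model, written here for facility i in month t. The notation (π for the excess-zero probability, μ for the count mean, α for the dispersion parameter, z and x for the covariate vectors) is ours and is an assumption about parameterization, not taken from the paper or its software output.

```latex
\begin{aligned}
\Pr(Y_{it}=0) &= \pi_{it} + (1-\pi_{it})\, f_{\mathrm{NB}}(0;\mu_{it},\alpha),\\
\Pr(Y_{it}=y) &= (1-\pi_{it})\, f_{\mathrm{NB}}(y;\mu_{it},\alpha), \qquad y = 1,2,\dots,\\
\operatorname{logit}(\pi_{it}) &= z_{it}^{\top}\gamma, \qquad \log \mu_{it} = x_{it}^{\top}\beta,\\
f_{\mathrm{NB}}(y;\mu,\alpha) &= \frac{\Gamma(y+\alpha^{-1})}{\Gamma(\alpha^{-1})\,y!}
\left(\frac{1}{1+\alpha\mu}\right)^{\alpha^{-1}}
\left(\frac{\alpha\mu}{1+\alpha\mu}\right)^{y}.
\end{aligned}
```

In this form, a positive coefficient in γ raises the probability of being an excess zero (not at risk), while a positive coefficient in β raises the expected count among at-risk facilities. As α approaches zero the negative binomial part reduces to the Poisson, which is why the two can be compared with a likelihood ratio test.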
For each of the multivariable regression models, we clustered observations by CBSA since the cluster option yields robust standard errors adjusted for nonindependence of mortality outcomes over time and within each CBSA. We performed all statistical analyses using Stata (version 15) and report results from the count model providing the best fit to the data. We interpret statistically significant findings at p ≤ 0.05 (unless specified otherwise).

Descriptive analysis
In Table 1, we provide the descriptive statistics on the nursing home to CBSA COVID-19 mortality rate ratio in our sample of nursing homes. As is evident from Table 1, all eight months are dominated by nursing homes with a zero rate ratio. For nursing homes with a positive rate ratio, the median over the study period was 190, with all months indicating a distribution highly skewed in the positive direction. The IQR fluctuates but decreases over the later months, indicating a less dispersed distribution. A similar pattern of convergence is evident in Fig 1, which reveals that the differences in mortality outcomes by nursing home quality ratings reduced over time. Fig 1 also indicates that the nursing home quality-mortality relation may have differed between nursing homes with zero and nonzero mortality outcomes. Findings from our multivariable analysis further confirmed this difference in mortality trend, which we discuss next.
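The summary measures used in this descriptive analysis (share of zero rate ratios, plus median and IQR among positive values) can be computed as in the sketch below. The input values are made up, and the quartile convention of Python's `statistics.quantiles` may differ from the one used by the authors' software.

```python
import statistics


def describe_rate_ratios(ratios):
    """Share of zeros, plus median and IQR computed over positive rate ratios only."""
    positive = sorted(r for r in ratios if r > 0)
    n_zero = sum(1 for r in ratios if r == 0)
    q1, _, q3 = statistics.quantiles(positive, n=4)  # quartile cut points
    return {
        "share_zero": n_zero / len(ratios),
        "median": statistics.median(positive),
        "iqr": q3 - q1,
    }


# Made-up monthly rate ratios: mostly zeros with a long right tail.
sample = [0, 0, 0, 0, 0, 0, 50, 120, 190, 400, 2500]
summary = describe_rate_ratios(sample)
print(summary)
```

Reporting the median and IQR rather than the mean and standard deviation is the usual choice for a distribution this skewed, since a few very large ratios would otherwise dominate the summary.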

Multivariable regression analysis
We pooled the monthly data and examined changes in relations over time by interacting the main independent variable, quality rating, as well as covariates with time (month). In Tables 2 and 3, we report the association between nursing home mortality rate (per 100,000 residents) and overall quality rating. While the likelihood ratio test [29] confirmed the negative binomial as more appropriate than the Poisson specification, the Vuong test [30,31] identified the zero inflated negative binomial model as performing better than the standard negative binomial one. Furthermore, the Akaike information criterion (AIC) and Bayesian information criterion (BIC) [28] values were the smallest for the zero-inflated negative binomial specification, indicating that this model was the best fit compared to the other count models we tested. We therefore report and interpret findings from the zero inflated negative binomial model only.
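The information-criterion comparison described above follows the usual definitions, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters and n the number of observations. The sketch below applies them to made-up log-likelihoods and parameter counts; none of these numbers come from the paper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (smaller is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Made-up (log-likelihood, parameter count) pairs, for illustration only.
candidates = {
    "poisson": (-5200.0, 10),
    "negative_binomial": (-4100.0, 11),
    "zero_inflated_nb": (-3900.0, 21),
}
n_obs = 1000

scores = {name: (aic(ll, k), bic(ll, k, n_obs))
          for name, (ll, k) in candidates.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(best_by_aic, best_by_bic)
```

BIC penalizes extra parameters more heavily than AIC once n exceeds about 8, so a zero-inflated model, which roughly doubles the parameter count, must improve the likelihood substantially to be preferred on both criteria.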
Findings from the descriptive analysis (Table 1) suggested an excess of zero mortality outcomes in our sample of nursing homes. The zero inflated negative binomial model (full model in Table 3) indicates that the log odds of being an excess zero would increase by 0.06 (roughly a 6% increase in the odds; 95% CI: 0.023, 0.105) for every unit increase in quality rating, holding other variables constant. However, this relation diminished over time, as demonstrated by the negative interaction term (-0.013, 95% CI: -0.020, -0.006), which was statistically significant. Furthermore, the count part revealed a negative association between quality rating and mortality rate, which also decreased over time (positive interaction term). However, both these latter relations failed to attain statistical significance.
Among the statistically significant covariates (shown in S1 Table), a longer duration of staff shortages and larger size of facilities decreased the log odds of being an excess zero outcome. In the count part, while longer duration of staff shortages indicated higher mortality rates, larger size of facilities was associated with lower mortality rates. Additionally, the staff-mortality relation did not change over time, but the association between facility size and mortality rate diminished over time. Finally, a higher proportion of White residents in a nursing home increased the likelihood of being an excess zero outcome, an association that also diminished over time. Among the community-level covariates, higher CBSA mortality rate and metro location decreased the log odds of being an excess zero outcome, with both associations diminishing over time. In addition, the higher the mortality rate was in a CBSA, the higher was the mortality rate in a nursing home located in that CBSA, and this positive association did not change over time.
Table 2 notes: 1 Parsimonious models to examine unadjusted role of nursing home quality (after accounting for the level/progression of the pandemic in the community). 2 The full models included all covariates we presented earlier under the Methods section. 3 Abbreviations CI: Confidence Interval (95%) | SE: Standard Error (robust). 4 * p < 0.1, ** p < 0.05, *** p < 0.01. 5 Data extraction and statistical analysis were conducted by the study authors. 6 Missing values: < 2% in each month. 7 Poisson versus negative binomial. 8 Zero inflated negative binomial versus standard negative binomial. https://doi.org/10.1371/journal.pone.0256767.t002
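The logit-part coefficients reported above are on the log-odds scale; exponentiating converts them to odds ratios. The sketch below does this for the reported main effect, its confidence bounds, and the interaction term. How month is coded in the model is not stated in the text, so the `month_index` argument (0 for the reference month, increasing by 1 per month) is an assumption and the month-specific values are illustrative only.

```python
import math

MAIN_EFFECT = 0.06        # reported log-odds coefficient on quality rating
MAIN_CI = (0.023, 0.105)  # reported 95% confidence interval
INTERACTION = -0.013      # reported quality-by-month interaction coefficient


def odds_ratio(log_odds):
    """Convert a log-odds coefficient to an odds ratio."""
    return math.exp(log_odds)


def quality_log_odds(month_index):
    """Implied log-odds effect of one extra star at a given month (assumed coding)."""
    return MAIN_EFFECT + INTERACTION * month_index


# Percent change in the odds of being an excess zero per additional star.
print(round((odds_ratio(MAIN_EFFECT) - 1) * 100, 1))  # about 6.2
print([round(odds_ratio(b), 3) for b in MAIN_CI])
print(round(quality_log_odds(1), 3))  # effect one month after the reference month
```

For small coefficients the log-odds value and the percent change in odds are close, which is why a coefficient of 0.06 can be read as roughly a 6% increase in the odds.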

Discussion
As witnessed during this pandemic, nursing home quality is not a nebulous concept and its impact can be critical in determining life and death in the real world. Our findings (Table 3, zero inflated negative binomial: main effect) revealed that higher quality of nursing homes increased the possibility of belonging to the excess zero group not at risk of death. A related finding was that this association diminished over time (Table 3, zero inflated negative binomial: interaction term), showing the reducing role of quality as the pandemic progressed. Together these findings suggest that higher quality nursing homes were better prepared to handle the pandemic in the earlier stages of our study period. This trend is a reflection of the infrastructure and rapidly adaptable processes in place in these nursing homes, which not only helped them achieve a higher quality rating but also may have provided an advantage in dealing with the health care emergency due to COVID-19. However, our findings indicate a convergence in mortality between low and high quality nursing homes with the passage of time. There are two probable explanations for this trend. Nursing homes with a lower star rating may have implemented best practices as new knowledge and guidelines [32,33] as well as resources [24,34] to manage the virus became available over time. For instance, an overhaul in care delivery and practices of nursing homes to combat the virus has been reported [24,35,36]. This transition toward best practices may have been incentivized by performance-based metrics rewarding reduction in deaths under the Nursing Home Quality Incentive Payment Program beginning in September 2020 [24,34]. Consequently, nursing homes with a lower star rating may over time have been able to match their higher quality counterparts in tackling the pandemic. The alternative explanation could be that the structure and processes built by high quality nursing homes were slowly eroded as the pandemic continued to strain human resources [37,38] and all nursing homes started to resemble one another in their performance. Specifically, increasingly worsening staff shortages and attrition may have been a factor of particular relevance here [24].
Table 3 notes: identical to the Table 2 notes. https://doi.org/10.1371/journal.pone.0256767.t003
The interaction between nursing home quality and staff shortages was not the focus of our multivariable analysis, but we found descriptive evidence in this regard. We found that chronic staff shortages, although lower in level, rose faster for higher quality nursing homes (S1 Fig). Staff shortages in high-quality nursing homes might have been more acute on account of the employment of part-time workers by for-profit facilities. In our sample, a majority (58%) of the for-profit nursing homes were higher quality, with a star rating of 3 or above. Recent reports suggest that for-profit facilities rely on part-time workers [39] and struggle to retain their staff over time [40]. Additionally, an increasingly higher intake of COVID-19 patients by for-profit nursing homes without a commensurate increase in staffing, especially of those trained in specialty services [41], may have added further stress, resulting in higher staff turnover.
Findings from our multivariable analysis also support the central but negative role of staff shortages, other things remaining equal. Duration of staff shortages was significantly associated with mortality rates in both the logit and count parts of the zero inflated negative binomial model. Our results also indicated that the adverse role of staff shortages did not change over time but continued to be an important factor. Throughout our study period, staff shortages (S1 Fig) self-reported by nursing homes persisted at high levels, a pattern that has received wide reporting [42] and analysis [43,44]. An additional factor driving this trend may have been rising demand for nursing services in hospitals. As the pandemic worsened, hospitals turned to expensive staffing agencies to meet their demand [43,45,46], thereby generating lucrative travel nursing job opportunities for nursing home nurses. This may have shifted staff from nursing homes to hospitals, further accentuating shortages and worsening the burn-out of staff who remained.
There is longstanding evidence [42,47-49] of nursing home staff being poorly paid, employed without benefits (such as sick leave or health insurance), and often working multiple jobs [50]. Early evidence indicates that innovative approaches adopted in other developed countries, such as requiring nursing home staff to choose a single work location with compensation for the resulting income loss [51], or offering hazard pay and surge staffing/recruitment, were associated with lower cases and deaths from the virus [51,52]. Similar policies addressing staff pay would become feasible in the US if nursing homes remain financially stable and healthy, especially given increasing costs but declining revenues across nursing homes due to the pandemic [42,53]. In connection, the role played by Medicaid reimbursement becomes salient: it is the major source of funding (about 60%) for nursing homes but does not cover about 20-30% of the actual cost of care [53].
Of the other facility-level factors, ownership and shortage of PPE were not associated with mortality in either the early part of our study period or the later stages as the pandemic progressed. While supply of PPE has been identified in one study [44] as a factor driving staff shortages in nursing homes, the association between PPE shortage and resident deaths may have been complicated by lack of staff training regarding use/reuse of PPE [54]. However, the unequal burden of the disease borne by the non-White population in the early part of the pandemic was reflected in our finding that the higher the proportion of White residents in a facility, the higher the likelihood of being an excess zero nursing home. Additionally, our analysis revealed a complex relation between facility size and mortality rate. The larger the facility, the lower the death rate. While this finding is compatible with prior results [7,55,56], we found that this inverse association between facility size and death rate diminished over time. In addition, our results suggested that larger facility size lowered the possibility of being an excess zero, with this association also diminishing over time. The exact mechanisms driving these seemingly contradictory trends warrant further research, possibly regarding how micro-factors [57,58] such as single- versus multiple-occupancy rooms, designated isolation wards, or the physical size and configuration of rooms, and their corresponding roles, may have varied by facility size.
Finally, community-level external variables (CBSA death rate and metro location) were significant determinants of nursing home mortality outcomes. However, the social vulnerability of the county in which a nursing home was located did not show a significant association with COVID-19 deaths in the nursing home. Together these findings suggest that the progression of the pandemic in the community, as opposed to prior social conditions, was a salient determinant of nursing home mortality outcomes.

Limitations
In our analysis, we were interested in the initial differences in death rates between high and low quality nursing homes and their change over time. Since fixed effects would be collinear with initial nursing home quality, we conducted a pooled analysis after interacting all explanatory variables with time (month) to examine change over our study period of 8 months spanning June 2020-January 2021. To account for the non-independence of observations over time (as a result of repeated measures of death rate for a given nursing home) and space, we estimated robust standard errors clustered on CBSA. After adjusting for data quality and incompatibility across multiple sources, we could analyze a subset (n = 12,415 | about 81%) of the nursing homes across the US. We acknowledge that our findings are therefore representative of this sample of nursing homes only. Moreover, we were unable to include data from the early months of the pandemic, as weekly CMS COVID-19 data reliable for longitudinal analysis became available only in the last week of May 2020 [5].

Conclusion
Lower quality of care in nursing homes significantly affects the risk of death among facility residents. However, recent evidence on the relation between nursing home quality and resident COVID-19 deaths remains mixed, with findings not generalizable because most of these analyses cover a single state and/or are descriptive investigations. Furthermore, no prior analyses have examined how the role of quality may (or may not) have changed over time. Our findings contribute to this literature by providing robust evidence on how the association between nursing home quality and resident deaths due to the virus changed over time across the US. We hypothesized that, other things remaining equal, the strength of the relation (association) between nursing home quality and resident COVID-19 deaths would diminish over time. Consistent with this hypothesis, we found that quality rating was a significant factor predicting whether a nursing home was at risk of resident COVID-19 death, but this relation diminished over time. In connection, we highlighted the critical role of staff shortages and related policy considerations.

Supporting information
S1 Appendix. Formula for mortality rate ratio. (DOCX)
S1 Table. Regression results from the two-part zero inflated negative binomial regression (full model).