Researching COVID to enhance recovery (RECOVER) tissue pathology study protocol: Rationale, objectives, and design

Importance SARS-CoV-2 infection can result in ongoing, relapsing, or new symptoms or organ dysfunction after the acute phase of infection, termed Post-Acute Sequelae of SARS-CoV-2 (PASC), or long COVID. The characteristics, prevalence, trajectory, and mechanisms of PASC are poorly understood. The objectives of the Researching COVID to Enhance Recovery (RECOVER) tissue pathology study (RECOVER-Pathology) are to: (1) characterize prevalence and types of organ injury/disease and pathology occurring with PASC; (2) characterize the association of pathologic findings with clinical and other characteristics; (3) define the pathophysiology and mechanisms of PASC, and possible mediation via viral persistence; and (4) establish a post-mortem tissue biobank and post-mortem brain imaging biorepository.

Methods RECOVER-Pathology is a cross-sectional study of decedents dying at least 15 days following initial SARS-CoV-2 infection. Eligible decedents must meet WHO criteria for suspected, probable, or confirmed infection and must be aged 18 years or more at the time of death. Enrollment occurs at 7 sites in four U.S. states and Washington, DC. Comprehensive autopsies are conducted according to a standardized protocol within 24 hours of death; tissue samples are sent to the PASC Biorepository for later analyses. Data on clinical history are collected from the medical records and/or next of kin. The primary study outcomes include an array of pathologic features organized by organ system. Causal inference methods will be employed to investigate associations between risk factors and pathologic outcomes.

Discussion RECOVER-Pathology is the largest autopsy study addressing PASC among US adults. Results of this study are intended to elucidate mechanisms of organ injury and disease and enhance our understanding of the pathophysiology of PASC.

Aim 4: Establish a multi-center post-mortem tissue biobank and a post-mortem brain magnetic resonance imaging (MRI) bank from decedents with prior SARS-CoV-2 infection with and without PASC. The primary goal of the brain imaging is to find and characterize lesions in the central nervous system (CNS) via image-directed sampling of identified lesions. A secondary goal is to evaluate a small subset of decedents with pre-infection and post-mortem brain MRIs to examine whether a decedent's PASC status is associated with the presence of new lesion(s) in the CNS.

Objectives of the Statistical Analysis Plan
This statistical analysis plan (SAP) outlines the statistical methods to be used in the primary analyses of the RECOVER Pathology Study of PASC data to address the study objective(s) indicated in the protocol. Populations for analysis, data handling rules, and statistical methods are provided. The statistical analyses and summary tabulations described in this SAP will provide the basis for the results sections of associated manuscripts, reports, and presentations. We plan to submit the SAP to ClinicalTrials.gov.

Overall study design
The decedents will enter this cross-sectional Tissue Pathology Study at varying stages after their infection with SARS-CoV-2 (see Figure 1). This study will be conducted in the United States and decedents will be enrolled through interaction with families of the deceased at inpatient, outpatient, and community-based settings.
The Tissue Pathology Study will: (i) characterize the pathology of PASC in hospitalized and nonhospitalized patients who died 30 days or later from their SARS-CoV-2 infection, and (ii) explore the pathology of acute SARS-CoV-2 infection in a smaller subset of patients who died 15-30 days from their SARS-CoV-2 infection.
If there is evidence of multiple infections, we will consider the first SARS-CoV-2 infection. Hospitalization status for COVID-19 is defined by whether a decedent was ever hospitalized for COVID-19 within 30 days of a SARS-CoV-2 infection.
Study data, including age, sex, race / ethnicity, medical history, vaccination history, details of acute SARS-CoV-2 infection, overall health and physical function, and PASC symptom screen, will be collected from the electronic health record (EHR) using a case report form (CRF).

Study population
The RECOVER Tissue Pathology Study population includes four categories of decedents: 1. Post-acute decedents (death occurred 30 days or later after SARS-CoV-2 infection): a. Non-hospitalized patients,
Additional criteria for decedent enrollment are included below.

Decedents with PASC
Decedents meeting WHO criteria for suspected, probable, or confirmed SARS-CoV-2 infection, who die more than 30 days after SARS-CoV-2 infection, and decedents who site investigators believe meet the working definition of PASC, will be recruited. Note that the definition of PASC given in this Section may evolve over the time of the Tissue Pathology Study.
The diagnosis of PASC will be based on having one of three primary symptoms:
• fatigue (i.e., being very tired),
• shortness of breath,
• problems thinking or concentrating (i.e., "brain fog"),
or at least two secondary symptoms:
• post-exertional malaise (symptoms worse after even minor physical or mental effort),
• weakness in arms or legs,
• fever, chills, sweats or flushing,
• loss of or change in smell or taste,
• pain in any part of body,
• persistent (chronic) cough,
• palpitations, racing heart, arrhythmia, skipped beats,
• gastrointestinal (belly) symptoms (feeling full or vomiting after eating, diarrhea, constipation),
• nerve problems (tremor, shaking, abnormal movements, numbness, tingling, burning, cannot move part of body, new seizures),
• problems with anxiety, depression, stress, or trauma-related symptoms like nightmares or grief,
• problems with sleep,
• feeling faint, dizzy, "goofy"; difficulty thinking soon after standing up from a sitting or lying position,
• vision problems (blurry, light sensitivity, difficulty reading or focusing, floaters, flashing lights, "snow"),
• problems with hearing (hearing loss, ringing in ears),
• changes to menstrual cycle,
during the post-acute phase of COVID-19, but not prior to SARS-CoV-2 infection.
Note that symptoms occurring during the acute phase of COVID-19 or at an unknown time will not be considered for PASC diagnosis. Also note that, as the definition of PASC evolves in light of information obtained from the RECOVER Adult Cohort Study, reviewed at quarterly intervals, we will apply the adjusted definitions to decedents.
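The working definition above can be sketched as a simple rule. The following Python sketch is illustrative only and is not part of the study's data pipeline: the symptom labels, the tuple layout, and the function name are hypothetical, and "one of three primary symptoms" is read here as "at least one".

```python
# Illustrative sketch; symptom labels and the input layout are hypothetical.
PRIMARY = {"fatigue", "shortness_of_breath", "brain_fog"}
SECONDARY = {
    "post_exertional_malaise", "weakness", "fever_chills_sweats_flushing",
    "smell_taste_change", "pain", "chronic_cough", "palpitations",
    "gastrointestinal", "nerve_problems", "anxiety_depression_stress",
    "sleep_problems", "orthostatic_dizziness", "vision_problems",
    "hearing_problems", "menstrual_changes",
}


def meets_pasc_definition(symptoms):
    """Apply the working PASC definition to one decedent.

    symptoms: list of (name, phase, present_before_infection) tuples,
    where phase is "acute", "post_acute", or "unknown".
    """
    # Only symptoms recorded in the post-acute phase and absent before
    # infection count; acute-phase or unknown-timing symptoms are ignored.
    eligible = {
        name
        for name, phase, before in symptoms
        if phase == "post_acute" and not before
    }
    n_primary = len(eligible & PRIMARY)
    n_secondary = len(eligible & SECONDARY)
    return n_primary >= 1 or n_secondary >= 2


example = [("pain", "post_acute", False), ("sleep_problems", "post_acute", False)]
print(meets_pasc_definition(example))
```

Because the definition may be revised at quarterly intervals, keeping the symptom sets as data rather than hard-coded branches would make it easier to re-apply an adjusted definition to all decedents.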

Decedents without PASC
Infected acute and post-acute decedents must meet the following conditions at the time of RECOVER autopsy enrollment:
1. Acute decedents: Individuals meeting WHO criteria for suspected, probable, or confirmed SARS-CoV-2 infection who die 15-30 days after their SARS-CoV-2 infection. Note that acute decedents die before developing PASC.
2. Post-acute decedents: Individuals meeting WHO criteria for suspected, probable, or confirmed SARS-CoV-2 infection who die without symptoms of PASC more than 30 days after their SARS-CoV-2 infection. Note that the absence of PASC symptoms will be based on the contemporaneous working definition of PASC, which may evolve over time.
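The grouping by time from infection to death can be sketched as follows. This is a minimal illustration with hypothetical names. It assumes that a death on exactly day 30 is treated as acute, since the text uses both "15-30 days" and "30 days or later"; the protocol should be consulted for the actual boundary rule.

```python
def classify_decedent(days_infection_to_death, has_pasc):
    """Assign a decedent to an analysis group (illustrative sketch).

    Assumption: day 30 is counted as acute; the protocol governs the
    real boundary rule.
    """
    if days_infection_to_death < 15:
        return "ineligible"  # the study enrolls deaths at least 15 days after infection
    if days_infection_to_death <= 30:
        return "acute"
    return "post_acute_with_pasc" if has_pasc else "post_acute_without_pasc"


print(classify_decedent(20, False))
print(classify_decedent(45, True))
```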

Stratified recruitment
Recruitment will occur across sites (see the Tissue Pathology Study Protocol for more details).
Note that the autopsy sites will frequently communicate with RECOVER Clinical Sciences Core (CSC) and Data Resource Core (DRC) to ensure that the targeted recruitment is satisfied, and that the groups of decedents with and without PASC are balanced with respect to background covariates (e.g., age, sex, race / ethnicity). For example, it is anticipated that post-acute decedents with PASC may be older than the ones without PASC.

Sample size determinations
A total of N=700 individuals will be recruited to the RECOVER Tissue Pathology Study, including N=400 post-acute decedents with PASC, N=200 post-acute decedents without PASC, and N=100 acute decedents (Table 1). A sample size of 400 within one group (e.g., post-acute decedents with PASC) will have greater than 80% power to detect a small effect size, i.e., Cohen's d=0.28 between individuals with and without the risk factor, assuming 50% of individuals have the risk factor (i.e., 200 with and 200 without the risk factor; row 3 in Table 2). To characterize the detectable effect sizes from the subset of decedents who undergo brain imaging, for example, please see the first row of Table 2. A sample size of 300 provides 80% power to detect a moderate effect size of d=0.32 between individuals with and without the risk factor (within the specified group), assuming 50% of individuals have the risk factor. These calculations are based on a two-group, two-sided comparison of means, controlling type 1 error at 0.05.
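The power figures above can be checked approximately with a normal approximation to the two-sample comparison of means. The sketch below is not the authors' calculation: the function name is hypothetical, and an exact t-based calculation would give slightly different values. Under this approximation both scenarios come out close to 0.80.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Uses the normal approximation with equal group sizes;
    d is the standardized effect size (Cohen's d).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    ncp = d * sqrt(n_per_group / 2)  # noncentrality for equal group sizes
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


power_400 = two_sample_power(0.28, 200)  # 400 decedents, 50% with the risk factor
power_300 = two_sample_power(0.32, 150)  # 300 decedents, 50% with the risk factor
print(round(power_400, 3), round(power_300, 3))
```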

Interim analyses and potential study modifications
The Tissue Pathology Study Protocol is designed to be pragmatic and flexible in design. We will undertake the following procedures to guide protocol modifications over time in consultation with the Observational Study Monitoring Board (OSMB):
1. Data elements that are not NIH recommended Common Data Elements (CDEs) may be modified based on ongoing analysis by DRC.
2. PASC definition will be revised in an iterative manner based on consortium investigator expertise, existing PASC data, medical literature, and feedback from family representatives and the scientific community. Updated PASC definitions may be used to implement a strategy to modify deeper phenotyping.

Autopsy
Autopsies must be performed within 24 hours of death. Procedures must comply with those set forth in the Tissue Pathology Study Protocol (Appendix 1 - Gross Autopsy Template) and the Manual of Procedures (MOP). These are summarized here.
1. Prior to conducting the autopsy, collect blood from the decedent for serum separator tubes and blood cards. Allow the blood cards to dry completely prior to closing and room temperature storage. Cerebrospinal fluid (CSF) will be obtained via a cisternal tap and frozen.
2. Apply the appropriate modified College of American Pathologists (CAP) Gross Autopsy protocol (adult), including both gross and microscopic examination, and collect the autopsy common data elements (demographic, external, and internal findings). These data will include structured fields for the gross and certain microscopic findings. The microscopic reports will also be captured as text fields, and manually coded into approved terms and Systematized Nomenclature of Medicine (SNOMED) codes by pathology coders; these codes will be finalized following review by the responsible pathologist.
3. Generate the final anatomic diagnosis (FAD) from a standardized list of approved terms.
4. Collect fresh tissue from 50 anatomic sites representing many organ systems as dictated by the Autopsy Gross Template (see Tissue Pathology Study Protocol, Appendix 1), to be snap frozen (two 1cm cubes) and paraffin embedded (one 1cm cube). Two matched tissue samples from the Autopsy Template should be placed in a tissue cassette, with additional sample types as necessary for viral quantification. From the visceral organs 40 or more sites should be sampled, and from the central nervous system (CNS) 15 or more sites should be sampled, including cerebrospinal fluid (CSF).
Specimens in tissue cassettes will be fixed in neutral buffered formalin (NBF) for approximately 6 hours, not to exceed 48 hours.

Submission of biospecimens to the RECOVER Biospecimen Core (PBC)
A centralized biorepository has been selected to process and store biospecimens for this study. All autopsy study sites will collect and prepare biospecimens, including blood, body fluids, snap frozen and formalin-fixed tissue samples, as per the protocol, for shipping to the centralized RECOVER biorepository, where all specimens will be processed and stored. In situ hybridization (ISH), droplet digital polymerase chain reaction (ddPCR), and whole slide imaging (WSI) will be performed centrally.

Post-mortem MRI studies of CNS
It is anticipated that 50% or more of decedents will be subject to examination of the CNS by autopsy, and will be evaluated with post-mortem brain MRI.

Clinical Data
Abstracted decedent medical records will be linked, when possible, to RECOVER cohort and EHR information. Elements of clinical data will be obtained from review of the decedent's EHR and available data collected from enrollment and follow-up in the RECOVER adult and pediatric cohort studies, if applicable; see the Tissue Pathology Study Protocol for more details.

Primary endpoints
Our statistical analyses will examine the following primary pathological outcomes:

Alignment of data across sources
The majority of the study data (i.e., gross and microscopic findings) will be collected at autopsy according to the standardized Tissue Pathology Study Protocol. Recall that the gross and certain microscopic findings will be entered in the Research Electronic Data Capture (REDCap) system as structured fields, and that the microscopic reports will be entered as text fields, which will later be systematically coded into SNOMED codes. Additional EHR and other research data (e.g., viral persistence, post-mortem brain imaging) may be extracted, entered into electronic case report forms, and aligned across sites.

Data quality control procedures
Rigorous quality control procedures will be applied, including remote and continuous monitoring using field- and form-level validation at the time of data entry, regular backend monitoring of data with clinical oversight, and training and assistance of lead sites in monitoring of their sub-sites. Network-wide accrual and compliance reports will be provided to investigators to promote data quality. Data quality and logic checks will be built into every step of data acquisition, capture, transfer, and integration to assure scientific rigor and to minimize potential errors. Whenever possible, automated logic checks will be built into REDCap.

Summary statistics and data visualizations
The analysis pipeline will include calculating basic summary statistics (e.g., means, SDs, percentiles, counts, proportions), cross-tabulating data (e.g., contingency tables), and performing basic data visualization (e.g., histograms, box plots, scatter plots). Additional visualization tools will be applied for deeper data exploration, including the R packages ggplot2, highcharter, and RGL, as well as PhenoTree (Baytas et al., 2016).
Prior to fitting Frequentist and Bayesian statistical models, thorough exploratory data analysis will be performed to identify outliers, determine reasonable modeling assumptions, and refine existing and formulate new hypotheses and analyses based on any trends and patterns identified. We will make use of standard visual diagnostics (e.g., residual plots, quantile-quantile plots) and/or formal statistical tests (e.g., likelihood ratio test, goodness-of-fit test) to evaluate the adequacy of fitted regression models.

Possible transformation of outcome variables
Outcome variables may be transformed (e.g., Box-Cox power family - Box and Cox, 1964) prior to any statistical modeling. Bayesian models will be used to provide simulation-based inference for transformed outcomes (Rubin, 1978). Original, untransformed values will be used for all summaries.
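For reference, the one-parameter Box-Cox power family maps a positive outcome y to (y^λ - 1)/λ when λ is nonzero and to log(y) when λ = 0. The sketch below illustrates the transform and its inverse; the function names are hypothetical and the choice of λ (for example, by maximum likelihood) is not shown.

```python
from math import exp, log


def box_cox(y, lam):
    """One-parameter Box-Cox transform of a strictly positive value."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    return log(y) if lam == 0 else (y ** lam - 1) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    return exp(z) if lam == 0 else (lam * z + 1) ** (1 / lam)


z = box_cox(3.0, 0.5)
print(z, inverse_box_cox(z, 0.5))
```

The inverse is what allows results obtained on the transformed scale to be reported on the original, untransformed scale.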

Overlap assessment
We will assess the overlap in covariate distributions between comparative groups (e.g., acute decedents vs. post-acute decedents without PASC) using the empirical distributions of the estimated propensity scores (e.g., Stuart, 2010; Bind and Rubin, 2019). We will remove "outlier" decedents whose estimated propensity scores do not overlap with the other group of decedents (e.g., Imbens and Rubin, 2015; Bind and Rubin, 2019).
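One simple way to operationalize the removal of non-overlapping decedents is to keep only those whose estimated propensity scores fall inside the range shared by both groups. The sketch below uses this min/max rule as an assumption; the SAP does not specify the trimming rule, and the scores shown are made-up values.

```python
def trim_to_common_support(scores_a, scores_b):
    """Keep units whose propensity scores lie in the range common to both groups.

    Returns the retained scores for each group and the common-support bounds.
    """
    lo = max(min(scores_a), min(scores_b))
    hi = min(max(scores_a), max(scores_b))
    keep_a = [s for s in scores_a if lo <= s <= hi]
    keep_b = [s for s in scores_b if lo <= s <= hi]
    return keep_a, keep_b, (lo, hi)


# Hypothetical estimated propensity scores for two comparison groups.
group_a = [0.05, 0.30, 0.45, 0.60]
group_b = [0.25, 0.50, 0.70, 0.95]
kept_a, kept_b, bounds = trim_to_common_support(group_a, group_b)
print(kept_a, kept_b, bounds)
```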

Covariate balance assessment and matched-sampling strategy
Using standard diagnostics (e.g., "Love" plots - Ahmed et al., 2006), we will assess balance in background covariates (e.g., sex, age, race/ethnicity, hospitalization status): (i) between groups of acute decedents and of post-acute decedents without PASC, and (ii) between groups of post-acute decedents with and without PASC. If there is some evidence of covariate imbalance, we will use matched-sampling techniques (e.g., estimated propensity score matching - Rosenbaum and Rubin, 1983) to improve balance (e.g., Stuart, 2010). Note that some decedents' data may be discarded at this stage. Also note that this design stage task will be performed blinded from the outcome data, by a biostatistician who will not be involved in the statistical analysis stage.
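"Love" plots typically display, for each covariate, a standardized mean difference between the two groups before and after matching. A minimal sketch of that quantity follows; the function name and the ages are hypothetical, and the pooled standard deviation shown is one common convention rather than a choice stated in the SAP.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(x_a, x_b):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(x_a) + variance(x_b)) / 2)
    return (mean(x_a) - mean(x_b)) / pooled_sd


# Hypothetical ages (years) in two comparison groups.
ages_with_pasc = [60, 70, 80]
ages_without_pasc = [50, 60, 70]
smd = standardized_mean_difference(ages_with_pasc, ages_without_pasc)
print(smd)
```

A frequently used rule of thumb treats absolute values above about 0.1 as a sign of meaningful imbalance, although the threshold is a convention rather than a formal test.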

Fisherian inference (e.g., hypothesis testing)
We will test sharp null hypotheses and report unadjusted and adjusted (for multiple comparison - Lee et al., 2016) two-sided Fisher p-values. For example, in Aim 1, for the comparison of acute decedents and post-acute decedents without PASC, a sharp null hypothesis will state that for each decedent, the pathological potential outcomes are the same had the decedent been an acute decedent or a post-acute decedent without PASC. For the comparison of post-acute decedents with and without PASC, a sharp null hypothesis will state that for each post-acute decedent, the pathological potential outcomes are the same, whether the decedent had PASC or not. For Aim 1, the pathological outcomes include the presence and types of organ injury / disease, pathological features, and distinct phenotypes. Following the advice of Wasserstein et al. (2019), we will "not base our conclusions solely on whether an association or effect was found to be 'statistically significant' (i.e., the p-value passed some arbitrary threshold such as p < 0.05)", but rather also explore Fisherian intervals (FIs) for unit-level difference in continuous potential outcomes (e.g., Imbens and Rubin, 2015) and for finite sample proportion difference (Volfovsky et al., 2015), and consider model-based inference.
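Under a sharp null hypothesis, every decedent's outcome would be unchanged under the other group label, so the null distribution of a test statistic can be obtained by re-drawing the group labels. The sketch below approximates a two-sided Fisher p-value by Monte Carlo, assuming a completely randomized hypothetical assignment and using the difference in proportions (or means) as the test statistic. The function name, the number of draws, and the example data are illustrative assumptions.

```python
import random


def fisher_randomization_p_value(outcomes, exposed, n_draws=10000, seed=0):
    """Monte Carlo two-sided Fisher p-value for a difference in group means.

    outcomes: numeric outcomes (0/1 for binary pathological findings).
    exposed:  0/1 group labels; group sizes are held fixed when re-drawing,
              which corresponds to a completely randomized assignment.
    """
    rng = random.Random(seed)

    def diff(labels):
        g1 = [y for y, w in zip(outcomes, labels) if w == 1]
        g0 = [y for y, w in zip(outcomes, labels) if w == 0]
        return sum(g1) / len(g1) - sum(g0) / len(g0)

    observed = abs(diff(exposed))
    labels = list(exposed)
    extreme = 0
    for _ in range(n_draws):
        rng.shuffle(labels)  # re-draw the assignment under the sharp null
        if abs(diff(labels)) >= observed - 1e-12:
            extreme += 1
    return extreme / n_draws


# Hypothetical data: finding present in all 10 decedents of one group, none of the other.
outcomes = [1] * 10 + [0] * 10
labels = [1] * 10 + [0] * 10
p_separated = fisher_randomization_p_value(outcomes, labels, n_draws=2000)
print(p_separated)
```

For a matched design, the re-drawn labels would instead be permuted within matched sets so that the randomization distribution mirrors the assumed assignment mechanism.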

Model-based inference (e.g., point estimates, confidence intervals)
For the comparative (e.g., two-sample) analyses described in the previous section, we will also provide model-based inference (e.g., Neymanian / Frequentist, Bayesian). For continuous variables, we will assess non-linear dose-responses with penalized cubic splines (e.g., mgcv package in R - Wood and Wood, 2015). Robustness of study results to alternative model specifications will be assessed in ad hoc sensitivity analyses. We will define an estimand of interest and provide a point estimate and a corresponding two-sided uncertainty interval (e.g., 95% confidence interval - CI) for that estimand. We will use standard large-sample approximations for estimated variances of estimated effects (e.g., estimated difference in proportions), and also capitalize on Frequentist and Bayesian multivariable regression models implemented in R.
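As an illustration of the large-sample approach, the sketch below computes a point estimate and a Wald-type two-sided interval for a difference in proportions. The function name and the counts are hypothetical; the SAP does not state which large-sample interval will be used, so the Wald form is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def proportion_difference_ci(x1, n1, x0, n0, level=0.95):
    """Wald interval for p1 - p0 using the unpooled large-sample variance."""
    p1, p0 = x1 / n1, x0 / n0
    estimate = p1 - p0
    se = sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, estimate - z * se, estimate + z * se


# Hypothetical counts: finding present in 120 of 400 vs. 40 of 200 decedents.
est, lower, upper = proportion_difference_ci(120, 400, 40, 200)
print(round(est, 3), round(lower, 3), round(upper, 3))
```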

Statistical software
Statistical analyses will be performed using R version 4.1.1.

Characteristics of decedents
Summary statistics, including sample sizes, medians and interquartile ranges for numeric variables, and counts and proportions for categorical variables, will be reported for data of decedents including demographics, comorbidities, vaccination, clinical assessments, and laboratory values (Tables 3-6). We will also categorize decedents according to calendar quarter of enrollment, as well as calendar quarter of initial infection. This will allow assessment of differences in pathologic features that may occur following infection with different SARS-CoV-2 variants, in addition to the time elapsed between initial infection and death. Summaries will be reported overall and stratified by PASC status, decedent status (acute vs. post-acute), hospitalization history (hospitalized vs. not hospitalized for COVID-19), and time of enrollment. P-values corresponding to univariate (unadjusted) two-sample tests of differences in means or proportions will be reported for comparisons (i) between the groups of acute decedents and of post-acute decedents without PASC, and (ii) between the groups of post-acute decedents with and without symptoms of PASC.

Scientific hypotheses related to Aim 1
The following hypotheses will be addressed in the analytic approach to Aim 1: Hypothesis 1a: Prevalence and types of organ injury / disease, pathological features, and distinct phenotypes differ between acute decedents and post-acute decedents without PASC.
Note that the directionality of some hypotheses is anticipated. For example, we expect more acute diffuse alveolar damage and acute thrombi for acute decedents than for post-acute decedents without PASC.
Hypothesis 1b: Prevalence and types of organ injury / disease, pathological features, and distinct phenotypes differ between post-acute decedents with and without PASC.
Here too, the directionality of some hypotheses is anticipated. For example, we expect more organized diffuse alveolar damage and organized thrombi for post-acute decedents with PASC than for post-acute decedents without PASC.

Overview of analytic approach to Aim 1
We will statistically compare endpoints such as lesions in individual organs (i) between acute decedents and post-acute decedents without PASC, and (ii) between post-acute decedents with and without PASC. The primary analysis will determine if there are pathologies specific to PASC that are associated with tissue / organ / clinical dysfunction. We will also stratify our analyses by calendar quarter of enrollment or by calendar quarter of initial infection to examine possible effect modification by time of enrollment or by time of infection.

Overlap assessment
We will assess the overlap in covariate distributions between comparative groups (i.e., for Hypothesis 1a, acute decedents vs. post-acute decedents without PASC, and for Hypothesis 1b, post-acute decedents with vs. without PASC) using the empirical distributions of estimated propensity scores (e.g., Stuart, 2010; Bind and Rubin, 2019). We will remove "outlier" decedents with estimated propensity scores that do not overlap with the other group of decedents.

Covariate balance and matched-sampling strategies
We will assess balance in background covariates between the two comparative groups (acute decedents vs. post-acute decedents without PASC, and post-acute decedents with vs. without PASC), using standard causal inference techniques (e.g., "Love" plots - Ahmed et al., 2006) and use matched-sampling approaches (e.g., estimated propensity score caliper matching - Rosenbaum and Rubin, 1983) to reduce any covariate imbalance (e.g., Stuart, 2010). At this stage, a hypothetical exposure assignment mechanism consistent with the data and the matched-sampling strategy (e.g., completely randomized for a matched data set designed using the estimated propensity score) can be assumed (Bind and Rubin, 2019; Branson, 2021).
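Caliper matching on the estimated propensity score can be sketched as a greedy one-to-one nearest-neighbor search that rejects pairs farther apart than the caliper. This is a simplified illustration with hypothetical names and scores. It matches on the raw score scale and processes units in input order, whereas practical implementations often match on the logit scale and may use optimal rather than greedy matching.

```python
def caliper_match(scores_a, scores_b, caliper):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns a list of (index_in_a, index_in_b) pairs whose propensity
    scores differ by at most `caliper`; unmatched units are discarded.
    """
    available = set(range(len(scores_b)))
    pairs = []
    for i, s in enumerate(scores_a):
        if not available:
            break
        # Closest remaining unit in group b; ties broken by lower index.
        j = min(available, key=lambda k: (abs(scores_b[k] - s), k))
        if abs(scores_b[j] - s) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Hypothetical estimated propensity scores.
with_pasc = [0.30, 0.50, 0.90]
without_pasc = [0.32, 0.48, 0.55, 0.10]
matched_pairs = caliper_match(with_pasc, without_pasc, caliper=0.05)
print(matched_pairs)
```

In this example the third unit in the first group has no counterpart within the caliper and is dropped, which is how matching can discard some decedents' data at the design stage.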

Fisherian inference
We will test the sharp null hypotheses described in Section 3.3.1 and report unadjusted and adjusted (Lee et al., 2016) two-sided Fisher p-values. Recall, for the comparison of acute decedents and post-acute decedents without PASC, the sharp null hypotheses will state that for each decedent, the presence and type of organ injury / disease, pathological features, and distinct phenotypes are the same had the decedent been an acute decedent or a post-acute decedent without PASC. Similarly, for the comparison of post-acute decedents with and without PASC, the sharp null hypotheses will state that for each post-acute decedent, the presence and type of organ injury / disease, pathological features, and distinct phenotypes are the same, whether the decedent had PASC or not.
For each outcome (e.g., primary endpoints described in Section 2.5.5), we will calculate Fisher p-values based on the assumed, hypothetical exposure assignment mechanism implied by the data and the matched-sampling strategy (Bind and Rubin, 2019; Branson, 2021). For test statistics, we will use standard ones (e.g., proportion difference, Chi-squared test statistic), and less commonly-used ones (e.g., estimated coefficient from a multivariable regression model - Brillinger et al., 1978; Bind and Rubin, 2019; Sommer et al., 2021). Following the advice of Wasserstein et al. (2019), we will "not base our conclusions solely on whether an association or effect was found to be 'statistically significant' (i.e., the p-value passed some arbitrary threshold such as p < 0.05)", but rather also explore Fisherian intervals (e.g., Imbens and Rubin, 2015; Volfovsky et al., 2015) and consider model-based inference.

Model-based inference
For each type of organ injury / disease evaluated, statistical analysis will be limited to the set of decedents who did not have any record of the given organ injury / disease prior to their SARS-CoV-2 infection (the first SARS-CoV-2 infection if there is evidence of multiple infections). Note that for severe inflammation, sepsis, liver, kidney, muscle, and heart failures, the latest laboratory records (e.g., white blood cell count, creatinine levels) prior to the SARS-CoV-2 infection will be used to exclude decedents with prior pathological outcomes from the relevant statistical analyses. Prevalence of organ injury / disease, pathological features, and distinct phenotypes will be estimated using data from the acute and post-acute decedents with and without PASC. We will provide two-sided uncertainty intervals (e.g., 95% CIs) for the difference in proportions using standard large-sample approximations for the estimated variances of estimated proportion differences. Types of organ injury / disease, pathological features, and distinct phenotypes will be similarly compared using univariate, large-sample standard tests (e.g., Chi-squared test).
Pathological outcomes will also be compared (i) between acute decedents and post-acute decedents without PASC, and (ii) between post-acute decedents with and without PASC using simple (for binary outcomes), multinomial (for non-ordinal categorical outcomes), and cumulative (for ordinal outcomes) multivariable logistic regressions, as well as multivariable linear regressions (for continuous outcomes).
We will provide two-sided uncertainty intervals (e.g., 95% CIs) for the estimands involving these types of outcomes.

Scientific hypotheses related to Aim 2
The following hypotheses will be addressed in the analytic approach to Aim 2: Hypothesis 2a: The associations of COVID-19 disease severity and treatment with pathological findings differ in decedents with prior SARS-CoV-2 infection (dying more than 30 days after SARS-CoV-2 infection) with and without PASC.
Hypothesis 2b: The associations of pre-infection and peri-infection risk and resiliency factors (e.g., demographic, biological factors, pre-existing clinical comorbidities) with pathological findings differ in decedents with prior SARS-CoV-2 infection (dying more than 30 days after SARS-CoV-2 infection) with and without PASC.

Overview
We will test hypotheses 2a and 2b. While stratifying on PASC status, we will also estimate conditional associations between (i) COVID-19 disease severity and pathological findings, and between (ii) pre-infection and peri-infection risk and resiliency factors (e.g., demographic, biological factors, pre-existing clinical comorbidities, and acute infection treatment) and pathological findings. We will then compare these conditional associations between post-acute decedents with and without PASC. We will also stratify our analyses by calendar quarter of enrollment or by calendar quarter of initial infection to examine possible effect modification by time of enrollment or by time of infection.

Overlap assessment
We will assess the overlap in covariate distributions between the comparative groups (e.g., for Hypothesis 2a, post-acute decedents with severe COVID-19 and with PASC vs. post-acute decedents with severe COVID-19 and without PASC vs. post-acute decedents without severe COVID-19 and with PASC vs. post-acute decedents without severe COVID-19 and without PASC) as described in Section 3.3.3.

Covariate balance and matched-sampling
We will assess covariate balance between groups (e.g., defined by COVID-19 disease severity and PASC status) using standard causal inference techniques (e.g., "Love" plots) and use matched-sampling approaches (e.g., estimated propensity score matching) to reduce any covariate imbalance.

Fisherian inference
We will test the sharp null hypotheses described in Section 3.4.1 and report unadjusted and adjusted (Lee et al., 2016) two-sided Fisher p-values. Recall that for Hypotheses 2a and 2b, we will examine whether the associations between exposures (e.g., COVID-19 disease severity, risk factors) and pathological findings are modified by PASC status. For example, for the comparison of post-acute decedents with severe COVID-19 and with PASC vs. post-acute decedents with severe COVID-19 and without PASC vs. post-acute decedents without severe COVID-19 and with PASC vs. post-acute decedents without severe COVID-19 and without PASC, the sharp null hypotheses will state that for each post-acute decedent, the pathological findings are the same regardless of COVID-19 severity (from the first SARS-CoV-2 infection if there is evidence of multiple infections) and regardless of PASC status. For each outcome, we will follow the principles of Fisherian inference as described in Section 3.4.5.

Model-based inference
We will examine whether the conditional associations between exposures of interest (e.g., for Hypothesis 2a, COVID-19 disease severity) and pathological findings are different for post-acute decedents with and without PASC by considering multivariable models regressing pathological findings outcomes on the exposures of interest (e.g., for Hypothesis 2a, COVID-19 disease severity) adjusting for background covariates (e.g., age, race / ethnicity) after stratifying by PASC status. If the estimated regression coefficients for the background covariates are similar for post-acute decedents with and without PASC, we will pool all post-acute decedents (i.e., with and without PASC) and consider a regression model estimating, e.g., for Hypothesis 2a, main effects on pathological findings of COVID-19 disease severity and of PASC status, as well as an interaction term between COVID-19 disease severity and PASC status. We will estimate measures of effect modifications (or interactions) and their associated two-sided uncertainty intervals (e.g., 95% CIs) using the relevant multivariable regression models. Note that for Hypothesis 2b, we will estimate measures of effect modifications (or interactions) and associated uncertainty intervals (e.g., 95% CIs) using multivariable models regressing pathological outcomes on pre-infection and peri-infection risk and resiliency factors (e.g., demographic, biological factors, pre-existing clinical comorbidities, and acute infection treatment), again stratifying by or adjusting for PASC status.
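With a binary exposure, a binary PASC indicator, and no other covariates, the interaction coefficient of a saturated linear model equals a difference of differences in cell means. The sketch below computes that contrast directly. It omits the background-covariate adjustment described above, and the function name and data are hypothetical.

```python
from statistics import mean


def interaction_contrast(records):
    """Stratum-specific effects and their difference (the interaction).

    records: iterable of (severe, pasc, outcome) with 0/1 indicators.
    Returns (effect among PASC, effect among non-PASC, interaction).
    """
    cells = {}
    for severe, pasc, y in records:
        cells.setdefault((severe, pasc), []).append(y)
    m = {key: mean(values) for key, values in cells.items()}
    effect_with_pasc = m[(1, 1)] - m[(0, 1)]
    effect_without_pasc = m[(1, 0)] - m[(0, 0)]
    return effect_with_pasc, effect_without_pasc, effect_with_pasc - effect_without_pasc


# Hypothetical continuous pathology scores.
data = [
    (0, 0, 1.0), (0, 0, 3.0),
    (1, 0, 3.0), (1, 0, 5.0),
    (0, 1, 2.0), (0, 1, 4.0),
    (1, 1, 7.0), (1, 1, 9.0),
]
contrast = interaction_contrast(data)
print(contrast)
```

The first two returned values correspond to the stratified analyses, and the third to the interaction term of the pooled model.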
3.5 Analytic approach to Aim 3

Scientific hypotheses related to Aim 3
The following hypotheses will be addressed in the analytic approach to Aim 3: Hypothesis 3a: The estimated direct effects of clinical risk factors (e.g., comorbidities, infection severity, hospitalization, intubation, steroid treatment) on pathological findings differ in post-acute decedents with and without PASC.
Hypothesis 3b: The estimated indirect effects of clinical risk factors (e.g., comorbidities, infection severity, hospitalization, intubation, steroid treatment) on pathological findings via viral persistence (investigated systematically via molecular methods) differ in post-acute decedents with and without PASC.

Overview
Risk factors for PASC could include viral persistence. We will examine whether clinical risk factors (e.g., comorbidities, infection severity, hospitalization, intubation, or treatments such as steroids) contribute to viral persistence (measured by ddPCR and RNA in situ hybridization) while stratifying on PASC status. We will conduct mediation analyses to test Hypotheses 3a and 3b and to estimate the direct and indirect effects (and uncertainty intervals such as 95% CIs) of clinical risk factors (e.g., comorbidities, infection severity, hospitalization, intubation, steroid treatment) on pathological outcomes via viral persistence (the possible mediator), in post-acute decedents with and without PASC. If possible, we will also stratify our analyses by calendar quarter of enrollment or by calendar quarter of initial infection to examine possible effect modification by time of enrollment or by time of infection.
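The SAP does not specify the mediation method. As one possible illustration, the sketch below uses the classical linear product-of-coefficients decomposition, in which the total effect of a risk factor equals the direct effect plus the indirect effect through the mediator. It assumes linear models, no exposure-mediator interaction, and no additional confounders. All names and data are hypothetical, with the mediator standing in for a continuous viral persistence measure.

```python
from statistics import mean


def _center(values):
    m = mean(values)
    return [v - m for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linear_mediation(x, m, y):
    """Return (direct, indirect, total) effects of x on y through mediator m.

    direct:   coefficient of x in the least-squares fit of y on x and m
    indirect: (slope of m on x) * (coefficient of m in y on x and m)
    total:    slope of y on x alone
    """
    xc, mc, yc = _center(x), _center(m), _center(y)
    sxx, smm, sxm = _dot(xc, xc), _dot(mc, mc), _dot(xc, mc)
    sxy, smy = _dot(xc, yc), _dot(mc, yc)
    a = sxm / sxx                           # exposure -> mediator
    det = sxx * smm - sxm ** 2              # determinant of the normal equations
    direct = (smm * sxy - sxm * smy) / det  # exposure -> outcome, given mediator
    b = (sxx * smy - sxm * sxy) / det       # mediator -> outcome, given exposure
    total = sxy / sxx
    return direct, a * b, total


# Hypothetical data: x = steroid treatment (0/1), m = viral load measure, y = pathology score.
x = [0, 0, 0, 0, 1, 1, 1, 1]
m = [1.0, 2.0, 1.5, 2.5, 3.0, 4.0, 3.5, 4.5]
y = [1.6, 1.9, 1.65, 2.35, 4.6, 4.9, 4.65, 5.35]
direct, indirect, total = linear_mediation(x, m, y)
print(round(direct, 3), round(indirect, 3), round(total, 3))
```

Running the same decomposition separately among post-acute decedents with and without PASC would give the stratum-specific direct and indirect effects that Hypotheses 3a and 3b compare.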

Overlap assessment
We will assess the overlap in covariate distributions between the comparative groups (e.g., post-acute decedents with steroid treatment and with PASC vs. post-acute decedents with steroid treatment and without PASC vs. post-acute decedents without steroid treatment and with PASC vs. post-acute decedents without steroid treatment and without PASC) as described in Section 3.3.3.

Covariate balance and matched-sampling
We will assess covariate balance between groups defined by the clinical risk factor and PASC status using standard causal inference techniques (e.g., "Love" plots) and use matched-sampling approaches (e.g., estimated propensity score matching) to reduce any covariate imbalance.
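A "Love" plot typically displays one standardized mean difference per covariate, before and after matching. The sketch below shows that quantity for a single covariate. The function name and the ages are hypothetical illustrations, not study data, and the pooled-standard-deviation convention used here is one common choice rather than one the protocol specifies.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2)
    if pooled_sd == 0:
        raise ValueError("covariate has no variation in either group")
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical ages at death in two comparison groups (not study data).
age_steroid = [61, 70, 58, 66, 73, 64]
age_no_steroid = [59, 68, 75, 71, 80, 77]

smd_age = standardized_mean_difference(age_steroid, age_no_steroid)
print(f"SMD for age: {smd_age:.2f}")
```

Absolute values near zero indicate good balance. Any cut-off used to flag imbalance would be an analyst's choice and is not stated in this section.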

Fisherian inference
We will test the sharp null hypotheses described in Section 3.5.1 and report unadjusted and adjusted (Lee et al., 2016) two-sided Fisher p-values. For example, for the comparison of post-acute decedents with steroid treatment and with PASC vs. post-acute decedents with steroid treatment and without PASC vs. post-acute decedents without steroid treatment and with PASC vs. post-acute decedents without steroid treatment and without PASC in Hypothesis 3a, the sharp null hypotheses will state that for each post-acute decedent, the pathological findings are the same regardless of steroid treatment (related to the first SARS-CoV-2 infection if there is evidence of multiple infections) and regardless of PASC status. For each outcome, we will follow the principles of Fisherian inference as described in Section 3.4.5.

Model-based inference
We will examine whether the conditional associations between the clinical risk factors (e.g., comorbidities, infection severity, hospitalization, intubation, steroid treatment) and pathological findings are different for post-acute decedents with and without PASC by considering multivariable models regressing pathological findings outcomes on the clinical risk factors adjusting for background covariates (e.g., age, race / ethnicity) after stratifying by PASC status. If the estimated regression coefficients for the background covariates are similar for post-acute decedents with and without PASC, we will pool all post-acute decedents (i.e., with and without PASC) and consider regression models estimating main effects of clinical risk factors and of PASC status, as well as interaction terms between clinical risk factors and PASC status. We will estimate measures of effect modifications (or interactions) and their associated two-sided uncertainty intervals (e.g., 95% CIs) using the relevant multivariable regression models.
We will estimate mediated effects and uncertainty intervals (e.g., 95% CIs) from multivariable models regressing: (i) pathological findings outcomes on clinical risk factors (e.g., comorbidities, infection severity, hospitalization, intubation, steroid treatment) and on viral persistence, and (ii) viral persistence on clinical risk factors (e.g., comorbidities, infection severity, hospitalization, intubation, steroid treatment), after stratifying on PASC status (e.g., VanderWeele, 2015). If the estimated regression coefficients for the background covariates are similar for post-acute decedents with and without PASC, we will pool all post-acute decedents (i.e., with and without PASC) and consider regression models with the relevant main effects and interaction terms.
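For orientation, the two regressions (i) and (ii) can be written in the simplest linear form used in the regression-based mediation literature. The notation is an assumption introduced here: A is a clinical risk factor, M is viral persistence, Y is a pathological finding, and C is the vector of background covariates. The sketch further assumes continuous Y and M, no exposure-mediator interaction, and no unmeasured confounding of the exposure-outcome, mediator-outcome, and exposure-mediator relationships.

```latex
\begin{aligned}
\mathbb{E}[Y \mid A=a, M=m, C=c] &= \theta_0 + \theta_1 a + \theta_2 m + \theta_4^{\top} c
  && \text{(outcome model, i)} \\
\mathbb{E}[M \mid A=a, C=c] &= \beta_0 + \beta_1 a + \beta_2^{\top} c
  && \text{(mediator model, ii)} \\
\text{direct effect}   &= \theta_1\,(a - a^{*}) \\
\text{indirect effect} &= \theta_2\,\beta_1\,(a - a^{*})
\end{aligned}
```

Here a and a* are the two levels of the risk factor being compared. Fitting both models within each PASC stratum gives stratum-specific direct and indirect effects, which is the comparison Hypotheses 3a and 3b call for. Dichotomous outcomes or mediators, or an exposure-mediator interaction, would require different expressions.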
3.6 Analytic approach to Aim 4

Scientific hypotheses related to Aim 4
The following hypothesis will be addressed in the analytic approach to Aim 4:
Hypothesis 4: The decedent's PASC status is associated with the presence of new lesion(s) in the CNS.

Overview
We will evaluate data from decedents who had both pre-infection and post-mortem brain MRIs to test Hypothesis 4 and to estimate the conditional association between PASC status and the presence of new lesion(s) in the CNS adjusting for background covariates (e.g., sex, age, race / ethnicity). If possible, we will also stratify our analyses by calendar quarter of enrollment or by calendar quarter of initial infection to examine possible effect modification by time of enrollment or by time of infection.

Overlap assessment
We will assess the overlap in covariate distributions between the two comparative groups (i.e., post-acute decedents with vs. without PASC) as described in Section 3.3.3.

Covariate balance and matched-sampling
We will assess covariate balance between post-acute decedents with and without PASC using standard causal inference techniques (e.g., "Love" plots) and use matched-sampling approaches (e.g., propensity score matching) to reduce any covariate imbalance.
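One simple matched-sampling scheme is greedy one-to-one nearest-neighbor matching on the estimated propensity score with a caliper. The sketch below assumes the propensity scores have already been estimated. The function name, identifiers, scores, and caliper width are hypothetical, and the protocol does not state which matching algorithm will be used.

```python
def greedy_caliper_match(treated, control, caliper=0.1):
    """Pair each treated unit with the nearest unused control within the caliper.

    treated, control: dicts mapping a unit id to an estimated propensity score.
    Returns a list of (treated_id, control_id) pairs; unmatched units are dropped.
    """
    available = dict(control)  # controls not yet used in a pair
    pairs = []
    # Process treated units in order of increasing propensity score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs


# Hypothetical estimated propensity scores (not study data).
pasc = {"P1": 0.62, "P2": 0.35, "P3": 0.90}
no_pasc = {"N1": 0.30, "N2": 0.60, "N3": 0.41, "N4": 0.10}

matched_pairs = greedy_caliper_match(pasc, no_pasc, caliper=0.1)
print(matched_pairs)
```

In this toy input, the decedent with score 0.90 has no comparison unit within the caliper and is left unmatched. This shows how matching trades sample size for balance, a relevant consideration for the small subset with both pre-infection and post-mortem MRIs.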

Fisherian inference
We will test the sharp null hypothesis in Section 3.6.1 and report unadjusted and adjusted (Lee et al., 2016) two-sided Fisher p-values. For the comparison of post-acute decedents with and without PASC, the sharp null hypothesis will state that for each post-acute decedent, the presence of new lesion(s) in the CNS does not depend on PASC status. We will follow the principles of Fisherian inference as described in Section 3.4.5.
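The logic of a Fisher p-value under a sharp null can be sketched with a Monte Carlo randomization test: if the outcome of each decedent is the same regardless of group, shuffling the group labels generates the null distribution of the test statistic. This is a generic sketch, not the procedure of Section 3.4.5. The function name, the difference in proportions as test statistic, the complete (unstratified) shuffling, and the data are all assumptions, and the adjustment of Lee et al. (2016) is not implemented.

```python
import random


def fisher_randomization_p_value(outcomes, labels, n_draws=10000, seed=0):
    """Two-sided Monte Carlo p-value for a difference in proportions.

    outcomes: 0/1 indicators (e.g., new CNS lesion present).
    labels:   0/1 group indicators (e.g., PASC status); both groups non-empty.
    """
    def diff(lab):
        g1 = [y for y, g in zip(outcomes, lab) if g == 1]
        g0 = [y for y, g in zip(outcomes, lab) if g == 0]
        return sum(g1) / len(g1) - sum(g0) / len(g0)

    rng = random.Random(seed)
    observed = abs(diff(labels))
    shuffled = list(labels)
    extreme = 0
    for _ in range(n_draws):
        rng.shuffle(shuffled)  # keeps group sizes fixed
        if abs(diff(shuffled)) >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return (extreme + 1) / (n_draws + 1)


# Hypothetical data: 10 decedents per group (not study data).
lesion = [1] * 10 + [0] * 10
pasc_status = [1] * 10 + [0] * 10
p_value = fisher_randomization_p_value(lesion, pasc_status)
```

After matched sampling, the shuffling would normally be restricted to labels within each matched set so that the reference distribution respects the design.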

Model-based inference
We will estimate the conditional association (and associated two-sided uncertainty intervals such as 95% CIs) between presence of new lesion(s) in the CNS and PASC status from a logistic model regressing the presence of new lesion(s) in the CNS on PASC status adjusting for background covariates (e.g., sex, age, race / ethnicity).

Methods for missing data
Some level of missingness is anticipated for all variables being collected in this study. The primary reason for missing data will be lack of records (e.g., demographics, laboratory values) in the EHR database. Throughout, if baseline values (e.g., demographics) are missing, we will first restrict analyses to decedents with complete data and then repeat the analyses adjusting for missing data using Multiple Imputation by Chained Equations (MICE; van Buuren and Groothuis-Oudshoorn, 2011). When it is suspected that covariates or outcome variables are missing not at random, we will conduct sensitivity analyses considering a range of possible deviations from the missing at random assumption (e.g., "tipping point" analysis; Liublinska and Rubin, 2015).
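The idea behind a tipping-point analysis can be sketched for a dichotomous outcome with some missing values in each of two groups: every possible allocation of the missing outcomes to "event" or "non-event" is enumerated, and the analysis is repeated to see where the conclusion changes. The sketch below is a simplified illustration. The function names, the pooled two-proportion z-test, the 0.05 threshold, and the counts are assumptions and are not taken from the protocol.

```python
from math import erfc, sqrt


def two_proportion_p_value(events_1, n_1, events_0, n_0):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (events_1 + events_0) / (n_1 + n_0)
    se = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_0))
    if se == 0:
        return 1.0
    z = (events_1 / n_1 - events_0 / n_0) / se
    return erfc(abs(z) / sqrt(2))


def tipping_point_grid(obs_1, miss_1, obs_0, miss_0, alpha=0.05):
    """Map (k1, k0) to whether the test is significant at level alpha.

    obs_g:  (events, non_events) among observed outcomes in group g.
    miss_g: number of missing outcomes in group g.
    k_g:    number of missing outcomes in group g imputed as events.
    """
    n_1 = sum(obs_1) + miss_1
    n_0 = sum(obs_0) + miss_0
    grid = {}
    for k1 in range(miss_1 + 1):
        for k0 in range(miss_0 + 1):
            p = two_proportion_p_value(obs_1[0] + k1, n_1, obs_0[0] + k0, n_0)
            grid[(k1, k0)] = p < alpha
    return grid


# Hypothetical counts (not study data): 5 missing outcomes per group.
grid = tipping_point_grid(obs_1=(8, 2), miss_1=5, obs_0=(4, 6), miss_0=5)
not_significant = sorted(cell for cell, sig in grid.items() if not sig)
```

The cells in `not_significant` are the imputation scenarios under which the finding would no longer hold. Judging how plausible those scenarios are is a substantive question and not a statistical one.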

Communications
A Manual of Procedures (MOP) will be provided to study coordinators. The MOP will contain information on how to screen decedents and obtain informed consent, and how to use the Electronic Data Capture (EDC) system and case report forms (CRFs). Site staff will be prepared for study conduct, including training in the protocol and EDC procedures, review of Good Clinical Practices, and review of Health Insurance Portability and Accountability Act (HIPAA) regulations. The clinical sites will be trained to comply with reporting requirements to promote standardization of study-related procedures. The REDCap EDC system integrates site onboarding, training, study documents, and resources in a central location accessible by authenticated users. A distributed communication network (email list server, website) will be used. The website will have a public section with general information about study protocols and a password-protected section for study staff with study documents (e.g., protocol, MOP), contact information, meeting minutes and agendas, and other important study resources. The RECOVER DRC will monitor screening and enrollment throughout the study. If concerns arise, the DRC and CSC will work with sites to ensure sufficient recruitment.

Observational Study Monitoring Board
Oversight of data and safety for all three RECOVER cohort studies will be provided by a PASC OSMB appointed by the National Heart, Lung, and Blood Institute (NHLBI). Relevant to the autopsy study, the OSMB will meet at least twice a year to review data quality and study recruitment as described in the committee charter, and make recommendations about study conduct to the NHLBI. As the RECOVER study does not involve any interventions, an early stopping rule for efficacy or futility is not indicated.
After each OSMB meeting, the OSMB determination letter and a summary report of adverse events will be prepared within 30 days and will be distributed by NHLBI staff to each principal investigator and study coordinator for review. The summary report will contain the following information:
• A statement that an OSMB review of outcome data and information relating to study performance across all centers took place on a given date;
• A statement that a review of recent literature relevant to the research took place;
• The OSMB's recommendation with respect to progress or need for modification of the protocol or informed consent. If the OSMB recommends changes to the protocol or informed consent, the rationale for such changes and any relevant data will be provided;
• A statement that if concerns are identified, the NHLBI Program Official will communicate these promptly to the investigators.

Figure 1: Stages of the Tissue Pathology Study

Table 1. Sample size for the RECOVER Tissue Pathology Study by PASC status
Aim 1: Characterize the prevalence and types of organ injury/disease, pathological features, and distinct phenotypes in decedents with acute SARS-CoV-2 infection (dying 15-30 days after SARS-CoV-2 infection), and in decedents with prior SARS-CoV-2 infection (dying >30 days after SARS-CoV-2 infection) with and without PASC.

Table 2 provides detectable effect sizes and detectable differences in proportions between decedents with and without PASC for quantitative and dichotomous pathological features. Results are given without and with a multiple comparison adjustment (assuming 50 tests, alpha = 0.001 as a conservative level of control).
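The adjusted level of 0.001 is what dividing a family-wise level of 0.05 equally across 50 tests would give. That Bonferroni-style reading is an assumption, since the text does not name the adjustment. The sketch below shows how a detectable standardized effect size for a quantitative feature grows under the stricter level, using a normal-approximation two-sample formula. The function names, group sizes, and 80% power are hypothetical choices and are not the values behind Table 2.

```python
from statistics import NormalDist


def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level under an equal Bonferroni split."""
    return family_alpha / n_tests


def detectable_effect_size(n_1, n_0, alpha, power=0.80):
    """Smallest standardized mean difference detectable by a two-sided z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * (1 / n_1 + 1 / n_0) ** 0.5


# Hypothetical group sizes (not the study's planned sample sizes).
n_pasc, n_no_pasc = 100, 100

alpha_adj = bonferroni_alpha(0.05, 50)
d_unadjusted = detectable_effect_size(n_pasc, n_no_pasc, alpha=0.05)
d_adjusted = detectable_effect_size(n_pasc, n_no_pasc, alpha=alpha_adj)
print(f"alpha = 0.05: d = {d_unadjusted:.2f}; alpha = {alpha_adj:.3f}: d = {d_adjusted:.2f}")
```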