Cardiovascular vulnerability predicts hospitalisation in primary care clinically suspected and confirmed COVID-19 patients: A model development and validation study

Objectives Cardiovascular conditions were shown to be predictive of clinical deterioration in hospitalised patients with coronavirus disease 2019 (COVID-19). Whether this also holds for outpatients managed in primary care is yet unknown. The aim of this study was to determine the incremental value of cardiovascular vulnerability in predicting the risk of hospital referral in primary care COVID-19 outpatients. Design Analysis of anonymised routine care data extracted from electronic medical records from three large Dutch primary care registries. Setting Primary care. Participants Consecutive adult patients seen in primary care for COVID-19 symptoms in the ‘first wave’ of COVID-19 infections (March 1 2020 to June 1 2020) and in the ‘second wave’ (June 1 2020 to April 15 2021) in the Netherlands. Outcome measures A multivariable logistic regression model was fitted to predict hospital referral within 90 days after first COVID-19 consultation in primary care. Data from the ‘first wave’ was used for derivation (n = 5,475 patients). Age, sex, the interaction between age and sex, and the number of cardiovascular conditions and/or diabetes (0, 1, or ≥2) were pre-specified as candidate predictors. This full model was (i) compared to a simple model including only age and sex and its interaction, and (ii) externally validated in COVID-19 patients during the ‘second wave’ (n = 16,693). Results The full model performed better than the simple model (likelihood ratio test p<0.001). Older male patients with multiple cardiovascular conditions and/or diabetes had the highest predicted risk of hospital referral, reaching risks above 15–20%, whereas on average this risk was 5.1%. The temporally validated c-statistic was 0.747 (95%CI 0.729–0.764) and the model showed good calibration upon validation. Conclusions For patients with COVID-19 symptoms managed in primary care, the risk of hospital referral was on average 5.1%. 
Older, male, and cardiovascularly vulnerable COVID-19 patients are at higher risk of hospital referral.


Introduction
Coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2, has led to a global pandemic since the first cases were described in late 2019. Despite clear improvement in terms of vaccination efficacy and treatment options for hospitalised patients, COVID-19 remains a global public health burden. For instance, at present, vaccination coverage is still low in many countries (particularly low- and middle-income countries). Moreover, waning immunity after vaccination is already observed, notably in vulnerable individuals. In addition, newly emerging virus variants, such as omicron, escape protective immunity following vaccination, and finally, a substantial part of the population (5-10% in highly vaccinated countries) refuses vaccination altogether for a variety of reasons. Thus, inevitably, global circulation of SARS-CoV-2 will continue, with new (seasonal) outbreaks likely to occur. COVID-19 will keep influencing the organisation of healthcare worldwide in the upcoming years, perhaps decades.
It is therefore pivotal to increase our knowledge of this relatively new disease, for instance to learn how to orchestrate the flow of patients depending on the expected course or natural prognosis of the disease. Indeed, a fast-growing number of studies have evaluated prognostic factors in COVID-19 patients. However, most of these studies focus on an in-hospital population, with only a few focusing on outpatient management [1]. This is unfortunate, as (suspected) COVID-19 typically starts with mild to moderate symptoms in the first week of illness, and only some patients progress to hypoxemia for which hospital (or even ICU) admission is needed [2]. If such deterioration occurs, patients in the Netherlands, as in many countries worldwide, are seen in a primary care setting. The primary care physician is therefore often the first to decide on the need for and optimal timing of more impactful measures, such as intensified monitoring, prescription of budesonide inhalation or perhaps novel virus inhibitors, or ultimately referral for hospitalisation [3,4]. Unfortunately, evidence-based tools or knowledge to help primary care physicians triage COVID-19 patients, and to distinguish patients in need of referral from those in whom a relatively benign trajectory is to be expected, are currently lacking.
From studies in hospitalised COVID-19 patients, we know that underlying cardiovascular diseases are strong predictors of further deterioration towards ICU admission or death [5][6][7][8][9], an association that remains relevant in vaccinated populations [10]. If pre-existing cardiovascular disease and/or diabetes also increases the risk of clinical deterioration in primary care, this knowledge could help guide and orchestrate outpatient management of COVID-19. The aim of this study was therefore to determine the incremental prognostic value of cardiovascular vulnerability, defined by the number of cardiovascular diseases and/or type 2 diabetes mellitus, in predicting the risk of escalation of care (i.e. hospital referral) in primary care patients with clinically suspected or confirmed COVID-19.

Study design
This study involves an analysis of anonymised observational electronic medical record data of community-dwelling people registered with a primary care physician who had either confirmed or clinically suspected COVID-19. We assessed the incremental value of cardiovascular disease and/or diabetes by developing a prognostic prediction model in a cohort of patients from the 'first wave' of COVID-19 infections in the Netherlands (March 1 2020 to June 1 2020) and temporally validating it in a cohort of patients from the 'second wave' of infections in the Netherlands (June 1 2020 to April 15 2021). Where appropriate for this study, we adhered to the TRIPOD guideline for reporting prediction models [11].

Databases
Patients were included from three similar ongoing and dynamic primary care databases run by the academic hospitals of the cities and surrounding municipalities of Utrecht and Amsterdam, containing pseudonymised medical data of approximately 850,000 patients in total: the Julius General Practitioner's Network (JGPN) University Medical Center Utrecht, the Academic Network of General Practice at VU University medical center in Amsterdam (ANH VUmc), and the Academic General Practitioner's Network at Academic Medical Center Amsterdam (AHA AMC) [12][13][14]. Two databases (JGPN and ANH VUmc) were used to identify patients for the development of the prediction model (i.e. development cohort) and all three databases (JGPN, ANH VUmc and AHA AMC) were used to identify patients for the temporal validation (i.e. the validation cohort).

Study population and data collection
Detailed information on how data were collected for our study population is described in S1. In short, patients for the development cohort were included from March 1 2020 to June 1 2020 (the 'first wave' of the COVID-19 pandemic in the Netherlands). During this time period, very limited polymerase chain reaction (PCR) testing for COVID-19 was available, and it was mainly restricted to more severe, hospitalised cases. Consequently, many symptomatic patients consulting their primary care physician with symptoms highly suggestive of COVID-19 were not tested. We therefore included all consecutive adult patients aged 18 years or older who visited their primary care physician with confirmed or suspected (based upon clinical symptoms) COVID-19.
For the validation cohort, consecutive adult patients from the 'second wave' of COVID-19 infections were included (data from June 1 2020 until April 15 2021, see S1). At this point in time, the Dutch government made PCR COVID-19 tests freely available and these were recommended for all symptomatic subjects in the Netherlands and for those who were in close contact with a confirmed COVID-19 patient. Moreover, at that time GPs were instructed to uniformly code confirmed cases in their medical records using standardized coding. Thus, only confirmed COVID-19 cases were included in the cohort for validation of the model.

Outcome
The primary outcome of the prediction model in this study was referral to an emergency ward for intended hospital admission. This was defined as any clinical deterioration resulting in hospital referral by the primary care physician that was recorded as such in the consultation annotation (free text) of the medical record. To capture the full spectrum of complications of COVID-19 resulting in hospitalisation, follow-up lasted 90 days after first consultation for COVID-19 suspected symptoms. To this end, all anonymised consultation texts were manually screened for any emergency hospital referral by (primary care) clinical scientists (FSvR, LPTJ, SvD, and GJG) and cases of doubt were discussed, until consensus was reached.

Candidate predictors
Based upon existing literature from hospitalised COVID-19 patients, we specified the following candidate predictors a priori: age, sex, the interaction between age and sex, and the number of cardiovascular diseases. The latter comprised (a history of) type 2 diabetes mellitus, heart failure, coronary artery disease, peripheral arterial disease, stroke/transient ischemic attack (TIA), venous thromboembolism (VTE; pulmonary embolism or deep venous thrombosis), and/or atrial fibrillation (AF). Presence of these diseases was based on the corresponding disease coding (S1 Table) at any point before the first COVID-19 consultation in the patient's medical record. The number of cardiovascular diseases was counted per patient and categorised as: no cardiovascular disease, one cardiovascular disease, or two or more cardiovascular diseases.
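As an illustration of this categorisation, the following minimal Python sketch counts the listed conditions in a patient's history and maps the count to the three categories. The condition labels and the function name are hypothetical; the study itself identified conditions from registry disease codes (S1 Table), which are not reproduced here.

```python
# Hypothetical labels for the seven conditions named in the text.
# The study used registry disease codes (S1 Table), not these strings.
CARDIOVASCULAR_CONDITIONS = frozenset({
    "type_2_diabetes",
    "heart_failure",
    "coronary_artery_disease",
    "peripheral_arterial_disease",
    "stroke_tia",
    "venous_thromboembolism",
    "atrial_fibrillation",
})


def cardiovascular_category(history):
    """Map a patient's recorded conditions to '0', '1' or '>=2'.

    Conditions outside the predefined list are ignored.
    """
    count = len(CARDIOVASCULAR_CONDITIONS & set(history))
    if count == 0:
        return "0"
    if count == 1:
        return "1"
    return ">=2"


if __name__ == "__main__":
    print(cardiovascular_category({"asthma"}))
    print(cardiovascular_category({"type_2_diabetes"}))
    print(cardiovascular_category({"heart_failure", "atrial_fibrillation"}))
```

Using a set intersection means each condition is counted once per patient, regardless of how often it appears in the record.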

Sample size
The model development cohort yielded 5,475 eligible patients with an event fraction of 0.068 (6.8%, n = 373) for the primary outcome, referral to the hospital. Prior to the prediction analysis, the number of allowed candidate predictors was determined. Based on the sample size calculation for prediction modelling proposed by Riley et al. [15], the maximum number of candidate predictors that could be modelled was 30, assuming a Cox-Snell R² (R²cs) of 0.0495. As this R²cs was estimated in the absence of a known value, we varied R²cs from 0.0395 to 0.0595, which yielded a minimum of 24 and a maximum of 37 candidate predictors, including interaction terms. With the candidate predictors age, sex, the interaction between age and sex, and the number of cardiovascular diseases in three categories, the sample size of 5,475 eligible patients was deemed sufficient for model development.
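The reported numbers can be reproduced with a short calculation, sketched below in Python. It assumes that the criterion applied was the shrinkage-based criterion of Riley et al., n ≥ p / ((S − 1) ln(1 − R²cs/S)), with a target shrinkage factor S of 0.9. The text does not state which criterion or which value of S was used, so both are assumptions, and the function name is illustrative.

```python
import math


def max_candidate_predictors(n, r2_cs, shrinkage=0.9):
    """Largest number of predictor parameters p for which
    n >= p / ((S - 1) * ln(1 - R2cs / S)).

    `shrinkage` (S) is an assumed target shrinkage factor of 0.9;
    the paper does not report the value that was used.
    """
    parameters_per_patient = (shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage)
    return math.floor(n * parameters_per_patient)


if __name__ == "__main__":
    for r2_cs in (0.0395, 0.0495, 0.0595):
        print(r2_cs, max_candidate_predictors(5475, r2_cs))
```

Under these assumptions, the sketch returns 24, 30 and 37 parameters for R²cs values of 0.0395, 0.0495 and 0.0595, in line with the values reported above.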

Missing data
Candidate predictors age, sex and cardiovascular disease had no missing data. Missing values in the baseline measurements of CRP, BMI and oxygen saturation level were not imputed, as these variables were not used in the predictive modelling.

Statistical analyses
Baseline characteristics were summarised using descriptive statistics, with categorical variables as numbers with percentages and continuous variables as means with standard deviations or medians with interquartile ranges (IQR). A multivariable logistic regression modelling approach was used to explore the predictive value of cardiovascular disease and/or diabetes, beyond age and sex, on COVID-19 prognosis. To this end, all included patients were entered in a fixed model with the predictors i) age, ii) sex, iii) the interaction between age and sex, and iv) the categorical number of cardiovascular diseases and/or diabetes coded as dummy variables (with 'no cardiovascular disease' as reference category); i.e. the full model. Next, a second model, i.e. the simple model, was fitted using only the predictors i) age, ii) sex, and iii) the interaction between age and sex. In both models, age was treated as a continuous variable and modelled using a restricted cubic spline function with 4 knots at the 0.05, 0.35, 0.65 and 0.95 quantiles to account for possible non-linearity [16]. The incremental prognostic value of the number of cardiovascular diseases and/or diabetes was assessed by comparing the c-statistics (ΔAUC) and Cox-Snell R² (ΔR²cs) of the full and simple models, and by a likelihood ratio test (alpha of 0.05 for significance). The models were internally validated using Harrell's bootstrapping with 100 repetitions to obtain optimism-corrected estimates of the c-statistic; R² and slope were also calculated. For the temporal external validation, calibration and discrimination were evaluated: observed and predicted events were calculated and depicted in calibration plots, and for discrimination, areas under the curve (AUC/c-statistic) were calculated. Other performance measures calculated for the temporal external validation were: calibration slope, calibration intercept, calibration-in-the-large, R²cs, and Brier score. 
Brier scores assess the overall goodness of fit of models, with smaller numbers indicating better performance. Confidence intervals for c-statistics were obtained using the DeLong method. For R²cs and Brier score confidence intervals, bootstrapping with 1,000 repetitions was used. Validation was done in the whole validation dataset as well as separately in the JGPN, ANH VUmc, and AHA AMC validation cohorts. All statistical analyses were performed in R version 4.0.3 with the base, rms, pROC, DescTools, and rmda packages [17][18][19][20][21].
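To make two of these performance measures concrete, the sketch below computes a c-statistic and a Brier score from observed outcomes and predicted risks in plain Python. It is an illustration only: the analyses in this study were run in R with the packages listed above, the function names are hypothetical, and the five-patient dataset is invented purely for demonstration.

```python
def c_statistic(outcomes, predicted_risks):
    """Proportion of (event, non-event) pairs in which the patient with the
    event received the higher predicted risk; tied predictions count as 0.5."""
    event_risks = [p for y, p in zip(outcomes, predicted_risks) if y == 1]
    non_event_risks = [p for y, p in zip(outcomes, predicted_risks) if y == 0]
    if not event_risks or not non_event_risks:
        raise ValueError("need at least one event and one non-event")
    concordance = 0.0
    for risk_event in event_risks:
        for risk_non_event in non_event_risks:
            if risk_event > risk_non_event:
                concordance += 1.0
            elif risk_event == risk_non_event:
                concordance += 0.5
    return concordance / (len(event_risks) * len(non_event_risks))


def brier_score(outcomes, predicted_risks):
    """Mean squared difference between predicted risk and observed outcome."""
    squared_errors = [(p - y) ** 2 for y, p in zip(outcomes, predicted_risks)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Invented toy data: 1 = referred to hospital, 0 = not referred.
    outcomes = [0, 0, 1, 0, 1]
    predicted_risks = [0.02, 0.05, 0.10, 0.10, 0.30]
    print(round(c_statistic(outcomes, predicted_risks), 3))
    print(round(brier_score(outcomes, predicted_risks), 3))
```

The pairwise definition is quadratic in the number of patients, which is acceptable for a demonstration but slower than rank-based implementations on cohorts of the size analysed here.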

Ethics
This research was conducted in accordance with Dutch law and the European Union General Data Protection Regulation and according to the principles of the Declaration of Helsinki. The need for formal ethical review was waived by the local medical research ethics committee of the University Medical Center Utrecht, the Netherlands, as the research did not require direct patient or physician involvement. The JGPN, ANH VUmc and AHA AMC databases may be used for scientific purposes and contain pseudonymised routine care data from the EMRs of all patients of the participating general practices, except those patients who objected to this. Anonymised datasets were extracted from these databases by the respective data managers for the purpose of this research.

Patient characteristics
Patient characteristics of the (clinically suspected and confirmed COVID-19) development cohort are described in Table 1. There were 5,475 patients included in this cohort: 2,825 from JGPN and 2,650 from ANH VUmc. In ANH VUmc, 71.5% were coded as R74, 10.7% as R81, and 19.2% as R83. Differences in patient characteristics between the two datasets in the development cohort were minor. Around a quarter of patients had one or more cardiovascular diseases, most often type 2 diabetes and coronary artery disease.
Patient characteristics of the (confirmed COVID-19) validation cohort are described in Table 2. Of the total of 16,693 patients in the validation cohort, 5,420 originated from JGPN, 4,989 from ANH VUmc, and 6,284 from AHA AMC. The patient characteristics in these three datasets were very similar. Around 15-20% had one or more cardiovascular diseases, again most often type 2 diabetes and coronary artery disease.

Model development and internal validation
All 5,475 patients in the development cohort were used for model development. A total of 373 patients (6.8%) had the outcome hospital referral. All predefined regression coefficients of the full and simple models, with confidence intervals, are shown in Table 3. The apparent c-statistic of the full model was 0.693 (95%CI 0.665-0.721) and the internally validated c-statistic was 0.688 (95%CI 0.660-0.716). The apparent c-statistic of the simple model was 0.681 (95%CI 0.653-0.710) and the internally validated c-statistic was 0.680 (95%CI 0.652-0.708). The full and the simple model are compared in Table 4. The full model performed significantly better than the simple model (likelihood ratio test: χ² = 19.5, df = 2, p<0.001). Fig 1 gives a visual representation of the full model, showing the predicted risks of hospital referral as a function of (increasing) age, stratified by sex and by the number of cardiovascular diseases and/or diabetes. Overall, risks are higher for male patients and increase with age. Furthermore, a higher risk is observed in patients with underlying cardiovascular disease.
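The reported likelihood ratio test can be checked by hand. The full model adds two dummy variables to the simple model, hence two degrees of freedom, and for a chi-square distribution with two degrees of freedom the upper tail probability has the closed form exp(−x/2). The Python sketch below applies this to the reported statistic; the function name is illustrative, and the closed form holds only for df = 2.

```python
import math


def chi_square_upper_tail_df2(statistic):
    """Upper tail probability of a chi-square distribution with exactly
    2 degrees of freedom, for which P(X > x) = exp(-x / 2)."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.exp(-statistic / 2.0)


if __name__ == "__main__":
    # Likelihood ratio statistic reported for the full versus simple model.
    p_value = chi_square_upper_tail_df2(19.5)
    print(f"{p_value:.1e}")
```

This gives a p-value of roughly 6 × 10⁻⁵, consistent with the reported p<0.001.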

Temporal external validation
Predicted risks were overall slightly higher than the observed risk (6.2% versus 4.6%) and the calibration slope was 1.36. Overall discrimination showed an AUC of 0.747 (95%CI 0.729-0.764). Performance measures based on the full validation cohort and stratified by database are shown in Table 5. The overall calibration plot and the calibration plots per database separately are shown in S1 Fig. The hospital referral prevalence was lower in the validation datasets than in the development datasets (4.7% versus 6.8%).

Discussion
Cardiovascular vulnerability is a predictor of hospital referral in a population of 5,475 consecutive adult patients in primary care with confirmed or clinically suspected COVID-19 in the 'first wave' of infections in the Netherlands. This finding was confirmed by temporal validation in a population of 16,693 consecutive confirmed COVID-19 adult primary care patients in the 'second wave', exemplifying the robustness of our inferences. On average, in the combined data from the first and second wave (n = 22,168 confirmed and clinically suspected primary care COVID-19 patients), 5.1% were referred to the hospital for consideration of admission. A model including the number of cardiovascular conditions and/or diabetes (0, 1, or ≥2) in addition to age, sex and the interaction between age and sex showed moderate to good performance and demonstrated consistent and good discrimination and calibration upon temporal external validation. The model showed a c-statistic of 0.747 (95%CI 0.729-0.764). Although most (vaccinated) COVID-19 patients experience a favourable prognosis without the need for referral for hospital care, studies on COVID-19 are mainly focussed on those seen in the hospital setting. While on average the overall risk of hospital referral in this adult primary care cohort with COVID-19 was low (5.1%), it is much higher than the hospitalisation rate for other lower respiratory infections in primary care, which is estimated at approximately 1% of the adult population affected [22]. In our study, age, sex and the number of concurrent cardiovascular conditions and/or diabetes identified patients at far greater risk of hospital referral. In fact, for female patients without cardiovascular diseases or diabetes, the risk of hospital referral was well below 10% even in the oldest old (aged 80+). In contrast, in the presence of cardiovascular diseases and/or diabetes, patients experience higher risks already at younger ages, notably males. 
For instance, a male patient with two or more underlying cardiovascular diseases and/or diabetes had a predicted risk of 15% already at the age of around 57 years, and this predicted risk increases further to above 20% from the age of 80 onwards. This indicates the incremental effect of cardiovascular diseases and/or diabetes in

Comparison with existing literature
Our findings overall confirm those from previous studies done in the hospital setting, where age and male sex are important predictors of disease progression towards the endpoints ICU admission or death [1][23][24][25]. Social, behavioural, comorbidity and biological differences (ACE2 expression, sex hormones, X-chromosome exposure) between the male and female sexes all might contribute to the higher risks of COVID-19 progression observed in males, although probably not all mechanisms have been fully elucidated yet [26,27]. Also, in hospitalised patients, an association has been demonstrated between cardiovascular disease and complicated COVID-19 disease trajectories, with a higher prevalence of cardiovascular disease and diabetes described in those with critical illness [5][6][7][8][9][28]. Our study shows that this prognostically unfavourable effect is already present much earlier in the COVID-19 disease course, at the start of symptoms in primary care. This is in line with previous research, where this additive effect of (cardiovascular) comorbidities was also described by the 4C Mortality Score [29]. In that study, the authors demonstrated that the number of comorbidities, importantly including cardiovascular comorbidities, was more predictive of in-hospital mortality in COVID-19 patients than individual comorbidities considered separately [29]. Furthermore, two large community-based prediction studies also highlight the importance of cardiovascular comorbidities as a predictor in the community COVID-19 population. The QCOVID model, which was developed in the UK and recently validated in a vaccinated population, was based on data from primary care and showed a c-statistic >0.9 for the primary outcome, time to death from COVID-19. 
The domain of that study, however, covered the whole general population regardless of COVID-19 diagnosis, and its predictions can therefore best be interpreted as the risk of getting infected with COVID-19 and subsequently having complications from COVID-19. Thus, the aim of this model was to inform UK health policy and support interventions to manage COVID-19 related risks, rather than to inform medical decision making during patient consultations in confirmed or clinically suspected COVID-19 cases [10,30]. With only 0.07% experiencing the outcome death, and thus a very low a priori risk, the c-statistic 'misleadingly' moves towards 1.0. Another similar public health based UK study in patients with and without COVID-19 identified determinants that were associated with COVID-19 related death in the OpenSAFELY primary care database by linking primary care records to reported COVID-19 related deaths. It found the most predictive clinical determinants to be increasing age, male sex, type 2 diabetes mellitus, and cardiovascular disease, similar to our findings [31]. While the domain notably differs between patients seeking primary care for COVID-19 symptoms in our study and the adult community as a whole in these studies from the UK, all draw similar conclusions on the increased risk of clinical deterioration in patients with (multiple) cardiovascular diseases and/or diabetes.

Strengths and limitations
This research contributes to the evidence-based prognostication of community COVID-19. We were able to use routine primary care databases capturing both the 'first' and 'second' wave of COVID-19 infections in the Netherlands. We used state-of-the-art methodology including external temporal validation to predict clinical deterioration in a patient population that is currently understudied. The developed statistical model is not intended to be used as a clinical prediction algorithm in primary care. Rather, the model served as a tool to explore and quantify the predictive value of cardiovascular disease and/or diabetes in the primary care COVID-19 domain. For full appreciation of our findings, however, some limitations also need to be addressed. First, the model was developed in a dataset with a low event fraction of the outcome hospital referral. Yet the number of hospital referral events did allow us to perform robust multivariable regression techniques. Second, there are limitations to using routine care registry data that could have resulted in misclassification of the study population, predictors and outcome and, most importantly, carry the risk of missing values. For example, uncertainty concerning COVID-19 infection status may exist (primarily in the first wave), as COVID-19 PCR test results were not automatically linked to the primary care electronic medical records. However, the model proved its transportability in primary care patients in a different time period with satisfactory calibration and discrimination, during a time window in which PCR testing was widely performed. Furthermore, the outcome hospital referral was based upon a rigorous manual extraction of medical records by pairs of researchers, although actual hospital admission was not formally confirmed by linkage to hospital records. 
Additionally, there are differences between our development and validation populations: the patients from the 'first wave' are all symptomatic patients who visited their primary care physician for symptoms suggestive of COVID-19, while the patients from the 'second wave', owing to the government recommendation to get tested even with only mild symptoms, also include healthier people who merely informed their primary care physician of their positive COVID-19 PCR status. This could also explain the lower event fraction in the validation set (4.7% versus 6.8% in the development population). Furthermore, the model still has to show its robustness in the COVID-19 vaccinated population, although it is likely that existing risk factors will still be present even if the risk of complications is lowered due to vaccination [10]. Finally, the incremental value of the number of cardiovascular diseases and diabetes in prognosticating COVID-19 was assessed in different ways; although the likelihood ratio test was highly significant, the delta in c-statistic and R²cs was only small to modest. Possible reasons for this include the overall low risk of hospital referral in most patients in our cohort, as well as the fact that most patients in our cohort (80.2%) did not suffer from cardiovascular diseases and/or diabetes. It has been widely acknowledged that, notably in such scenarios, a change in e.g. the c-statistic is difficult to achieve.

Clinical implications
The ready availability of the chosen primary care predictors and their clinical applicability may provide great advantages for risk profiling of patients with suspected or confirmed COVID-19 in the primary care and community setting. This can have several important clinical and public health implications. First, it may be possible to identify patients who will benefit from closer monitoring and frequent follow-up at home by predicting the risk of clinical deterioration early in the COVID-19 disease course. By intensified monitoring of higher risk patients, critical illness may be detected earlier, potentially improving prognosis. Second, risk prediction could also support advance care planning. Informing both patients and physicians about the risk of severe illness may help in anticipating a more stringent or more lenient management. Last, risk profiling may be used for targeting preventive measures. Additionally, experimental regimens to treat symptomatic COVID-19 may be directed at the high-risk patients who may benefit most. Examples include treatment with budesonide, colchicine or novel virus inhibitors; such treatment options likely benefit most those patients with a higher prior probability of an adverse prognosis [3,4,32]. Nevertheless, in the end, risk prediction in primary care has to prove its value in daily practice against the background of the changing characteristics of this challenging COVID-19 pandemic and the influences of vaccination and virus mutations. We do hope, however, that prognostic studies like ours may aid physicians and policy makers in making informed, evidence-based decisions and thereby improve patient outcomes.

Conclusion
In this primary care population-based study, risk of clinical deterioration leading to hospital referral after suspected or confirmed COVID-19 was on average 5.1%. This risk increased with age and was higher in males compared to females. Importantly, patients with concurrent cardiovascular disease and/or diabetes had higher predicted risks and therefore, cardiovascular disease is a predictor of clinical deterioration in the primary care COVID-19 domain. Identifying those at risk for hospital referral could have clinical implications for COVID-19 early disease management in primary care.

Supporting information
S1 Fig. Calibration plots in individual databases. Fig a. Calibration plot in the total validation cohort with hospitalisation as the outcome.