Characteristics of patients in platform C19, a COVID-19 research database combining primary care electronic health record and patient reported information

Background Data to better understand and manage the COVID-19 pandemic is urgently needed. However, there are gaps in information stored within even the best routinely-collected electronic health records (EHR) including test results, remote consultations for suspected COVID-19, shielding, physical activity, mental health, and undiagnosed or untested COVID-19 patients. Observational and Pragmatic Research Institute (OPRI) Singapore and Optimum Patient Care (OPC) UK established Platform C19, a research database combining EHR data and bespoke patient questionnaire. We describe the demographics, clinical characteristics, patient behavior, and impact of the COVID-19 pandemic using data within Platform C19. Methods EHR data from Platform C19 were extracted from 14 practices across UK participating in the OPC COVID-19 Quality Improvement program on a continuous, monthly basis. Starting 7th August 2020, consenting patients aged 18–85 years were invited in waves to fill an online questionnaire. Descriptive statistics were summarized using all data available up to 22nd January 2021. Findings From 129,978 invitees, 31,033 responded. Respondents were predominantly female (59.6%), white (93.5%), and current or ex-smokers (52.6%). Testing for COVID-19 was received by 23.8% of respondents, of which 7.9% received positive results. COVID-19 symptoms lasted ≥4 weeks in 19.5% of COVID-19 positive respondents. Up to 39% respondents reported a negative impact on questions regarding their mental health. Most (67%-76%) respondents with asthma, Chronic Obstructive Pulmonary Disease (COPD), diabetes, heart, or kidney disease reported no change in the condition of their diseases. Interpretation Platform C19 will enable research on key questions relating to COVID-19 pandemic not possible using EHR data alone.


Introduction
The COVID-19 pandemic added further burden on already stretched healthcare systems, including the routine management of patients with chronic diseases [1][2][3][4]. Primary care practitioners are especially affected as the pandemic limits their access to information regarding the health status of their patients (1).
Due to the pandemic emergency arrangements, important information gaps and inconsistencies emerged within electronic health records (EHR). Such gaps include the initial lack of codes for tests, diagnosis, and symptoms test results, remote consultations with NHS111 telephone service (the UK non-emergency medical care hotline), hospital consultations, as well as self-treatment or home testing. To understand the COVID-19-vulnerable population and clinical outcomes, key information is required on co-morbidities, medication and patient behavior such as shielding, isolation, and health-seeking behavior, information on mental health, and status of pre-existing chronic diseases.
In response to the pandemic, Observational and Pragmatic Research Institute (OPRI) Singapore and OPC (Optimum Patient Care) UK established the COVID-19 Quality Improvement (QI) program (https://optimumpatientcare.org/covid-qi/). QI programs refer to systematic sets of actions conducted to improve the quality of patient care [5,6]. The COVID-19 QI program aims to assist primary care in assessing the incidence and impact of COVID-19 and to identify patients at-risk for COVID-19 or with suggestive symptoms who are in need of help [7]. The COVID-19 QI program also aims to increase our understanding of the pandemic such as the risk predictors of long COVID [8], the impact of the pandemic on other chronic diseases, and other potential impacts of the pandemic not yet known due to reduced healthcare consultation during the pandemic. These will be achieved by the generation of patient and practice level quality improvement reports containing key patient information combining deidentified electronic health data and patient questionnaire responses to participating practices.
Using data collected as part of this QI program, OPC established Platform C19 (https:// opcrd.co.uk/platform-c19-new-2/) on September 23, 2020. Platform C19 is a novel primary care, patient-centered research platform supported by a multi-organizational research collaboration. The platform incorporates both EHR data and patient self-reported questionnaire results thereby providing a unique and comprehensive data source to facilitate multifaceted research into the COVID-19 pandemic [9]. The current article describes the development of the platform, the data collected within the platform C19, and summarizes the demographics, clinical characteristics, and behavioral patterns of the patients from the first wave of the pandemic. The impact of the pandemic on selfreported physical and social behavior, mental health, and status of chronic diseases is also described.

Design
This paper describes an observational, descriptive analysis of data stored in Platform C19 collected from 7 th August 2020 up till 22 nd January 2021. The analysis includes both EHR data and patient-reported information collected via questionnaire contained within Platform C19.

Development of the platform and data collection
The OPC possesses an established track record and technical expertise in maintaining large scale research databases. We maintain the Optimum Patient Care Research Database (OPCRD), an anonymized, quality-controlled, longitudinal primary care database integrating records for more than 12 million patients from over 800 GP practices across the UK supplemented with linked patient reported outcome data from over 70 thousand patients (https:// opcrd.co.uk/our-database/). Bringing our expertise, we established Platform C19 deriving EHR data from the OPCRD. The OPCRD has received NHS research ethics committee (REC) approval to provide anonymized data for scientific and medical research since 2010, with its most recent approval in 2015 (NHS HRA REC ref: 15/EM/0150). OPCRD is governed by the Anonymized Data Ethics and Protocols Transparency (ADEPT) committee (ADEPT1720), and the protocol of Platform C19 was approved by the ADEPT committee on 18th November 2020. The OPCRD routinely receives de-identified EHR data from GPs participating in OPC QI programs.
Access to the deidentified patient data used in this study may be requested via the OPCRD website (https://opcrd.co.uk/platform-c19-new-2/) or via the enquiries email info@opcrd.co. uk.
All patients aged 18-85 years old at the start of the pandemic (1 st March 2020) registered within practices participating in the COVID-19 QI program were invited in waves to complete an online COVID-19 questionnaire via text-message from their practice starting from 7 th August 2020. The questionnaire was designed by an expert consensus from the Platform C19 steering committee. The questionnaire covers the patient demographics, COVID-19 symptoms, diagnosis, and testing, pandemic related behavior, mental health, chronic health conditions, and detailed questions relating to asthma and COPD. The current version of the questionnaire is provided in the supplementary material. Every patient has a unique link that is attached to the electronic medical record and patient ID. This unique patient link can only be unlocked within the originating practice site. The questionnaire is sent to the patients via the medical record system of the participating practices. Responses are automatically recorded and linked to the patient ID. Data is securely stored in an enterprise database running SQL-Server (Windows Server 2019 Standard (10.0) Version 15.0.2080.9). The list of variables collected within the platform and the sources are available in the online supplementary material (S1 Table).

PLOS ONE
To prepare data for analysis, questionnaire responses and relevant electronic medical records were collated, cleaned, and then summarized. If a patient had responded to more than one round of questionnaires, only the most recent response was used to include the most updated patient status into the analysis. This analysis includes patients who had responded to the COVID-19 questionnaire. Patients who opt-out from data-sharing for research [10] are excluded.

Study variables
De-identified electronic health record (EHR) data and patient-reported COVID-19 questionnaire results were used in this study. Variables derived from the EHR include sex, age, frailty, and smoking status. Variables supplemented from patient-reported data include ethnicity, Body Mass Index (BMI), COVID-19 status, patient behavior, mental health status, and status of underlying chronic diseases. Asthma and COPD exacerbation were defined as requiring a course of at least 3 days of oral corticosteroid due to worsening of the condition. Respondents were considered to have had long COVID if they report symptoms lasting at least 4 weeks [11][12][13][14].

Statistical analysis
Descriptive statistics were computed for all demographic and clinical variables using all available patient data. Categorical variables were presented as numbers and percentages. Percentages are presented according to the number of patients who answered the questions relevant to each specific variable.

Response summary
A total of 129,978 questionnaire invitations were sent to all eligible patients from 14 participating general practices. Of these, 31,033 (23.9%) responded to the questionnaire (Fig 1). The average time between responses and the start of the pandemic (1 st March 2020) was 31.8 weeks. The average Index of Multiple Deprivation of participating practices was close to the national average with slight skew to the deprived (13231.38) and good coverage of all regions across England.

Patient demographics and characteristics
The demographics and characteristics of the respondents are summarized in Table 1
Most patients with asthma and/or COPD reported not having experienced any exacerbation in the past 12 months (3,162/4,185; 75.6%). Only a small proportion reported having been hospitalized (175/4,156; 4.2%) or treated in an emergency department (272/4,186; 6.5%) due to the worsening of their conditions (Fig 5).

Summary of findings
In the current article, we describe the patient characteristics available within Platform C19. Platform C19 is a large population-based platform linking patient reported data linked to

PLOS ONE
Patient characteristics in platform C19, a COVID-19 research database electronic medical records. The platform combines OPCRD's unique access to anonymized EHR data with patient-reported information, providing additional insight on data that are not usually collected during routine clinical visits. These include data on patient behavior, potentially missed COVID-19 patients, changes in physical activity, and mental health. The platform

PLOS ONE
thus provides a unique data source to answer important research questions on the COVID-19 pandemic which cannot be answered using only routine primary care EHR data.
Only a quarter of the respondents reported having been tested for COVID-19. The proportion of respondents who reported positive test results (8%) was observed to be slightly higher than the UK national data as of 22 nd January 2021 (56%, https://coronavirus.data.gov.uk/). Additionally, more than half reported not having sought medical attention for their COVID-19 infection or symptoms, This might have stemmed from patient reluctance to seek medical attention when they have only mild complaints. These imply that some COVID-19 cases will not be identified in EHRs.
Respondents mostly reported decreases in the frequency of interaction with family or friends and the frequency of leaving their houses since the pandemic. However, 42% and 24% of respondents reported no change or an increase in physical activities respectively despite the pandemic. This is consistent with a previous study by Spence et al that reported the majority (57%) of 1,521 adults in the UK responded to have either maintained or increased their physical activities level during the lockdown [15]. Despite this, the authors also report that the proportion of respondents who did not meet guideline recommended levels of physical activity increased during the pandemic.
Studies have consistently reported the impact of the pandemic on population mental health [16][17][18]. This is similarly observed among our respondents as around one-third reported some degree of negative impact in the questions regarding mental health. This observation implies a major impact of the pandemic which reaches beyond physical disturbances and highlights a need for public health policies to assist the population especially the vulnerable [19].
The pandemic disrupted the delivery of routine primary care including patients with chronic diseases [3]. However, the majority of the respondents reported no change to the status of their chronic disorders during the pandemic. Most respondents with asthma and/or COPD also reported that their symptoms had been well-controlled in the past year. These findings suggest that the majority of patients with chronic diseases may not have required face-to-face consultation with their general practitioners during the first wave of the pandemic and suggests that use of remote methods such as text message generated questionnaires or telephone reviews could support monitoring of chronic diseases including asthma and COPD.

Further studies using data in the C19 Platform
Platform C19 is an open platform designed to facilitate collaboration and sharing of research data and resources. OPC invites collaborators to conduct crucial research on COVID-19 using the breadth of data stored in the platform.
Potential research includes analysis of the gaps in EHR recorded COVID-19 compared to the self-suspected COVID-19. Analyses of patients with self-suspected COVID-19 infection are hoped to shed information on this patient group.
Patients with chronic diseases may choose to shield during this period, which may then negatively impact their activity and quality of life [20]. Previous studies have also reported reduction in asthma and COPD related health visits since the start of the pandemic in the UK [21][22][23]. Whether this is due to real reductions in asthma and COPD exacerbations or due to changes in health-seeking behavior warrants further investigations. Future studies using the platform will therefore analyze the impact of shielding and other patient behavioral patterns during the pandemic towards the outcome of pre-existing chronic diseases such as asthma and COPD. Conversely, further studies will also investigate the influence of pre-existing chronic diseases on patient behavior and COVID-19 infection risk and outcome. The impact of the lockdown on the well-being of at-risk patients such as the frail, elderly, and those with multiple co-morbidities also needs to be elucidated.
Currently, there are gaps in our knowledge regarding symptoms of COVID-19 which persists after the infection, a phenomenon termed" long COVID" [24][25][26]. A deeper study into the subgroup of patients with long COVID identified within Platform C19 is therefore warranted to identify the potential risk factors for long COVID. This will include investigation into whether patients who self-suspected and self-managed their COVID-19 symptoms subsequently develop long COVID.
Platform C19 aims to continue data collection in the next 12 months and we expect a doubling in the number of data collected by April 2021. As the pandemic evolves, continued data collection will allow long-term follow up of patients with COVID-19. This will enable research on the impacts of new variants of the virus and the effectiveness of vaccination and other interventions including lockdowns and self-isolation.
Future rounds of questionnaire may aim to collect additional information on lockdown deconditioning (weakening of muscles due to decrease in physical activities during the lockdown), healthcare avoidance, and the impact of vaccination of their physical and mental health while subsequently increasing the representation of the Black, Asian, and minority ethnic patients within the platform. Such information will be valuable to provide recommendations for public health practitioners during a global pandemic. Platform C19 will also aim for the opportunity to link its data with secondary care data.

Conclusions
Platform C19 is a unique research database which supplements routine EHR data with patient-reported information and outcome such as isolation behavior, mental health, and health-seeking behavior. This dataset enables researchers to answer key research questions on the pandemic which cannot be answered from routinely collected EHR data alone. We currently observe that majority of patients with chronic disease had their symptoms controlled.
Respondents also reported some degree of impact on their activity and mental health due to the pandemic. Future studies conducted using Platform C19 will seek to better our understanding of the relationship between co-morbid chronic diseases, patient behavior, and COVID-19 outcome as the pandemic evolves.