Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Unveiling the hidden burden of COVID-19 in Brazil’s obstetric population with severe acute respiratory syndrome: A machine learning model

  • Veridiana Freire Franco ,

    Roles Conceptualization, Writing – original draft, Writing – review & editing

    veridiana.franco@hc.fm.usp.br

    Affiliation Departamento de Obstetricia e Ginecologia da Faculdade de Medicina da Universidade de Sao Paulo, Sao Paulo, Brazil

  • Tatiana Assunção Zaccara,

    Roles Writing – original draft, Writing – review & editing

    Affiliation Divisão de Clínica Obstétrica do Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo, Sao Paulo, Brazil

  • Ornella Scardua Ferreira,

    Roles Data curation, Writing – review & editing

    Affiliation Departamento de Estatistica da Universidade Federal do Espirito Santo, Vitoria, Espírito Santo, Brazil

  • Rafaela Alkmin da Costa,

    Roles Conceptualization, Writing – review & editing

    Affiliation Departamento de Obstetricia e Ginecologia da Faculdade de Medicina da Universidade de Sao Paulo, Sao Paulo, Brazil

  • Agatha Sacramento Rodrigues,

    Roles Conceptualization, Data curation, Investigation, Project administration, Supervision, Writing – review & editing

    Affiliations Divisão de Clínica Obstétrica do Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo, Sao Paulo, Brazil, Departamento de Estatistica da Universidade Federal do Espirito Santo, Vitoria, Espírito Santo, Brazil

  • Rossana Pulcinelli Francisco

    Roles Conceptualization, Project administration, Supervision, Writing – review & editing

    Affiliation Departamento de Obstetricia e Ginecologia da Faculdade de Medicina da Universidade de Sao Paulo, Sao Paulo, Brazil

Abstract

Objective

To predict the actual number of COVID-19 cases in Brazilian pregnant and postpartum women diagnosed with Severe Acute Respiratory Syndrome using a predictive model created based on data from Brazilian database.

Methods

This is a cross-sectional study with pregnant and postpartum women diagnosed with Severe Acute Respiratory Syndrome (SARS) between January 2016 and November 2021 in Brazil. Patients were divided into two groups (COVID-19 and non-COVID-19) for comparative analysis, and a predictive model was constructed to classify cases without a defined causative agent.

Main outcome measures

Estimated number of COVID cases in obstetric patients with SARS and no defined agent.

Results

38,774 pregnant and postpartum women diagnosed with SARS were identified and categorized by date and causative agent. Women in the COVID-19 group (19.138) were older (29.86 ± 7.20 years), self-reported more frequently as non-white race (50.9%), and more often had educational status marked as blank or ignored (29.6% and 26.9%, respectively) compared to the Other confirmed agents’ group (2.233). The groups differed in all presented variables, and patients in the COVID-19 group were diagnosed more often in the third trimester of pregnancy or in the postpartum period. Using the XGBoost model, 13,978 cases of SARS with undefined etiology from 2020 and 2021 were reclassified: 13,799 (98.7%) as predicted COVID-19 and 179 (1.3%) as predicted non-COVID-19.

Conclusions

The number of COVID-19 cases and deaths in the obstetric population were even higher than reported by authorities, indicating a significant impact on the maternal mortality ratio during this period.

Introduction

The COVID-19 pandemic highlighted the precarious state of Brazil’s healthcare system, revealing difficulties in accessing care and inequalities in services across the different regions [1]. The Coronavirus Disease 2019 (COVID-19) is a syndrome caused by the new coronavirus (SARS-CoV-2, Severe Acute Respiratory Syndrome Coronavirus 2), an RNA virus (Ribonucleic Acid) that has been known since the 1960s and which classically affects the human respiratory tract [2] The outbreak began in the city of Wuhan, China, at the end of December 2019, and it is suspected that the first transmissions occurred from animals to humans, before transmission between humans.

On March 11, 2020, COVID-19 was characterized by the WHO as a pandemic, and, as of early March 2023, more than 600 million cases of COVID-19 and more than 6 million deaths have been confirmed worldwide [3]. Transmission occurs through direct contact (within one meter) and/or indirect contact (via fomites) with infected individuals, facilitated by saliva and respiratory secretions.

The clinical manifestation of the disease can be diverse, ranging from asymptomatic forms to a broad spectrum of respiratory or gastrointestinal symptoms, and can reach severe and potentially fatal conditions. The most frequently reported symptoms are fever, cough, fatigue, and shortness of breath, which are similar to the symptoms of other viral respiratory infections, such as the seasonal flu, and tend to appear more commonly between the fourth and fifth day of exposure. However, studies show that the incubation period of the virus can last up to 14 days. [4] Anosmia (loss of smell) and ageusia (loss of taste) have also been frequently reported symptoms in COVID-19 patients and appear to precede other symptoms.

The overall mortality of COVID-19 is variable, and can be less than 5% in patients under the age of 40 and as high as 60% in those aged between 80 and 89. Another critical factor influencing the hospital mortality rate is whether an ICU admission is required [5]. It is also important to note that since not all the people who died during the pandemic were tested for the presence of SARS-CoV-2, the actual number of deaths from COVID-19 may be much higher than reported.

The elderly, immunosuppressed patients, and those with comorbidities (systemic arterial hypertension, diabetes mellitus, and obesity) are considered to be at risk of the most severe forms of the disease. Initially, pregnant and puerperal women were not considered to be at high risk of the severe form of the disease [6]. Although the presence of severe complications associated with COVID-19 in young women was low, studies have shown that pregnant and puerperal women had a significantly higher chance of ICU admission, the need for invasive ventilation, the use of ECMO (Extracorporeal Membrane Oxygenation) and death when compared to women in the same age group who were not pregnant [7,8].

The lack of diagnostic tests and access to the hospital, especially at the beginning of the pandemic, resulted in a high rate of underreporting of COVID-19 cases and related deaths [9]. The quality of mandatory disease notification is a global issue, leading to difficulties in identifying the epidemiological reality of a given geographic area. This can hinder the planning and implementation of collective control measures [10].

The underreporting of COVID-19 deaths was a worldwide problem. The estimated notification rate (number of confirmed cases reported by the Ministry of Health (MoH) divided by the number of expected cases estimated from the number of deaths) in Brazil, up to April 2020, was 9.2% [11], suggesting that less than 10% of COVID-19 cases were diagnosed and that the notification of confirmed cases was far below other countries such as the USA, Spain or Italy, whose notification rates are higher than 15%, or South Korea and Germany, where more than 50% of cases were diagnosed and notified [12]. A study published in March 2022 indicated that COVID-19 deaths could be three times higher than the official reports. In 2020, in Brazil, the underestimated death rate caused by the coronavirus reached 18% [13].

At the onset of the pandemic, the majority of COVID-19 cases in pregnant and postpartum women were classified as mild or moderate, with only a small proportion requiring intensive care or mechanical ventilation [1416]. However, the literature has shown that the obstetric group had an increased risk of serious complications, such as the need for ICU admission, the use of mechanical ventilation, and death when compared to the group of non-pregnant women [15,1719].

It is well known that pregnant women are more susceptible to severe respiratory infections. In an epidemiological bulletin released by the Ministry of Health in May 2020, 252 pregnant women had SARS, of which 36 died, with COVID-19 as the confirmed cause of death [20], making the country the first in maternal deaths caused by infection with the new coronavirus [11]. Four months after the first case of COVID-19 infection in the country was reported, the number of maternal deaths reached 124 (12.7% of the hospitalized obstetric population) [21]. According to the Brazilian Obstetric Observatory COVID-19 (OOBr COVID-19), maternal deaths in the first five months of 2021 (a weekly average of 47.9 deaths) exceeded the number reported in 2020 (a weekly average of 12.1 deaths) [22].

In addition to the maternal immune system’s reduced effectiveness due to tolerance to antigens expressed by fetal cells [23], the greater vascularization of the nasopharyngeal epithelium, which is swollen with increased levels of progesterone, can also facilitate the pathogen’s entry. In addition, other adaptive functional and anatomical changes can contribute to making pregnant women more intolerant of hypoxia, such as [15] elevation of the diaphragm by the gravid uterus, causing a decrease in functional residual capacity of around 10–25%; increased respiratory effort caused by the stimulating effect of progesterone; increased microvascular pressure and pulmonary capillary permeability, increasing the risk of developing SARI; reduced vascular resistance and plasma colloid pressure, increasing the risk of acute pulmonary edema.

Analyzing the sociodemographic risk factors associated with adverse outcomes among obstetric cases, it has been shown that advanced maternal age (between 35–44 years) is a risk factor in this group, with pregnant women in this age group almost four times more likely to need OI and twice as likely to die when compared to non-pregnant women [7].

Variables related to social vulnerability, such as living in a peri-urban area, increased the risk of an adverse COVID-19 outcome more than threefold. The risk of death and worse clinical conditions at the time of hospital admission was higher in black pregnant and postpartum women [24]. Existing barriers, such as the difficulty in accessing prenatal and postpartum care in Brazil, have been further exacerbated during the pandemic.

Regarding the time of pregnancy (non-obstetric, pregnant, and postpartum), postpartum was associated with a higher risk of ICU admission, need for mechanical ventilation, and death when compared to pregnant women and the non-obstetric population [25].

Among the clinical characteristics that can be considered risk factors for a worse outcome among pregnant and postpartum women with COVID-19, obesity and gestational diabetes were the most commonly reported comorbidities, with obesity presenting a risk of death twice as high among infected pregnant women when compared to those who survived [26].

Obstetric adverse events such as gestational diabetes, pre-eclampsia, prematurity, low birth weight, and fetal death are also reported in the literature with a greater chance of occurring in pregnant women with COVID-19, especially in those who develop the severe forms of the disease [15,2729].

Maternal mortality rates from COVID-19 in high-income countries were lower when compared to rates in low- and middle-income countries, which were up to six times higher [30].It is important to note that maternal mortality is directly linked to the care provided during pregnancy and the puerperium. The suspension of prenatal appointments and tests during the pandemic, along with limited access to diagnostic tests and inadequate care and support services, may be directly related to the scenario presented in Brazil.

In 2021, the maternal mortality ratio in Brazil reached 107.53 deaths per 100,000 live births. When compared to previous years, the ratio was 55.31 deaths per 100,000 live births in 2019 and 71.97 deaths per 100,000 live births in 2020 [31]. According to data obtained by the Brazilian Obstetric Observatory (OOBr), the total number of maternal deaths increased by 77% between 2019 and 2021 [22]. In addition to the lack of trained professionals to care for pregnant and postpartum women, the limited number of ICU beds for the obstetric population and their unequal distribution among Brazilian regions may have contributed to the increase in maternal mortality during this period.

However, even considering the initial difficulties, with the gradual availability of tests and a greater understanding of the disease, establishing a diagnosis of COVID-19, especially in the group of pregnant and postpartum women (a highly vulnerable group), has become of great importance not only for managing cases, but also as a tool for assessing the quality of care provided to this population. The predictive model of the actual number of COVID-19 cases is not available for pregnant and postpartum women [32], and the national and worldwide impact of the pandemic on this subpopulation remains unknown.

Thus, one way to estimate the actual number of cases is to utilize machine learning (ML) tools to correct for possible underreporting by reclassifying cases without a defined etiology. This tool is part of a subfield of artificial intelligence that involves building algorithms and models capable of learning from data. Instead of being explicitly programmed to perform a task, these models are trained on a dataset to identify patterns and make predictions or decisions based on new data [33]. In the health field, supervised machine learning algorithms have already been used as a diagnostic aid in various areas [3436].

This study aimed to create a predictive model based on data from Brazilian pregnant and postpartum women who had severe acute respiratory syndrome (SARS) confirmed either by COVID-19 or other etiological agents. The model was applied to the reported SARS cases in which the etiological agent was marked as “Unspecified,” “Blank,” or “Ignored” to estimate the number of cases that were likely caused by COVID-19 in this population.

Materials and methods

This is a cross-sectional analysis with a retrospective review of data obtained from the Influenza Epidemiological Surveillance Information System (SIVEP-Gripe). The SIVEP-Gripe database contains records of hospitalized cases and deaths from SARS. In Brazil, notification is mandatory for influenza-like illness, characterized by at least two of the following symptoms: fever, chills, sore throat, headache, cough, runny nose, olfactory or taste disorders along with dyspnea or respiratory discomfort, persistent chest pressure, oxygen saturation (O2) below 95% in room air, or bluish lips or face.

We selected cases of pregnant and postpartum women aged between 10 and 55 with a diagnosis of SARS according to the SIVEP-Gripe definition, with or without a defined etiological diagnosis, from 2016 to 2021. The data were extracted on April 4, 2024, from the official website of the Ministry of Health of Brazil—DATASUS [37].

This study used R software (version 4.3.3) and the RStudio integrated development environment to process the data and perform the statistical analyses. Both the data and analyses can be accessed in the GitHub repository at https://github.com/observatorioobstetrico/covid19_vs_unspec.

Predictive model for classifying cases with undefined etiological agent as COVID-19 or non-COVID-19

Before constructing the predictive model for the presence or absence of a COVID-19 diagnosis, it was necessary to understand the etiological agents causing SARS in pregnant and postpartum women. Upon analyzing cases from 2020 and 2021, few non-COVID-19 cases with defined etiological diagnoses were found. Therefore, data from previous years were included, starting from January 1, 2016, when most variables were consistent across all selected years.

In this study, the following variables were used: age (in years); ethnicity (categories: white, non-white, blank, and ignored); education level (categories: up to elementary, high school, higher education, blank, and ignored); gestational stage (categories: first trimester, second trimester, third trimester, postpartum, and ignored); influenza vaccination, fever, cough, sore throat, dyspnea, respiratory discomfort, O2 saturation below 95%, diarrhea, heart disease, chronic lung disease, chronic kidney disease, obesity (all with categories: yes, no, blank, and ignored), and the number of SARS notifications on the patient’s admission date in a health service (created from the variable related to the date of the first symptoms). Notably, outcome variables (Intensive Care Unit (ICU) requirement and ventilatory support) were not considered, as they did not add additional information to the predictive models. Variables not present in any of the periods (e.g., anosmia, unavailable before 2020) were not considered.

Two groups were established for the predictive model

  1. -COVID-19 Group: SARS cases in pregnant and postpartum women from January 2020 until November 2021, with a confirmed COVID-19 diagnosis.
  2. -Non-COVID-19 Group: SARS cases in pregnant and postpartum women from January 2016 to November 2021, with a confirmed diagnosis of another etiological agent.

A comparative analysis between the two groups was performed using t-tests (aided by Cohen’s d statistic for effect size), Pearson’s Chi-square (aided by Cramér’s V statistic to evaluate the association’s effect size), and Fisher’s Exact test when appropriate.

Modeling and sampling

Twelve models were adjusted using the following techniques: Linear Discriminant Analysis, Quadratic Discriminant Analysis, Classification Trees, Bagging, Random Forests, K-Nearest Neighbors, Logistic Regression, Lasso Logistic Regression, RBF SVM, Linear SVM, Polynomial SVM, and XGBoost [38].

The data set was divided into training and test samples: 75% of the data was used for training and 25% of the data was used to test the models’ performance and prediction on new data. From the training sample, K-Fold cross-validation, a resampling technique that randomly divides the data into k groups of approximately equal size, was used to select the best hyperparameter configurations. Under this approach, k = 10 and 3 repetitions were considered. Additionally, the SMOTE resampling technique [39] was adopted. SMOTE creates synthetic data from the minority class using the K-Nearest Neighbors method to correct a moderate imbalance between categories (the non-COVID-19 group was about 10% of the total available data). Therefore, k = 5 and an over ratio = 0.5 were used, meaning the non-COVID-19 category was increased to 50% of the COVID-19 category.

Interpretability

The best-performing model was XGBoost, a non-interpretable model (black box) [40]. This means it is impossible to directly understand the relationship between the covariates and the response variable. Thus, interpretability measures are needed to understand the reasons influencing case classifications, either COVID-19 or non-COVID-19. One such measure is SHAP (SHapley Additive exPlanations) [41], which measures the contribution of covariates in the model using all possible combinations of the presence and absence of the covariates. SHAP can explain the model’s decisions individually and globally, and in this study, its global interpretation was considered (interpretation based on the average results of all outcomes). The contribution is considered positive when the variable increases the chance of the outcome of interest occurring (in our case, diagnosing COVID-19) and negative when the variable decreases the possibility of such an outcome.

Classification of cases with undefined etiological agent

The XGBoost model was then applied to data from 2020 to 2021 for SARS cases in pregnant and postpartum women where the etiological agent was marked as “Unspecified,” “Ignored,” or “Blank.” The COVID-19 cases rate (the number of cases per 100,000 live births) and the maternal mortality ratio (the number of maternal deaths per 100,000 live births) were also estimated, considering both diagnosed and predicted COVID-19 cases.

Results and discussion

In the present study, 38,774 cases of SARS in pregnant and postpartum women between January 2016 and November 2021 were identified. These cases were classified into two groups: SARS in pregnant and postpartum women from 2016–2019 (5,132 cases, 13·2%) and SARS in pregnant and postpartum women from 2020–2021 (33,642 cases, 86·8%). Subsequently, both groups were subdivided according to the etiological agent: SARS with a defined non-COVID-19 etiological agent, SARS with a defined COVID-19 etiological agent, and SARS with an undefined etiological agent (“Unspecified,” “Ignored,” or “Blank”). A comparative analysis was then performed between the groups of SARS confirmed by COVID-19 or Other confirmed etiological agents (2016–2021) (Fig 1).

thumbnail
Fig 1. Flowchart of case selection for the predictive model’s construction.

https://doi.org/10.1371/journal.pone.0330375.g001

The groups differed in number of SARS notifications on admission’s date, with 47·6 ± 25·89 notifications in the COVID-19 group and 9·41 ± 14·36 notifications in the non-COVID-19 group (Other confirmed agents).

Regarding to demographic data, women in the COVID-19 group were older (29·86 ± 7·20 vs. 27·45 ± 7·05 years, p < 0·001), identified more frequently as non-white (50·9% vs. 43·9%, p < 0·001), and more often had their educational level listed as blank or ignored (29·6% and 26·9%, vs. 13·5% and 24·1%, respectively, p < 0·001) compared to women in the non-COVID-19 group.

All variables evaluated showed statistical differences between groups, and patients in the COVID-19 group were diagnosed more often in the third trimester of pregnancy or in the postpartum period. Table 1 shows the comparative analysis data and the effect size for each variable.

thumbnail
Table 1. Sociodemographic Characteristics of Pregnant and Postpartum Women with SARS Confirmed by COVID-19 or Other Confirmed Agents.

https://doi.org/10.1371/journal.pone.0330375.t001

Moreover, there was a significant difference between groups concerning the distribution of symptoms, the presence of comorbidities, and the completion of these fields in the SIVEP-Gripe form (Table 2). Considering only fields answered as “Yes” and “No,” the frequency of sore throat and diarrhea was higher in the COVID-19 group. In contrast, fever, cough, and respiratory discomfort were more common in the group with other confirmed agents. The frequency of dyspnea and O2 saturation < 95% were similar in both groups. Similarly, when comparing the frequency of completed responses about the presence of comorbidities, heart disease, chronic kidney disease, and obesity were also more frequent in the COVID-19 group than in the Other confirmed agents’ group. Conversely, the frequency of chronic lung disease was higher in the non-COVID-19 group.

thumbnail
Table 2. Clinical presentation and presence of comorbidities of influenza-like illnesses in pregnant and postpartum women with SARS confirmed by COVID-19 and by Other confirmed agents.

https://doi.org/10.1371/journal.pone.0330375.t002

Among the twelve models evaluated, the XGBoost model demonstrated the highest accuracy (Table 3), highlighting its high performance in correctly predicting COVID-19 and non-COVID-19 cases. For this reason, this model was chosen to classify cases without a confirmed etiological agent. The hyperparameters of this model are shown in Table 4 – the hyperparameters considered in the other models can be accessed in S1 Table (supplementary table).

thumbnail
Table 3. Performance of the tested models to classify cases where the etiological agent was confirmed as “COVID-19” or “Other confirmed agents”.

https://doi.org/10.1371/journal.pone.0330375.t003

To interpret the proposed XGBoost model’s results globally, we used the SHAP summary plot to visualize the importance and relationships of each variable with the predicted outcomes.

In this plot, the input variables are ordered on the y-axis in descending order according to the average absolute SHAP values. Each of these variables represents an observation, and the points are distributed horizontally along the x-axis according to their respective SHAP value: when the density of the SHAP value is high, the points spread vertically. The color bar represents the raw values of the variables for each point on the plot: if a particular variable’s value is high, the point is colored in various shades of red; if the value is low, the point is colored in multiple shades of blue. The color distribution horizontally allows us to naturally understand the overall relationship of each variable with the SHAP values.

In Fig 2, the low values of SARS notifications on the patient’s admission date have negative SHAP values (points elongate to the left with increasingly blue tones), and high values of SARS notifications on the patient’s admission have positive SHAP values (points extend from zero to the right with more purplish and reddish tones). This means that pregnant and postpartum women who reported the first symptoms on a day with a high number of SARS notifications belonged to the COVID-19 group. Conversely, being in the 2nd trimester of pregnancy and being marked as “Blank” in the SIVEP-Gripe form fields for diarrhea and obesity, as well as marking “Yes” in the fields for cough and respiratory discomfort and “No” in the fields for heart disease, O2 saturation less than 95%, and Influenza vaccination, determined the prediction of SARS by other etiological agents. Age did not influence the model’s classification, as evidenced by the point cloud with an apparent mixture of colors.

thumbnail
Fig 2. SHAP Summary Plot, ranked by the mean absolute SHAP values, for the XGBoost model.

https://doi.org/10.1371/journal.pone.0330375.g002

Analyzing the summary plot of the average contribution of each variable, given by the mean of the absolute SHAP value, the result is expressed in an intuitively interpretable unit by taking the mean of the absolute SHAP value. In the plot, the contributions are represented by the values in the boxes on the left, and the variables with the highest contributions impact the model’s decisions more. The quantity of daily SARS notifications was the most influential variable, changing the predicted probability of COVID-19 diagnosis by 2·21 percentage points on average. The opposite is seen when evaluating the variables Influenza vaccination (category: “No”) and respiratory discomfort (category: “Yes”), as they were the least informative variables.

After defining the most accurate prediction model, the cases without defined etiology during the study period were revisited 13,978 cases of SARS in pregnant and postpartum women between 2020 and 2021 with undefined etiology (“Unspecified,” “Ignored,” or “Blank”) were identified. These cases were reclassified into two groups using the model previously mentioned: predicted COVID-19 (13,799; 98·7%) and predicted non-COVID-19 (179; 1·3%) (Fig 3). The sociodemographic characteristics of these groups are shown in S1 Table. These predicted cases were added to those already known, according to the etiological agent, and the number of cases in each group in the study sample was recalculated, obtaining new data on the prevalence of COVID-19 in this population.

thumbnail
Fig 3. Flowchart of patients reclassified after the XGBoost model was applied.

https://doi.org/10.1371/journal.pone.0330375.g003

Fig 4a shows the density map of officially registered COVID-19 cases across the national territory. At the same time, Fig 4b shows the density map of the COVID-19 rate when considering the sum of officially registered cases and the cases predicted by the XGBoost model. Fig 4 was created by the authors using the packages {geobr} in R software, which is under MIT license for free use [42] (Fig 4).

thumbnail
Fig 4. Density map of COVID-19 cases rate 4a Rate of cases officially registered as COVID-19 in Brazil; 4b Rate of COVID-19 cases after including the predicted cases.

https://doi.org/10.1371/journal.pone.0330375.g004

Thus, the number of SARS cases due to COVID-19 in the obstetric population was 72·1% higher (an increase of 13,799 cases) than the official reports; similarly, the number of deaths of pregnant and postpartum women due to COVID-19 increased by 19·7% (an increase of 387 deaths) compared to the officially reported numbers. The increase in case rate ranged from 11·4% to 186·2% when evaluating each of the states in Brazil, and the increase in the maternal mortality ratio ranged from 0·0% to 46·2% (S2 and S3 Tables).

It is important to note that out of the ten states with the highest rates of underreporting after the application of the prediction model, six also ranked among those with the highest increase in the maternal mortality ratio due to COVID-19 (Pernambuco, Paraiba, Esprito Santo, Alagoas, Acre, and Minas Gerais).This study constructed a predictive model for SARS etiology in pregnant and postpartum women whose notifications did not identify an etiological agent. The proposed prediction model was able to identify a 72·1% increase in cases and a 19·7% increase in maternal deaths attributed to COVID-19 in the Brazilian obstetric population from 2020 to 2021.

Accurate estimates of COVID-19 cases and deaths are a significant challenge for pandemic surveillance, especially in low- and middle-income countries, where the diagnosis is underestimated, and the disease lethality is exacerbated by a high incidence among socioeconomically disadvantaged groups and limited access to healthcare services. Socially disadvantaged populations face more difficulties in effective isolation when infected, as they reside in densely populated households with poor sanitary conditions, favoring the spread of SARS-CoV-2 [12]. A systematic review published in 2022 showed that the global number of COVID-19 deaths may have been three times higher than the official figures [43], and the region with the highest estimated underreporting of deaths caused by SARS-CoV-2 was southern Africa. When analyzed separately, the highest estimates of underreporting of deaths were seen in India, the United States, Russia, Mexico, Brazil, Indonesia and Pakistan, accounting for more than half of the number of deaths caused by COVID-19 worldwide during the period analyzed.

The research conducted in the general population by a team from the Faculty of Medicine at Universidade Federal de Minas Gerais, showed, through the analysis of medical records and exams, an 18% underreporting rate of COVID-19 deaths in Brazil, primarily due to inaccuracies in defining the underlying cause [13]. Furthermore, the death often occurred before the availability of the diagnostic test results, and a poorly defined cause or intermediate disease (garbage causes) was declared as the underlying death cause [44]. According to this research, the coronavirus identification as the probable underlying cause of death in the three capital cities (Belo Horizonte, Natal, and Salvador) analyzed was relevant, accounting for 37,163 cases (18%) only in 2020, drawing attention to the fact that the pandemic was even more severe in the country [13].

Considering that the milder cases were not prioritized for testing at the beginning of the pandemic, this fact may have contributed to this concerning rate of sub-notification in the country. The present study depicted that 5 out of 7 states with the highest rates of underreporting of COVID-19 cases were also those with the highest underreporting rates of maternal deaths, drawing attention to other factors that must be involved.

The operational difficulties in conducting diagnostic tests, a longer time frame between testing and result availability, and the alarming lack of tests during the initial phase of the pandemic may have jeopardized the correct notification of cases [45]. Despite the different rates among states, it is of utmost importance that all presented with underreporting [12]. Such disparity among reported values shows a likely difference in the policies adopted in each region.

At the pandemic’s beginning, the obstetric population was neither considered a high-risk group for developing severe COVID-19 nor a priority for diagnostic testing [15,27]. It is known that pregnant women are more susceptible to severe respiratory infections because they have a suppressed immune state and also because of the adaptive physiological changes that occur during this period, making them more intolerant to the hypoxic regime [15]. However, such a scenario changed once such a group showed a higher risk of hospitalization (337%), ICU admission (73%), and invasive ventilatory support (64%) compared to non-pregnant women [15,17,19,28,29]. Similarly, the literature demonstrated that adverse obstetric events (preeclampsia, gestational diabetes, prematurity, and fetal death) were also more prevalent in pregnant women diagnosed with COVID-19, especially those who develop severe forms of the disease. In Brazil, the Brazilian Obstetric Observatory COVID-19 (OOBr COVID-19) reported that, in the first five months of 2021, the number of deaths of pregnant and postpartum women has already exceeded the number reported throughout 2020, with a 94% increase in Maternal Mortality Ratio (MMR) [46].

During the pandemic, the country’s difficulties in accessing prenatal care and maternity services worsened. The main risk factors for severe COVID-19 in the obstetric population were related to social vulnerability, such as residing in peri-urban areas, being of black race, having lower educational levels, and the presence of comorbidities, such as obesity and gestational diabetes [24,26].

The lack of health professionals trained to offer adequate care for pregnant and postpartum women and the reduced number of ICU beds designated for this population may have also compromised the quality of care [47]. The fragility of Brazilian antenatal care became even more apparent and alarming with the increased demand for care during the pandemic.

Access to adequate care during the pandemic was crucial for the outcome of cases, showing that the lack of assistance was a determining factor for adverse outcomes in pregnant and postpartum women, leading to an increase in maternal mortality ratios during this period, especially in low- and middle-income countries [48]. Some studies showed that, in one-third of deaths among the obstetric population, patients did not have access to mechanical ventilation, and only 72% were admitted to ICU beds [30].

These weaknesses in the healthcare system may also have influenced the number of underreporting cases and, mainly, deaths from COVID-19 in pregnant and postpartum women. In Brazil, the most accurate data to assess mortality due to specific causes are obtained through the Mortality Information System (SIM), a national surveillance system developed and implemented by the Ministry of Health [49]. The availability of access to a national database (SIVEP-GRIPE), consolidated since the H1N1 pandemic in 2009, with a high number of cases of SARS in pregnant and postpartum women, provides consistency to the results obtained and to the prediction model constructed to estimate the cases and deaths that probably occurred due to COVID-19 among those for whom the etiology of SARS was not defined.

Despite the robustness of this database, a large number of “Blank” and “Unknown” responses were still observed when filling out the data on the SIVEP-GRIPE form. Thus, when creating the prediction model, it was found that the frequency of “Blank” or “Unknown” responses were higher in the COVID-19 group, although present in cases with other confirmed agents. To overcome this weakness, it was necessary to compare the groups including this variable, since failure to fill out this data could be associated with high demand, lack of training of professionals or lack of diagnostic tests, since the excess demand from the pandemic may have affected the accuracy of data entry and auditing by maternal mortality committees. Work overload for healthcare professionals, insufficient training of physicians for correct completion of death certificates, and errors in assigning the declared cause of death may have also added fragility to this system. This study was designed as a predictive model applied to the real world, that is, predicting whether a patient had COVID-19 or not based on their medical records, which commonly come with a lot of information not filled in (“Blank”) or ignored (Unknown”).

Moreover, the SIM identified a significant increase in deaths attributed to ill-defined and other natural causes in Brazil during the period, without any other explanation for the observed excess mortality. Causes such as sepsis (an increase of 30%), respiratory failure (a rise of 50%), and SARS (an increase of 381%) increased compared to data from 2017 to 2019. Conversely, deaths due to pneumonia decreased by 16%, likely because SARS was declared the cause of death instead of unspecified pneumonia in most cases [13].

This was one of the reasons why this study analyzed deaths due to SARS directly from SIVEP-Gripe rather than SIM. The mandatory notification of these cases and the clear link between death and SARS reduced potential errors in coding or underreporting due to incorrect completion of death certificates. Since these databases are public and anonymized, cross-referencing data between these two systems was not possible.

Considering the significant increase in the number of cases and deaths from COVID-19, it is important to reflect on how underreporting of these cases during the pandemic may have had negative consequences for the organization of the health system and for controlling the transmission of the virus, in addition to difficulties in providing adequate treatment, leading to an increase in serious repercussions. Such observations show the importance of knowing the real numbers to assist in implementing health policy measures, crisis coping strategies and resource planning, highlighting the need to improve data recording and continuous monitoring and the development of predictive models that can more accurately estimate the real impact of possible pandemics that may occur in the future.

Some study limitations should be disclosed. Firstly, there is a large number of “Blank” or “Unknown” responses when analyzing the data completion in the SIVEP-Gripe form. The frequency of these answers was higher in the COVID-19 group than in cases caused by other etiological agents. This variable was included in the analysis to address this limitation, as the non-completion of these data could be associated with high demand for medical care, lack of professional training, or lack of diagnostic tests. It is known that the lower availability of diagnostic tests, especially in the early stages of the pandemic, created difficulties in correctly identifying cases and deaths due to COVID-19 [13].

The study’s strong points include access to a national database consolidated since the H1N1 pandemic in 2009 and the decision to include all available variables in the model over both periods, including blank or unknown data, thus creating a realistic model.

Underreporting cases and deaths during the pandemic may have negatively affected the healthcare system organization and the virus transmission control, leading to difficulties in adequate treatment and severe repercussions. These observations underscore the importance of knowing the actual numbers of cases to assist in implementing health policy measures, crisis management strategies, and resource planning, highlighting the need for improved data recording, continuous monitoring, and the development of predictive models that can estimate more accurately the actual impacts of any future pandemics [50].

Conclusion

This study evaluated the number of underreported cases and deaths due to SARS in pregnant and postpartum women in Brazil and its states. The results indicate that, during the pandemic, the number of COVID-19 cases and deaths in the obstetric population was much higher than documented by authorities, suggesting that the disease’s impact on the maternal mortality ratio during the period was even more significant than reported.

Supporting information

S1 Table. Other hyperparameters adopted in the XGBoost model.

https://doi.org/10.1371/journal.pone.0330375.s001

(DOCX)

S2Table. Sociodemographic Characteristics, clinical presentation, and presence of comorbidities among Pregnant and Postpartum Women with SARS according to the etiological agent, either confirmed or predicted.

https://doi.org/10.1371/journal.pone.0330375.s002

(DOCX)

S3 Table. COVID-19 case rate by Brazilian sates recalculated after adding confirmed COVID-19 cases to predicted COVID-19 cases using the XGBoost prediction model.

https://doi.org/10.1371/journal.pone.0330375.s003

(DOCX)

S4 Table. COVID-19 death rate by Brazilian sates recalculated after adding confirmed COVID-19 cases to predicted COVID-19 cases using the XGBoost prediction model.

https://doi.org/10.1371/journal.pone.0330375.s004

(DOCX)

References

  1. 1. Viellas EF, Domingues RMSM, Dias MAB. Assistência pré-natal no Brasil. Cad Saude Publica. 2014;30:S85-100.
  2. 2. Histórico da pandemia de COVID-19. OPAS/OMS | Organização Pan-Americana da Saúde. https://www.paho.org/pt/covid19/historico-da-pandemia-covid-19. Accessed 2023 April 11.
  3. 3. WHO Coronavirus (COVID-19) Dashboard | WHO Coronavirus (COVID-19) Dashboard With Vaccination Data. Accessed 2023 April 11. COVID-19 Dashboard. https://covid19.who.int/ 2023.
  4. 4. Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, et al. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia. N Engl J Med. 2020;382(13):1199–207. pmid:31995857
  5. 5. Ochani RK, Kumar Ochani R, Asad A, Yasmin F, Shaikh S, Khalid H. COVID-19 pandemic: from origins to outcomes. A comprehensive review of viral pathogenesis, clinical manifestations, diagnostic evaluation, and management. Le Infezioni in Medicina. 20(2021).
  6. 6. Nakamura-Pereira M, Betina Andreucci C, de Oliveira Menezes M, Knobel R, Takemoto MLS. Worldwide maternal deaths due to COVID-19: A brief review. Int J Gynaecol Obstet. 2020;151(1):148–50. pmid:32706925
  7. 7. Zambrano LD, Ellington S, Strid P, Galang RR, Oduyebo T, Tong VT. Update: Characteristics of Symptomatic Women of Reproductive Age with Laboratory-Confirmed SARS-CoV-2 Infection by Pregnancy Status — United States, January 22–October 3, 2020. https://www.cdc.gov/coronavirus/2019-ncov/downloads/cases-updates/case-. 2020.
  8. 8. Gonçalves BMM, Franco RPV, Rodrigues AS. Maternal mortality associated with COVID-19 in Brazil in 2020 and 2021: Comparison with non-pregnant women and men. PLoS One. 2021;16(12).
  9. 9. Franco VF, Rodrigues AS, Junior ERR, de Godói LG, Monroy NAJ, da Costa RA, et al. Demographic and epidemiological characteristics of pregnant and postpartum women who died from severe acute respiratory syndrome in Brazil: A retrospective cohort study comparing COVID-19 and nonspecific etiologic causes. PLoS One. 2022;17(10):e0274797. pmid:36191001
  10. 10. Teixeira M da G, Penna GO, Risi JB, Penna ML, Alvim MF, Moraes JC de, et al. Seleção das doenças de notificação compulsória: critérios e recomendações para as três esferas de governo. Inf Epidemiol Sus. 1998;7(1).
  11. 11. Mário S. Apresentação. Ministério da Saúde Secretaria de Vigilância em Saúde.
  12. 12. Do Prado MF, De Paula Antunes BB, Dos Santos Lourenço Bastos L, Peres IT, De Araújo Batista Da Silva A, Dantas LF. Analysis of COVID-19 under-reporting in Brazil. Rev Bras Ter Intensiva. 2020;32(2):224–8.
  13. 13. França EB, Ishitani LH, de Abreu DMX, Teixeira RA, Corrêa PRL, de Jesus EDS, et al. Measuring misclassification of Covid-19 as garbage codes: Results of investigating 1,365 deaths and implications for vital statistics in Brazil. PLOS Glob Public Health. 2022;2(5):e0000199. pmid:36962159
  14. 14. Saúde MDA. Assistência à gestante e puérpera frente à pandemia de covid-19. 2021. http://bvsms.saude.gov.br/bvs/publicacoes/manual_assistencia_gestante_puerpera_covid-19_2ed.pdf
  15. 15. Jamieson DJ, Rasmussen SA. An update on COVID-19 and pregnancy. American J Obstetrics Gynecol. 2022;226:177–86.
  16. 16. Faraz S, Aftab N, Ammar A, Al Mulai I, Paulose L, Fernandes S. An Insight on The maternal-fetal outcomes of critically ill pregnant women during the second wave of COVID-19. Cureus. 2022.
  17. 17. Kadiwar S, Smith JJ, Ledot S, Johnson M, Bianchi P, Singh N, et al. Were pregnant women more affected by COVID-19 in the second wave of the pandemic?. Lancet. 2021;397(10284):1539–40. pmid:33864751
  18. 18. Baptista FS, Paganoti CF, Gomez UT, Peres SV, Malbouisson LM, Brizot M de L. Risk factors for oxygen requirement in hospitalized pregnant and postpartum women with COVID-19. Clinics. 2022;77.
  19. 19. Gomez UT, Francisco RPV, Baptista FS, Gibelli MABC, Ibidi SM, Carvalho WB de. Impact of SARS-CoV-2 on pregnancy and neonatal outcomes: An open prospective study of pregnant women in Brazil. Clinics. 2022;77.
  20. 20. Situação epidemiológica. 2023. www.saude.gov.br/coronavirus
  21. 21. Takemoto M, Menezes MO, Andreucci CB, Knobel R, Sousa L, Katz L, et al. Clinical characteristics and risk factors for mortality in obstetric patients with severe COVID-19 in Brazil: a surveillance database analysis. BJOG. 2020;127(13):1618–26. pmid:32799381
  22. 22. OOBr SRAG. https://observatorioobstetrico.shinyapps.io/covid_gesta_puerp_br/ 2023. Accessed 2023 April 13.
  23. 23. Gonzalez I, Laura M, Lea M, Pinto De Magalhães PA, Braz De Oliveira Paes L. Lourdes M, et al. Mortalidade materna por COVID-19: uma revisão sistemática da literatura maternal mortality due to COVID-19: a review systematic literature mortalidad materna por COVID-19: revisión sistemática de la literatura. Revista de Saúde Pública. 2023.
  24. 24. Menezes MO, Takemoto MLS, Nakamura-Pereira M, Katz L, Amorim MMR, Salgado HO, et al. Risk factors for adverse outcomes among pregnant and postpartum women with acute respiratory distress syndrome due to COVID-19 in Brazil. Int J Gynaecol Obstet. 2020;151(3):415–23. pmid:33011966
  25. 25. Knobel R, Takemoto MLS, Nakamura-Pereira M, Menezes MO, Borges VK, Katz L, et al. COVID-19-related deaths among women of reproductive age in Brazil: The burden of postpartum. Int J Gynaecol Obstet. 2021;155(1):101–9. pmid:34213771
  26. 26. La Verde M, Riemma G, Torella M, Cianci S, Savoia F, Licciardi F, et al. Maternal death related to COVID-19: A systematic review and meta-analysis focused on maternal co-morbidities and clinical characteristics. Int J Gynaecol Obstet. 2021;154(2):212–9. pmid:33930185
  27. 27. Moore KM, Suthar MS. Comprehensive analysis of COVID-19 during pregnancy. Biochem Biophys Res Commun. Biochem Biophys Res Commun. 2021;538:180–6.
  28. 28. Metz TD, Clifton RG, Hughes BL, Sandoval GJ, Grobman WA, Saade GR, et al. Association of SARS-CoV-2 Infection With Serious Maternal Morbidity and Mortality From Obstetric Complications. JAMA. 2022;327(8):748–59. pmid:35129581
  29. 29. Kayem G, Lecarpentier E, Deruelle P, Bretelle F, Azria E, Blanc J. A snapshot of the Covid-19 pandemic among pregnant. 2023.
  30. 30. Karimi L, Makvandi S, Vahedian-Azimi A, Sathyapalan T, Sahebkar A. Effect of COVID-19 on mortality of pregnant and postpartum women: a systematic review and meta-analysis. J Pregnancy. 2021;2021.
  31. 31. Pan American Health Organization. Histórico da pandemia de COVID-19. https://www.paho.org Accessed 2023 October 1. [Internet]. 2021.
  32. 32. Vaid S, Cakan C, Bhandari M. Using Machine Learning to Estimate Unobserved COVID-19 Infections in North America. J Bone Joint Surg Am. 2020;102(13):e70. pmid:32618918
  33. 33. Paixão GM de M, Santos BC, Araujo RM de, Ribeiro MH, Moraes JL de, Ribeiro AL. Machine Learning na Medicina: Revisão e Aplicabilidade. Arq Bras Cardiol. 2022;118(1):95–102.
  34. 34. Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA. 2016;316(22):2402–10. pmid:27898976
  35. 35. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115–8. pmid:28117445
  36. 36. Chen JH, Asch SM. Machine learning and prediction in medicine — beyond the peak of inflated expectations. N Engl J Med. 2017;376(26):2507–9.
  37. 37. OpenDataSUS. Accessed 2024 April 4.
  38. 38. Izbicki T, dos S. Aprendizado de máquina: uma abordagem estatística. 2020.
  39. 39. Barella VH. Técnicas para o problema de dados desbalanceados em classificação hierárquica. Universidade de Sao Paulo, Agencia USP de Gestao da Informacao Academica (AGUIA). http://dx.doi.org/10.11606/d.55.2016.tde-06012016-145045
  40. 40. Amorim WN de. Ciência de dados, poluição do ar e saúde. Universidade de Sao Paulo, Agencia USP de Gestao da Informacao Academica (AGUIA). http://dx.doi.org/10.11606/t.45.2019.tde-30052019-145057
  41. 41. Molnar C, Casalicchio G, Bischl B. Interpretable Machine Learning – A Brief History, State-of-the-Art and Challenges. 2020. p. 417–31.
  42. 42. Download official spatial data sets of Brazil – geobr. https://ipeagit.github.io/geobr/ Accessed 2023 April 13.
  43. 43. COVID-19 Excess Mortality Collaborators. Estimating excess mortality due to the COVID-19 pandemic: a systematic analysis of COVID-19-related mortality, 2020-21. Lancet. 2022;399(10334):1513–36. pmid:35279232
  44. 44. Projeto Garbage. https://capital.sp.gov.br/web/saude/w/epidemiologia_e_informacao/mortalidade/265919#:~:text=Por%20defini%C3%A7%C3%A3o%20%2CC%C3%B3digo%20Garbage Accessed 2023 May 3.
  45. 45. Orellana JDY, Cunha GM da, Marrero L, Moreira RI, Leite I da C, Horta BL. Excess deaths during the COVID-19 pandemic: underreporting and regional inequalities in Brazil. Cad Saude Publica. 2021;37(1):e00259120. pmid:33566992
  46. 46. Nações Unidas. UNFPA: mortalidade materna no Brasil aumentou 94,4% durante a pandemia. https://brasil.un.org/pt-br/203964-unfpa-mortalidade-materna-no-brasil-aumentou-944-durante-pandemia
  47. 47. Guimarães RM, Reis LGC, de Souza Mendes Gomes MA, Magluta C, de Freitas CM, Portela MC. Tracking excess of maternal deaths associated with COVID-19 in Brazil: a nationwide analysis. BMC Pregnancy Childbirth. 2023;23(1):22. pmid:36635674
  48. 48. Hantoushzadeh S, Shamshirsaz AA, Aleyasin A, Seferovic MD, Aski SK, Arian SE, et al. Maternal death due to COVID-19. Am J Obstet Gynecol. 2020;223(1):109.e1-109.e16. pmid:32360108
  49. 49. Ministério da Saúde. Manual de Procedimentos do Sistema de Informações sobre Mortalidade. https://bvsms.saude.gov.br/bvs/publicacoes/sis_mortalidade.pdf
  50. 50. De Oliveira TM, Araújo ACO. Consequências da subnotificação dos casos de COVID-19 para a saúde pública no Brasil. IAJMH. 2020;3.