Risk factors contributing to infection with SARS-CoV-2 are modulated by sex

Throughout the early stages of the COVID-19 pandemic in Mexico (August to December 2020), we closely followed a cohort of n = 100 healthcare workers. These workers were initially seronegative for Immunoglobulin G (IgG) antibodies against SARS-CoV-2, the virus that causes COVID-19, and maintained close contact with patients afflicted by the disease. We explored the database of demographic, physiological and laboratory parameters of the cohort recorded at baseline to identify potential risk factors for infection with SARS-CoV-2 at a follow-up evaluation six months later. Given that susceptibility to infection may be a systemic rather than a local property, we hypothesized that a multivariate statistical analysis, such as MANOVA, may be an appropriate approach. Our results indicate that susceptibility to infection with SARS-CoV-2 is modulated by sex. For men, different physiological states appear to exist that predispose to or protect against infection, whereas for women, we did not find evidence for divergent physiological states. Intriguingly, male participants who remained uninfected throughout the six-month observation period had values for mean arterial pressure and waist-to-hip ratio that exceeded the normative reference range. We hypothesize that certain risk factors that worsen the outcome of COVID-19 disease, such as being overweight or having high blood pressure, may instead offer some protection against infection with SARS-CoV-2.


Introduction
In a recent publication, we reported on results of a cohort of healthcare workers (HCW) from the Hospital General de México Dr. Eduardo Liceaga (HGMEL), who were exposed during the early phase of the COVID-19 pandemic in Mexico, from August to December 2020 [1]. The cohort underwent a baseline evaluation during which a variety of categorical and continuous variables were registered, including medical history, demographic, anthropometric, clinical and laboratory information, and working conditions. Possible infection with SARS-CoV-2 was detected at baseline using an anti-SARS-CoV-2 Immunoglobulin G (IgG) serum antibody titer test and a quantitative polymerase chain reaction (qPCR) test. Possible infection was also examined during the evaluation period in symptomatic participants using a qPCR test, and during a follow-up study six months later using an IgG test. The cohort included only participants who tested negative at baseline and was analyzed by groups defined post hoc according to the outcome of the IgG test at 6 months: a group that remained seronegative (IgG-) and a group that became seropositive (IgG+).
This previous publication had multiple study objectives. The first objective was to evaluate the incidence of SARS-CoV-2 infection in HCW of HGMEL, and to compare results from the qPCR and IgG tests. The incidence rate of SARS-CoV-2 infection was considerably higher in our cohort (58%) than in other healthcare populations as published by other authors (5-29%) [2-9]. Most reports used qPCR tests to detect positive cases. Since qPCR amplifies the genetic material of the virus, this methodology can only identify active infections, which may lead to undercounting asymptomatic cases and underestimating incidence rates among healthcare professionals. Therefore, one of our conclusions was that the incidence of SARS-CoV-2 infection in HCW should be assessed by IgG antibody measurements, irrespective of whether COVID-19 symptoms were present. The second objective was to analyze the incidence rate of infection in HCW depending on the workplace in the hospital. We found that HCW who were correctly wearing personal protection equipment (PPE) exhibited infection rates that were similar for high-exposure areas (e.g., emergency room or intensive care units) and moderate- and low-exposure areas (e.g., hospital administration or medical social work). In correspondence with previous studies [10,11], we concluded that the risk of SARS-CoV-2 infection in HCW is not necessarily associated with the workplace but rather with other factors, such as the incorrect use of PPE, community exposure, or contact with family members who are asymptomatic SARS-CoV-2 carriers. A third objective was to evaluate the prevalence of symptoms related to SARS-CoV-2 infection. We found that only 33% of IgG seropositive participants exhibited symptoms, whereas in other studies asymptomatic carriers among HCW varied from 4.8 to 40% [12,13]. These results indicate that the relatively high percentage of asymptomatic SARS-CoV-2 carriers among HCW may have contributed to the large transmissibility of the disease both in the hospital and in the community.
Finally, the last objective of our previous publication, which is also the subject of the present contribution, was to evaluate demographic and anthropometric characteristics and laboratory parameters measured at baseline in the cohort as risk factors for infection with SARS-CoV-2 over a time period of six months. In line with previous studies [10,11,14], no specific risk factors could be identified among any of the categorical or continuous variables of the dataset. The only exception was the presence of hypertension at baseline in participants who would remain IgG seronegative, and the absence of hypertension in participants who would become seropositive. No additional information was recorded for the hypertensive participants (e.g., the degree of hypertension), but all hypertensive participants were medicated and had blood pressure measurements in the normal to normal-high range, indicating that their hypertensive condition was under control. We concluded that participants who recognized themselves as a vulnerable population with comorbidities, such as hypertension, may have followed safety procedures and social distancing more strictly than participants without known comorbidities, and by adapting their behavior were therefore less likely to become infected [1].
The purpose of the present contribution is to reanalyze the dataset of demographic and anthropometric characteristics and laboratory parameters of the cohort using more advanced statistical methods. Our previous contribution [1] used common univariate statistical analyses, such as the Student t-test for continuous variables and the χ²-test for categorical variables, and the conclusions on risk factors for infection with SARS-CoV-2 are counterintuitive because hypertension was defined early on during the pandemic as an important risk factor for adverse health outcomes in the development of the COVID-19 disease. The results of our previous publication were based on a preliminary dataset of n = 110 participants that was constructed with the parameters and the evaluation results that were available at that time. Continuous variables included breathing rate (BR), temperature, weight, body mass index (BMI), waist circumference, hip circumference, blood glucose, urea, creatinine, uric acid and cholesterol. Categorical variables included obesity, type-2 diabetes and hypertension. The present contribution uses an updated dataset that includes additional parameters such as heart rate (HR), mean arterial pressure (MAP) and waist-to-hip ratio (WHR), but only considers n = 100 participants with complete data. It has been argued before that susceptibility to infection might stem from systemic and physiological factors, rather than solely from local or cellular properties [15]. We hypothesized that a multivariate statistical approach, which considers interactions between different categorical variables or factors and evaluates the combined effect of multiple continuous variables, better reflects the fact that the human body is a system where physiological variables do not exist in isolation, and therefore might detect differences in susceptibility to infection where univariate analyses do not. For this reason, the aim of the present study is to analyze the dataset of the cohort at baseline using multivariate statistical approaches, such as MANOVA, to identify risk factors for infection with SARS-CoV-2.

Trial design and ethical aspects
This prospective cohort study involved the enrollment of various professionals, including doctors, nurses, researchers, clinical laboratory technicians, psychologists, rehabilitators, and administrative staff at HGMEL, one of the largest hospitals of the Ministry of Health in Mexico City, designated for COVID-19 patient care during the pandemic. Participant enrollment occurred between August 18 and October 2, 2020, while the follow-up period of six months extended from October 2, 2020, to March 31, 2021. All participants provided written informed consent. This study adhered to the guidelines set forth in the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement for reporting observational studies. The study received approval from the Ethics and Clinical Research committees of HGMEL (Approval No: DI/20/501/04/32) and was conducted in strict accordance with the principles outlined in the 1964 Declaration of Helsinki and its subsequent amendment in 2013.

Sample size
In our previous publication [1], we described the method for calculating the sample size, which was based on the number of HCW (medical doctors, nurses, researchers, psychologists, and rehabilitators) who continued working throughout the pandemic. Of the initial 7000 healthcare personnel working at HGMEL, 2000 of the healthcare workers with the highest risk factors decided to withdraw during the pandemic, leaving a total of N = 5000 personnel who continued working, representing the total size of the active population. The sample size required to estimate a proportion was calculated using the following expression [16]:

$$ n = \frac{N z^{2} p q}{e^{2} (N - 1) + z^{2} p q} $$

where n is the required sample size, z is the critical value of the standard normal distribution associated with the desired confidence level, p is the expected proportion of infected individuals in the population, q is the complement of p (q = 1 - p), e is the allowable margin of error in the proportion estimation, and N is the total size of the population. Based on previous studies [2,3], in which 13% of the workers were infected with the SARS-CoV-2 virus, and using a margin of error of 10% and a 99% confidence level, the estimated value for the required sample size is n = 74.
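As an illustration, the reported sample size can be reproduced with a short calculation. The sketch below assumes the standard finite-population formula for estimating a proportion, n = N z² p q / (e² (N - 1) + z² p q), and a critical value z = 2.576 for the 99% confidence level. The function name is hypothetical, and the exact expression of reference [16] may be written differently.

```python
import math


def sample_size_for_proportion(N, p, e, z):
    """Sample size to estimate a proportion p in a finite population of size N.

    Assumed formula (finite-population correction):
        n = N z^2 p q / (e^2 (N - 1) + z^2 p q),  with q = 1 - p
    """
    q = 1.0 - p
    numerator = N * z**2 * p * q
    denominator = e**2 * (N - 1) + z**2 * p * q
    return numerator / denominator


# Values quoted in the text. z = 2.576 is the usual two-sided critical
# value for a 99% confidence level (an assumption, not stated in the text).
n_raw = sample_size_for_proportion(N=5000, p=0.13, e=0.10, z=2.576)
n_required = math.ceil(n_raw)  # round up to a whole participant
print(f"unrounded n = {n_raw:.2f}, required n = {n_required}")
```

Under these assumptions the rounded-up result agrees with the n = 74 stated above.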

IgG seroprevalence analysis
All participants underwent blood sampling at the beginning of the study (August 2020) and again six months later (January 2021) to determine the seroprevalence of antibodies against SARS-CoV-2. Specific IgG antibody levels against the SARS-CoV-2 nucleocapsid (N) protein were measured in triplicate by enzyme-linked immunosorbent assay (ELISA) using a standardized kit from Abcam (Abcam, ab274339, Cambridge, UK) and a microplate reader at 450 nm.
Most participants in this study (85 of 100, 85%) received their SARS-CoV-2 vaccinations beginning in December 2020, notably after the commencement of the study, and continuing until the follow-up evaluation in January 2021. In all cases, the mRNA-based Pfizer-BioNTech COVID-19 vaccine (brand name Comirnaty) was used. ELISA differentiates between immune responses to infection with SARS-CoV-2 and immune responses caused by all vaccines against COVID-19 that were approved at that time [17]. There is a consensus that vaccination against COVID-19 offers protection against severe illness, whereas the protection against the infection itself is minimal and short-lived [18,19].

Statistical analysis
MANOVA is a statistical technique used to simultaneously analyze two or more continuous dependent variables between two or more groups. It considers the interactions among independent categorical variables, provides insights into multivariate relationships that may not be evident when examining each variable separately, and enhances the understanding of the underlying relationships by considering the joint variation among variables [20,21]. However, MANOVA relies on certain assumptions, such as normality of the data and absence of multicollinearity between variables, and violations of these assumptions can affect the validity of the results [22]. In this study, we employ MANOVA to examine relationships between variables and group differences, aiming to capture the complex interplay among variables and obtain a comprehensive understanding of their associations.
The dataset comprises multiple categorical and continuous variables. To construct a multivariate model, we considered that many of the categorical variables could be implicitly incorporated using continuous variables; for example, diabetes can be accounted for by blood glucose, hypertension by mean arterial pressure, and being overweight or having obesity by body mass index (BMI). Variables potentially exhibiting multicollinearity, such as weight and height, or systolic and diastolic blood pressure, were excluded, and combined variables such as BMI and mean arterial pressure (MAP) were used instead. However, the categorical variable of sex cannot be represented by any continuous variable and was explicitly included in the model. Therefore, we built a MANOVA model with two factors and their interaction: Susceptibility, i.e., whether participants remain seronegative or become infected with SARS-CoV-2 at 6 months (IgG- vs. IgG+); Sex (male M vs. female F); and the interaction Susceptibility × Sex. This results in four groups: male and female participants at baseline who remain seronegative (M- and F-), or who become infected (M+ and F+). Statistical significance was set at the p<0.05 level.
To conduct a comprehensive analysis of the differences between groups, several statistical procedures were performed. Firstly, a logarithmic transformation was applied to the non-normally distributed variables, including urea, creatinine, uric acid, blood glucose, BMI, WHR, HR, BR, body temperature, and MAP. These variables were transformed to meet the assumptions of parametric analyses. Cholesterol was the only variable that exhibited a normal distribution for all groups, and therefore did not require a logarithmic transformation.
Subsequently, a parametric multivariate analysis of variance (2-way MANOVA) was conducted to assess the main effects of the factors Susceptibility and Sex, and the interaction Susceptibility × Sex, on the combined continuous variables. Additionally, a parametric univariate analysis of variance (2-way ANOVA) was performed to study the main effects and the interaction for each of the continuous variables individually.
Furthermore, to complement the parametric analyses, the non-parametric, univariate and pairwise Mann-Whitney U test was applied. This test was used to evaluate the differences in individual variables between groups without making assumptions about the underlying distribution of the data.
All analyses were performed using Statistical Package for the Social Sciences (SPSS) software version 22.0.
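The combined variables mentioned above can be sketched as follows. The text does not state which formula was used for MAP; the snippet assumes the common clinical approximation MAP = DBP + (SBP - DBP)/3, together with the usual definitions of BMI and WHR. Function names and the example participant are hypothetical.

```python
def body_mass_index(weight_kg, height_m):
    """BMI in kg/m^2: combines weight and height into one variable."""
    return weight_kg / height_m**2


def mean_arterial_pressure(systolic_mmhg, diastolic_mmhg):
    """MAP in mmHg, assuming the approximation DBP + (SBP - DBP) / 3."""
    return diastolic_mmhg + (systolic_mmhg - diastolic_mmhg) / 3.0


def waist_to_hip_ratio(waist_cm, hip_cm):
    """WHR (dimensionless): waist circumference over hip circumference."""
    return waist_cm / hip_cm


# Hypothetical participant, for illustration only (not data from the cohort).
bmi = body_mass_index(weight_kg=80.0, height_m=1.75)
map_value = mean_arterial_pressure(systolic_mmhg=120.0, diastolic_mmhg=80.0)
whr = waist_to_hip_ratio(waist_cm=90.0, hip_cm=100.0)
print(f"BMI = {bmi:.1f} kg/m^2, MAP = {map_value:.1f} mmHg, WHR = {whr:.2f}")
```

Replacing two strongly correlated measurements by one derived quantity in this way reduces the number of dependent variables in the model while keeping the clinical information they carry.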

Results
Originally, n = 115 healthcare workers were included; however, for this study, data from only n = 100 participants were considered due to inclusion criteria and incomplete data. Fig 1 shows a flowchart of the selection process. The composition of the cohort was delimited by the factors Sex and Susceptibility for infection over the course of 6 months, see Table 1.
All variables, except cholesterol, exhibited non-normal distributions in at least one of the groups. Therefore, values for these variables were reported in Table 2 in the format Q2 (Q1, Q3), where Q2 represents the second quartile (i.e., the median), and Q1 and Q3 represent the lower and upper quartiles, respectively. Parametric analyses such as MANOVA and ANOVA require data to be approximately normally distributed, which can be achieved by using a logarithmic transformation. All non-normal data were transformed using a base-10 logarithmic function (log10) and are reported in the format mean (SD), where SD is the standard deviation. Cholesterol, on the other hand, is reported in both formats without applying a log transform.
A 2-way MANOVA analysis was conducted to examine the main effects of the factors Susceptibility and Sex, and their interaction. MANOVA calculates a score that quantifies the combined effect on all continuous variables, see Fig 2. There was no statistically significant main effect of Susceptibility on the combined dependent variables, F(11, 86) = 1.749, p = 0.076, Wilks Λ = 0.817; there was a statistically significant main effect of Sex on the combined dependent variables, F(11, 86) = 4.603, p<0.001, Wilks Λ = 0.629; there was also a statistically significant interaction effect between Susceptibility and Sex on the combined dependent variables, F(11, 86) = 1.967, p = 0.042, Wilks Λ = 0.799.
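The two reporting formats can be illustrated with a minimal sketch. It uses the Python standard library; the quartile method (`"inclusive"` interpolation) is an assumption and may differ slightly from the convention used by SPSS. The function names and the example values are hypothetical and are not data from the cohort.

```python
import math
import statistics


def median_and_quartiles(values):
    """Return (Q2, Q1, Q3) for the format Q2 (Q1, Q3)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


def log10_mean_sd(values):
    """Return mean and sample SD of the base-10 log-transformed values."""
    logs = [math.log10(v) for v in values]
    return statistics.mean(logs), statistics.stdev(logs)


# Hypothetical right-skewed measurements (arbitrary units).
example = [18.0, 21.0, 24.0, 26.0, 30.0, 35.0, 52.0]

q2, q1, q3 = median_and_quartiles(example)
mean_log, sd_log = log10_mean_sd(example)
print(f"{q2:.1f} ({q1:.1f}, {q3:.1f})")   # format Q2 (Q1, Q3), original units
print(f"{mean_log:.3f} ({sd_log:.3f})")   # format mean (SD), log10 units
```

The median and quartiles stay in the original units and are robust to skew, whereas the mean and SD are computed on the log scale on which the parametric tests operate.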
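As a consistency check on the reported statistics, Wilks Λ can be converted to an F value. For an effect with one hypothesis degree of freedom, as is the case for each effect in a 2 × 2 design, the conversion F = ((1 - Λ)/Λ) · (df2/df1) is exact, with df1 equal to the number of dependent variables and df2 = (error df) - df1 + 1. The sketch assumes 11 dependent variables and 96 error degrees of freedom (100 participants minus 4 groups), which reproduces the reported F(11, 86); the function name is hypothetical.

```python
def wilks_to_f(wilks_lambda, n_dependent, df_error):
    """Convert Wilks' lambda to F for an effect with 1 hypothesis df.

    Returns (F, df1, df2) with df1 = n_dependent and
    df2 = df_error - n_dependent + 1.
    """
    df1 = n_dependent
    df2 = df_error - n_dependent + 1
    f_value = (1.0 - wilks_lambda) / wilks_lambda * df2 / df1
    return f_value, df1, df2


# Wilks' lambda values as reported in the text (rounded to 3 decimals).
reported = {
    "Susceptibility": 0.817,
    "Sex": 0.629,
    "Susceptibility x Sex": 0.799,
}

for effect, lam in reported.items():
    f_value, df1, df2 = wilks_to_f(lam, n_dependent=11, df_error=96)
    print(f"{effect}: F({df1}, {df2}) = {f_value:.3f}")
```

Small deviations from the reported F values are expected because Λ is only given to three decimals.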

Susceptibility
To further explore possible differences of individual continuous variables between pairs of groups, a Mann-Whitney U test was performed. We compared all possible pairs of the groups M-, M+, F- and F+ for each of the continuous variables, see Table 2 and Fig 3. Values for WHR (z = -3.707, p<0.001) and MAP (z = -2.627, p = 0.009) are significantly larger for M- than for F+; values for creatinine (z = -2.532, p = 0.011) are significantly larger for M+ than for F-, and values for BR (z = -2.289, p = 0.022) are significantly smaller for M+ than for F-; values for urea (z = -2.805, p = 0.005) are significantly larger for M+ than for F+. There are no statistically significant differences between F- and F+ for any of the individual dependent variables.
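The pairwise comparison scheme can be sketched as follows. This is a simplified illustration, not the SPSS procedure: it uses the normal approximation for the Mann-Whitney U statistic without tie or continuity correction, so z and p values may differ slightly from those of a statistical package. The function names and the group values are hypothetical and are not data from the cohort.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_z(x, y):
    """Return (U, z, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)          # smaller U, so z is never positive
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, z, p


# Hypothetical waist-to-hip ratios for the four groups.
groups = {
    "M-": [1.02, 0.99, 1.05, 1.01, 0.98],
    "M+": [0.91, 0.94, 0.89, 0.93, 0.95],
    "F-": [0.84, 0.88, 0.86, 0.83, 0.87],
    "F+": [0.85, 0.82, 0.86, 0.88, 0.84],
}

# All six unordered pairs of the four groups.
results = {
    (a, b): mann_whitney_z(groups[a], groups[b])
    for a, b in combinations(groups, 2)
}
for (a, b), (u, z, p) in results.items():
    print(f"{a} vs. {b}: U = {u:.1f}, z = {z:.3f}, p = {p:.3f}")
```

With four groups there are six pairwise comparisons per variable, so a correction for multiple comparisons may be worth considering when interpreting individual p values.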

Discussion
The original univariate analysis of the database of baseline parameters of the cohort did not show statistically significant differences between the IgG- and IgG+ groups for any of the continuous variables, nor for the categorical variables except for hypertension [1]. In the present study, a multivariate analysis using 2-way MANOVA shows a borderline significance (p = 0.075) for the factor Susceptibility (IgG- vs. IgG+), suggesting that susceptibility is a systemic property that is better represented when considering the combined effect of all continuous variables together. The factor Sex (M vs. F) is highly significant (p<0.001), indicating that it is a dominant confounding factor that prevents the factor Susceptibility from reaching statistical significance independently. These findings confirm the widely acknowledged yet frequently overlooked reality of physiological differences between men and women concerning health-related parameters. This underscores the importance of separately analyzing groups of male and female participants [36,37]. The inclusion of the interaction between both factors (M- vs. M+ vs. F- vs. F+) results in statistically significant group differences (p = 0.042) and indicates that there are systemic physiological states that are more susceptible to infection with SARS-CoV-2 than others, and that these states are modulated by sex.
A univariate 2-way ANOVA analysis shows that for the factor Susceptibility, only the variable urea is significant. For the factor Sex, five of the eleven variables show significant differences: urea, creatinine, uric acid, waist-to-hip ratio, and mean arterial pressure. This again highlights that in the present dataset, it is inappropriate to analyze groups of male and female participants together, especially for physiological parameters where normative ranges of reference values are known to differ between the sexes. For the interaction between Susceptibility and Sex, three variables are significant: creatinine, mean arterial pressure, and waist-to-hip ratio.
To investigate which individual groups significantly deviate from other groups, we conducted a separate non-parametric pairwise univariate statistical test using the Mann-Whitney U test. There are no statistically significant differences between the groups of female participants (F- vs. F+) for any of the variables, and all values tend to be within the normative reference range. In contrast, there were statistically significant differences between the groups of male participants (M- vs. M+) for uric acid (p = 0.032), waist-to-hip ratio (p<0.001), and mean arterial pressure (p = 0.012). Uric acid is within the reference range for both groups, but the waist-to-hip ratio and mean arterial pressure are above the reference range for M- and within the range for M+.
One of the main findings of this study is the possible existence of different physiological states that are associated with varying degrees of susceptibility to infection with SARS-CoV-2. We recently published a series of contributions on physiological networks, where the central idea is that the human body is a complex system and that demographic, anthropometric, clinical, and laboratory variables do not exist in isolation but correlate into systemic physiological states that define the overall health condition [36,38-42]. These physiological networks are influenced by sex [39,42] and are disrupted by COVID-19 [39,42].
The other main finding is that susceptibility to infection with SARS-CoV-2 is modulated by sex. While there were no differences in the physiological state and individual variables between the two groups of female participants (F- and F+), significant differences were observed between the groups of male participants, which were either protected against (M-) or predisposed to (M+) infection. The individual variables that showed significant differences, namely uric acid, creatinine, waist-to-hip ratio, and mean arterial pressure, were consistently higher in M- compared to M+. All individual continuous variables were within the normal range for M+, but waist-to-hip ratio and mean arterial pressure were above the normal range for M-. These results are counterintuitive because being overweight and having high blood pressure are known risk factors for the aggravated progression of COVID-19. Therefore, it appears there may be a specific set of risk factors for infection with SARS-CoV-2, and a different set of risk factors for an exacerbated progression of the COVID-19 disease. The present results show that the male sex exhibits specific risk factors for infection with SARS-CoV-2. Other studies have indicated that female COVID-19 patients have an increased risk of developing long COVID [43-45], whereas male patients have higher odds of requiring intensive treatment unit (ITU) admission or of having a fatal outcome [46]. Previously, we have explained the distinct evolution of COVID-19 for both sexes in terms of differences in the structure of their corresponding physiological networks [40,42].
Our previous publication discussed susceptibility to infection with SARS-CoV-2 from the perspective of behavior. In particular, it was hypothesized that study participants who identified themselves as being vulnerable adapted their behavior to minimize exposure to the virus [1]. The present contribution raises the possibility that the behavior of the study participants may have been modulated by sex. An increasing body of literature emphasizes that men and women respond differently to the virus, particularly in their self-care behaviors, and the greater compliance of women with preventive health behaviors has been attributed to personality traits of agreeableness and conscientiousness more typical of the female sex [47]. Such behavioral adaptations make up an important part of the non-pharmaceutical interventions to lower the viral inoculum and to reduce susceptibility to infection by SARS-CoV-2 [48]. Considering the higher adherence of women to preventive health behaviors, female participants within our cohort might have adjusted their behavior regardless of their perceived vulnerability, whereas for men, factors such as vulnerability and/or the presence of comorbidities might have played a more decisive role. An alternative interpretation could be differences in genetic susceptibility between the study groups, where variations in specific alleles can increase susceptibility or resistance, on the one hand to infection with SARS-CoV-2 [49,50] and on the other hand to adverse health outcomes in COVID-19 disease [51-54]. In this context also, the results of the present contribution suggest that genetic susceptibility may be modulated by sex. However, the focus of the present contribution is physiological. Sexual dimorphism affects susceptibility, severity, and progression of infections and disease [55]. Studies suggest that the inflammatory response may be different between men and women. It has been observed that in metabolic diseases, inflammatory mediators can be chronically or mildly activated [56], which could maintain a state of alertness of the innate immune system as a first line of defense against invasive pathogens. Whereas behavioral adaptations may have played a role in minimizing the exposure of the seronegative male participants of our cohort, it is also possible that the above-normal values for mean arterial pressure and waist-to-hip ratio that we observed in this group, and an associated state of low-grade inflammation, resulted in a decreased susceptibility to infection.
Strengths and limitations. A possible limitation is the relatively small size of the study population. The calculation of the sample size indicated that the complete study population was sufficient for statistical analysis, but the individual subgroups may be too small to adequately reflect the more than seven thousand HCW employed by the HGMEL. A potential strength lies in the considerable homogeneity of our study population, with the vast majority vaccinated against COVID-19 using the same vaccine. Additionally, possible virus infections were identified through two distinct methodologies: qPCR and IgG detection. We found that sex was an important confounding factor in our previous publication [1]; this was resolved in the present contribution and led to new results. We do not identify any other confounding factors, as we are confident that possible infection and symptoms were correctly detected, without underreporting, and that all demographic, clinical and laboratory parameters were correctly registered.

Conclusion
We examined the baseline data of healthcare workers to identify SARS-CoV-2 infection risk factors. We used multivariate statistics, such as MANOVA, to consider physiological factors. We found that sex modulates susceptibility to infection. Men have different physiological states affecting their risk. In women, we did not find such distinctions. Men with normal arterial pressure and waist-to-hip ratio were found to be more at risk. Being overweight and having hypertension worsen the development of COVID-19 disease, but may protect against initial SARS-CoV-2 infection.

Fig 2. Box-whisker charts of MANOVA scores, for (a) the factor Susceptibility (p>0.05), (b) the factor Sex (p<0.001) and (c) the interaction Susceptibility × Sex (p = 0.033). Baseline values are shown for male and female participants who remained seronegative (M- and F-), or who were found to have been infected (M+ and F+), during a follow-up study six months later. Outliers are shown explicitly (black dots). https://doi.org/10.1371/journal.pone.0297901.g002

Fig 3. Box-whisker charts of values of the continuous variables that show significant pairwise differences between groups using the Mann-Whitney U test: (a) waist-to-hip ratio (WHR), (b) mean arterial pressure (MAP), (c) creatinine, (d) uric acid, (e) urea, and (f) breathing rate (BR). Baseline values are shown for male and female participants who remained seronegative (M- and F-), or who were found to have been infected (M+ and F+), during a follow-up study six months later. Normative ranges of reference values are indicated for male (blue-shaded background) and female participants (pink-shaded background). Outliers are shown explicitly (black dots). https://doi.org/10.1371/journal.pone.0297901.g003

Table 2. Baseline values for anthropometric, biochemical, and physiological variables of the groups of male and female participants that remain seronegative (M- and F-, respectively) and become seropositive over the course of 6 months (M+ and F+, respectively).
All variables, except cholesterol, are non-normal for at least one of the groups, and therefore are reported in standard units as Q2 (Q1, Q3) (upper half of the table), where Q2 is the second quartile, i.e., the median, and Q1 and Q3 are the first and third quartiles, respectively. Non-normal data are log-transformed and represented as mean (SD) (bottom half of the table), where SD is the standard deviation. For completeness, cholesterol is given in both formats without log transformation. Statistically significant factors and/or interaction using 2-way ANOVA and statistically significant pairwise group comparisons using the Mann-Whitney U test are also indicated for individual variables. https://doi.org/10.1371/journal.pone.0297901.t001