Prediction of Maternal Cytomegalovirus Serostatus in Early Pregnancy: A Retrospective Analysis in Western Europe

Background: Cytomegalovirus (CMV) is the most prevalent congenital viral infection and thus places an enormous disease burden on newborn infants. Seroprevalence of maternal antibodies to CMV due to CMV exposure prior to pregnancy is currently the most important protective factor against congenital CMV disease. The aim of this study was to identify potential predictors, and to develop and evaluate a risk-predicting model for the maternal CMV serostatus in early pregnancy.
Methods: Maternal and paternal background information, as well as maternal CMV serostatus in early pregnancy, from 882 pregnant women were analyzed. Women were divided into two groups based on their CMV serostatus, and the groups were compared using univariate analysis. To predict serostatus based on epidemiological baseline characteristics, a multiple logistic regression model was calculated using stepwise model selection. Sensitivity and specificity were analyzed using ROC curves. A nomogram based on the model was developed.
Results: 646 women were CMV seropositive (73.2%), and 236 were seronegative (26.8%). The groups differed significantly with respect to maternal age (p = 0.006), gravidity (p<0.001), parity (p<0.001), use of assisted reproduction techniques (p = 0.018), maternal and paternal migration background (p<0.001), and maternal and paternal education level (p<0.001). ROC evaluation of the selected prediction model revealed an area under the curve of 0.83 (95% CI: 0.80-0.86), yielding sensitivity and specificity values of 0.69 and 0.86, respectively.
Conclusion: We identified predictors of maternal CMV serostatus in early pregnancy and developed a risk-predicting model based on baseline epidemiological characteristics. Our findings provide easily accessible information that can influence the counseling of pregnant women in terms of their CMV-associated risk.


Introduction
Cytomegalovirus (CMV) is a member of the Betaherpesvirinae subfamily of herpesviruses. CMV is ubiquitous among humans and can cause a wide variety of clinical manifestations; it can establish life-long latency or persistence following primary infection. [1] Primary CMV infection in pregnant women can cause mild febrile illness, as well as other nonspecific symptoms; however, CMV infection is clinically asymptomatic in 90% of cases. Non-primary infection is defined as infection with a different strain of CMV or reactivation of latent virus in the presence of pre-existing antibodies; non-primary infection generally does not cause maternal symptoms. [2] Among women of reproductive age, the prevalence of antibodies in the serum (i.e., seropositivity) due to prior CMV exposure ranges from 45% in developed countries to 100% in developing countries and is associated with several epidemiological factors, including age, gravidity, parity, place of birth, and socioeconomic status. [3][4][5][6] Because seroprevalence rates can reflect the size of the virus reservoir, maternal serostatus can have an impact on the incidence of congenital CMV infection. [7] CMV is the most prevalent congenital viral infection, affecting 0.64% of newborn infants. [8,9] However, this prevalence varies widely among study populations; for example, in Europe, a highly industrialized region with relatively low overall maternal CMV seroprevalence, the regional prevalence of congenital CMV infection ranges from as low as 0.1% to as high as 2%. [8] In developing countries with high rates of maternal CMV seropositivity, even higher rates (1-5.4%) of congenital CMV prevalence have been reported. [10,11] CMV can be transmitted vertically through intrauterine infection, peripartal transmission, cervicovaginal secretions during vaginal delivery, or breastfeeding.
Because cytotrophoblasts in the placenta are permissive to CMV replication, the most common route of vertical transmission is infection of the placenta and subsequent transmission to the fetus, where the virus can infect multiple tissues. [12] Clinical congenital disease can include features such as small size for gestational age, microcephaly, ventriculomegaly, chorioretinitis, hepatitis, splenomegaly, thrombocytopenia, and petechiae; newborns with this disease have a mortality rate of approximately 5%. [13] Moreover, approximately 50% of survivors develop severe long-term neurological deficits, including progressive hearing loss and/or cognitive impairment. [14] Most cases of symptomatic congenital disease are caused by primary maternal CMV infection. [15] The rate of fetal infection ranges from 33% to 75%, and the prevalence of disease can reach 50% among primary infections that occur within the first half of pregnancy; in contrast, non-primary infections are transmitted in utero in only approximately 1% of cases, and more than 90% of infected infants are healthy. [16,17] However, a growing body of evidence suggests that non-primary infections may also constitute a significant cause of severe congenital CMV disease and can contribute significantly to the global disease burden associated with congenital CMV exposure. [7][18][19][20] Nevertheless, the presence of maternal antibodies to CMV due to CMV exposure prior to pregnancy is the most important protective factor against congenital CMV disease. [4] Worldwide, seropositivity rates are lowest in Western Europe and the United States of America. [21] This low rate is associated with a higher risk of primary CMV infection during pregnancy.
Despite the extraordinary disease burden that primary CMV infection during pregnancy places on the newborn infant, routine serological screening of pregnant women for CMV is not currently recommended. [22] Nevertheless, several options are available to prevent fetal CMV infection during pregnancy. Confirming CMV seronegativity and educating women during pregnancy can help modify maternal behavior and can decrease the rate of seroconversion in pregnant women who are at risk of infection. [23][24][25] Hyperimmunoglobulin therapy in pregnant women with primary CMV infection is an interesting, albeit experimental, approach that may reduce the rate of congenital infections in selected cases, even though a recent randomized, placebo-controlled trial failed to confirm the initially promising results. [17,26,27] Identifying women who are at risk for seronegativity during pregnancy and understanding the resulting risk of subsequent primary CMV infection during pregnancy are crucial steps towards preventing congenital CMV infection. The aim of this study was to identify potential predictors, and to develop and evaluate a risk-predicting model for the maternal CMV serostatus in early pregnancy.

Participants
From December 2009 through April 2013, pregnant women who were receiving routine prenatal care at the Department of Obstetrics and Fetomaternal Medicine at the Medical University of Vienna were invited to participate in Biotest study 963, a randomized, open, controlled, prospective, multicenter and multinational study (study title: "Prevention of Congenital Cytomegalovirus Infection in Infants of Mothers with Primary Cytomegalovirus Infection during Pregnancy"). The inclusion criteria were gestational age ≤ 13 weeks + 6 days and maternal age 18-45 years. At the initial screening visit of the study, CMV-specific antibodies were measured, and the results were documented in the patient's medical file. All participating subjects provided written informed consent.
For the present study, we performed a retrospective chart analysis of all patients who were eligible for the Biotest study at our study site. Data regarding maternal CMV serostatus, demographics, maternal and paternal migration background, and educational level were extracted from the medical files. Since November 2010, this parental background information has been collected routinely during medical interviews conducted at the first prenatal visit at our department. Women who were seropositive for CMV IgM antibodies and women for whom more than one variable was missing were excluded from the analysis. This study was approved by the institutional review board of the Medical University of Vienna (Reference number 1704_2013).

Definition of terms
Seropositivity was defined as the presence of CMV-specific IgG antibodies in the maternal serum; the presence of this antibody serves as a marker for whether the woman has ever been infected with CMV. Seronegativity was defined as the absence of CMV-specific IgG antibodies. Seroprevalence was defined as the prevalence of CMV seropositivity within a defined population. Gravidity refers to the total number of previous pregnancies; parity refers to the number of viable previous pregnancies. Maternal and paternal educational status (ES) was classified as the completion of no education, primary education, or lower secondary education (ES 1); upper secondary education (ES 2); post-secondary non-tertiary education (ES 3); or tertiary education (ES 4). Migration background (MB) was used to define an individual who was born outside of Western Europe and is currently living in (i.e., immigrated to) Western Europe. The various regions of origin are illustrated in Fig 1. Because a generally accepted definition of "Western Europe" is not available, we divided Europe into Western Europe and Eastern Europe based on the former Iron Curtain, a political and physical boundary that divided Europe into two separate regions until 1991 and that included emigration restrictions.

Statistics
All analyses were performed using the statistical software package R, version 3.1.
Maternal age is reported as mean (± standard deviation); all other variables were categorical and are reported as absolute or relative frequencies. For comparisons of distributions between groups, Student's t-test was used for maternal age; the chi-square test was used for all other variables. P-values ≤ 0.05 were considered statistically significant. To describe the correlation between parameters, Spearman's rank correlation coefficient was calculated. To predict serostatus, a multiple logistic regression model was generated by stepwise forward-backward model selection, starting from a null model and using the Akaike information criterion as the selection criterion. [28] The scope of possible predictor variables included maternal age, gravidity, parity, use of assisted reproduction techniques (ART), maternal ES and MB, and paternal ES and MB. In addition, all pairwise interactions were permitted to be selected in the model. The model's sensitivity and specificity for various cut-off values were calculated using receiver operating characteristic (ROC) curves. To assess the predictive quality of the selected model in a new data set, five-fold cross-validation was performed. The data set was divided randomly into five subsets of approximately equal sample size. For cross-validation, the model coefficients were estimated from the five possible estimation data sets, each of which contained four of the five subsets. Each resulting model was then used to predict serostatus in the respective prediction data set, i.e., the subset that was not used for the estimation. For each prediction data set, an ROC curve was calculated, and the area under each ROC curve (AUC) was compared to the AUC of the original model.
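The cross-validation scheme described above can be illustrated with a minimal sketch. It is not the code used in the study (the analyses were performed in R); the function names `five_fold_splits` and `auc` are hypothetical, the random seed is arbitrary, and the model-fitting step is deliberately left out. The sketch only shows how a data set can be divided into five subsets of approximately equal size and how an AUC can be computed from predicted probabilities, here using the rank (Mann-Whitney) formulation.

```python
import random


def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (estimation_idx, prediction_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        prediction = folds[k]
        estimation = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield estimation, prediction


def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: 1 = seropositive, 0 = seronegative; scores: predicted probabilities.
    A tie between a positive and a negative score counts as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Example: partition a cohort of 882 women into five estimation/prediction pairs.
for estimation_idx, prediction_idx in five_fold_splits(882):
    print(len(estimation_idx), len(prediction_idx))
```

In a full workflow, a logistic regression model would be fitted on each estimation set and `auc` would be applied to its predictions on the matching prediction set; the pairwise AUC computation is quadratic in sample size, which is acceptable for a cohort of this size.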
This model allows us to predict maternal CMV serostatus using the set of selected variables. A score is calculated as the sum of the model intercept, maternal age multiplied by the regression coefficient for age, and the regression coefficients that match the woman's observed values of the categorical variables. This score is transformed to a predicted probability using the inverse logit link function. Thus, the following equation was used to calculate the probability of seropositivity (Prob):

$$\mathrm{Prob} = \frac{\exp(\mathrm{score})}{1 + \exp(\mathrm{score})}$$

To facilitate calculation of a predicted probability for seropositivity using the logistic regression model, a nomogram was developed. In the nomogram, the probability of CMV seropositivity can be determined by reading the points for each variable from the matching lower scale, summing the points, and identifying the predicted probability of seropositivity associated with the sum on the total points line.
Note that the nomogram sum score is 1.11 larger than the score directly derived from the model coefficients, as the intercept is accounted for implicitly in the nomogram.
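As an illustration of the score-to-probability step, the following minimal sketch computes a score from an intercept, an age term, and the coefficients of the matching categorical levels, and then applies the inverse logit. The function name `seropositivity_probability` and all coefficient values are hypothetical and chosen only for illustration; they are not the estimates reported in Table 3.

```python
import math


def seropositivity_probability(intercept, age_coef, age_years, category_coefs):
    """Return (score, probability) for one woman.

    category_coefs: coefficients of the categorical levels that match her
    observed values (reference levels contribute 0 and can be omitted).
    """
    score = intercept + age_coef * age_years + sum(category_coefs)
    probability = 1.0 / (1.0 + math.exp(-score))  # inverse logit
    return score, probability


# Hypothetical coefficients for illustration only; NOT the values of Table 3.
score, prob = seropositivity_probability(
    intercept=1.0,
    age_coef=-0.03,
    age_years=30,
    category_coefs=[0.4, 1.5],  # e.g. one parity level and one MB level
)
print(f"score = {score:.2f}, predicted probability = {prob:.2f}")
```

The form 1 / (1 + exp(-score)) is algebraically identical to exp(score) / (1 + exp(score)) and avoids overflow for large positive scores.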

Descriptive statistics
Of a total of 998 women who were screened for the Biotest study since November 2010, a complete data set (or a data set with only one missing variable) was available for 882 women; these 882 patients were included in our final analysis. The characteristics of these patients are summarized in Table 1. The participants were then divided into two groups based on their serostatus; 646 women were seropositive (73.2%), and 236 were seronegative (26.8%). Based on bivariate analyses, these two groups differed significantly with respect to maternal age, gravidity, parity, the use of assisted reproduction techniques, maternal and paternal migration background, and maternal and paternal education level (Table 1). In general, the rate of seropositivity increased with increasing gravidity. The correlation between gravidity and parity was high (r = 0.80). Women who were born in Western Europe had significantly lower seroprevalence than women who were born outside of Western Europe (p<0.001). Because no difference was observed with respect to seroprevalence between different regions outside of Western Europe, we collapsed the information regarding origin into a binary variable with values MB (for participants who migrated to Western Europe) or no MB (for participants who were born in Western Europe).
Similarly to maternal MB, paternal MB was also associated with seroprevalence (p<0.001). Seroprevalence was highest (96%) among women with both maternal and paternal MB. The various countries of origin and the number of included patients from each country are summarized in S1 Fig.
Low maternal and/or paternal ES was associated with higher seroprevalence (p<0.001); moreover, maternal ES and paternal ES were significantly correlated (r = 0.63). The use of ART was inversely correlated with seropositivity. We also found an inverse correlation between maternal age and seropositivity. This latter finding appears to be contradictory to our finding that increasing gravidity is associated with higher seroprevalence; we therefore compared women with MB (n = 449; 50.9%) with women without MB (n = 433; 49.1%). The results of these two subgroups are summarized in Table 2.
The women with MB had a seroprevalence rate of 93% compared to 53% in the women without MB. No significant difference was found between the two groups with respect to maternal age (p = 0.850, 95% CI for the difference of means: -0.80 to 0.66). However, both gravidity and parity were significantly higher among the women with MB compared to the women without MB. Moreover, both maternal and paternal ES were significantly lower in the patients with MB (p<0.001). Furthermore, the use of ART was less common among the patients with MB (p = 0.040).

Prediction model
To generate a prediction model, a multiple logistic regression analysis was performed in order to identify variables that were predictive of maternal CMV serostatus. Our procedure yielded a logistic regression model for serostatus that includes the following variables: maternal age, parity, maternal ES, maternal MB, paternal MB, and the interaction between maternal MB and paternal MB. To describe the association between MB and serostatus, we coded MB as a categorical variable with four levels corresponding to the four possible combinations of maternal and paternal MB. The results of the parameterization of the selected prediction model are summarized in Table 3.
The ROC curve for the model based on the complete data set is shown in Fig 2. The highest sum of sensitivity and specificity is obtained when a seropositive status is predicted for calculated probabilities above a cut-off of 0.74; at this cut-off, the sensitivity and specificity for predicting seropositivity are 0.69 and 0.86, respectively. The AUC is 0.83 (95% CI: 0.80-0.86).
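The cut-off selection described above, i.e., choosing the probability threshold that maximizes the sum of sensitivity and specificity, can be sketched as follows. This is an illustrative example under stated assumptions rather than the study's code: the function names are hypothetical, the toy labels and probabilities are invented, and a woman is classified as seropositive when her predicted probability lies strictly above the cut-off. The input must contain at least one seropositive and one seronegative case.

```python
def sensitivity_specificity(labels, probabilities, cutoff):
    """Classify as seropositive when probability > cutoff; return (sens, spec)."""
    pairs = list(zip(labels, probabilities))
    tp = sum(1 for y, p in pairs if y == 1 and p > cutoff)
    fn = sum(1 for y, p in pairs if y == 1 and p <= cutoff)
    tn = sum(1 for y, p in pairs if y == 0 and p <= cutoff)
    fp = sum(1 for y, p in pairs if y == 0 and p > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(labels, probabilities):
    """Return (cutoff, sens, spec) maximizing sensitivity + specificity.

    Candidate cut-offs are the observed probabilities; on ties, the lowest
    cut-off reaching the maximum is kept.
    """
    best = None
    for cutoff in sorted(set(probabilities)):
        sens, spec = sensitivity_specificity(labels, probabilities, cutoff)
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best


# Invented toy data: 1 = seropositive, 0 = seronegative.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 1]
toy_probabilities = [0.95, 0.9, 0.8, 0.6, 0.7, 0.4, 0.2, 0.3]
print(best_cutoff(toy_labels, toy_probabilities))
```

Maximizing sensitivity plus specificity weights false positives and false negatives equally; a different cut-off may be preferable if missing a seronegative woman is considered more costly than a false alarm.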
In addition, we performed five cross-validation rounds within the data set. The respective AUCs are summarized in Table 4. The ROC curves obtained (S2 Fig) and the AUC values suggest that the predictive capacity of the model is stable when used to test a new data set from a similar population.
The resulting nomogram is shown in Fig 3.

Discussion
Our aim was to identify potential predictors of maternal CMV serostatus in early pregnancy and to create and evaluate a model for predicting CMV serostatus. Knowledge about predictors of maternal CMV serostatus can help to assess a woman's individual CMV-associated risk in pregnancy, influence the counseling of pregnant women, and contribute to efforts to avoid congenital CMV infections. In this study we found that i) serostatus is significantly correlated with maternal age, gravidity, parity, and education level; ii) maternal MB is associated with higher CMV seroprevalence; and iii) paternal MB and ES are correlated with maternal CMV serostatus. Based on these findings, we generated a model for predicting maternal CMV serostatus; in addition, we generated a nomogram that may provide more structured information that can be used to counsel pregnant women regarding the risks associated with CMV seronegativity. The presence of CMV-specific IgG antibodies in the serum, which indicates a prior or current CMV infection, has been correlated previously with several epidemiological factors, including age, gravidity, parity, and socioeconomic status. [3,4,6,21] Most of these previous findings are consistent with our own findings. However, we found an inverse correlation between maternal age and seropositivity, which is in contrast to other published findings. [4] There are several possible explanations for this discrepancy. Although maternal age did not differ between women with MB and women without MB, gravidity and parity were higher in the women with MB. CMV can be transmitted via several routes, including adult-to-child, child-to-adult, and adult-to-adult. Most children in developing countries are infected with CMV by the age of three years. [29] Therefore, the risk of acquiring a CMV infection increases with the number of children living in a family and depends on the child's care situation. [30,31] In addition, both maternal and paternal ES were lower in the patients with MB.
Lower ES is associated with lower income, which can result in more crowded living conditions and possibly lower standards of hygiene. This further increases the chance of CMV transmission between mother and children. Lastly, women with MB more often had partners with MB than did women without MB. Reports of CMV in semen, saliva, and cervical secretions suggest that transmission can occur during sexual activity, and sexual activity can affect CMV seroprevalence among women of childbearing age. [32,33] Therefore, a woman's CMV serostatus can be influenced by her partner's behavior and/or characteristics.
In light of these findings, in our cohort the socioeconomic situation appears to supersede the effect of maternal age on CMV serostatus in early pregnancy. The majority of women born within Western Europe in recent decades have high socioeconomic status, and the average age at their first pregnancy is increasing. We therefore hypothesize that improved socioeconomic conditions in Western Europe have led to a decreased rate of CMV seropositivity, thereby increasing the risk of primary CMV infection during pregnancy.
The Centers for Disease Control and Prevention (CDC) in Atlanta, GA recommend: i) against routine serological screening for CMV, ii) that women consult their doctor regarding the risk of CMV infection during pregnancy, and iii) that pregnant women receive counseling regarding simple hygiene precautions to prevent CMV infection. [22] This approach advocated by the CDC can clearly cause a dilemma for both consulting healthcare professionals and pregnant women: on the one hand, the individual risk of primary CMV infection is difficult to assess in developed countries; on the other hand, the success of preventive measures depends on the woman's motivation to follow these hygiene recommendations. Given this dilemma, identifying women who are at risk for seronegativity is crucial. Therefore, we attempted to develop a risk-predicting model for maternal CMV serostatus in early pregnancy. Our logistic regression model includes maternal age, parity, ES, maternal and paternal MB, and the interaction between maternal and paternal MB. Because gravidity is correlated with parity, and because paternal ES is correlated with maternal ES, it is not surprising that these variables were not selected in the final model. With sensitivity and specificity values of 69% and 86%, respectively, our method provides a high probability of accurately predicting CMV serostatus in early pregnancy. Although easily accessible to healthcare providers, this information can have a strong influence on the counseling given to pregnant women in terms of CMV-associated risk. Our study has several limitations that bear mentioning. First, the women included in the study reflect a group of patients derived from a single tertiary care centre located in a European capital city. Nevertheless, the observed prevalence of CMV seropositivity among the Western European women in our study population is consistent with other published reports. [21,34]
In addition, the usefulness of the presented prediction model appears to be geographically restricted. Although our cross-validation results suggest that the predictive capacity of our prediction model and nomogram is stable, external validation with comparable populations is needed before our model can be extrapolated for use on a broader scale. A strength of our study is that we analyzed women who were included in the screening phase of a prospective, randomized trial, thus reducing the risk of selection bias that often occurs with retrospective chart reviews. Moreover, the wide range of evaluated predictors, including paternal variables, is an additional strength of our study.
To conclude, identifying women who are at risk for CMV seronegativity in early pregnancy is important in order to avoid congenital CMV infection. We ascertained predictors of maternal CMV serostatus in early pregnancy and developed a risk-predicting model based on baseline epidemiological characteristics. Our findings provide easily accessible information that can strongly influence the counseling given to pregnant women in terms of their CMV-associated risk.