Development of a tool for prediction of ovarian cancer in patients with adnexal masses: Value of plasma fibrinogen

Objective
To develop a tool for individualized risk estimation of presence of cancer in women with adnexal masses, and to assess the added value of plasma fibrinogen.

Study design
We performed a retrospective analysis of a prospectively maintained database of 906 patients with adnexal masses who underwent cystectomy or oophorectomy. Uni- and multivariate logistic regression analyses including pre-operative plasma fibrinogen levels and established predictors were performed. A nomogram was generated to predict the probability of ovarian cancer. Internal validation with split-sample analysis was performed. Decision curve analysis (DCA) was then used to evaluate the clinical net benefit of the prediction model.

Results
Ovarian cancer including borderline tumours was found in 241 (26.6%) patients. In multivariate analysis, elevated plasma fibrinogen, elevated CA-125, suspicion for malignancy on ultrasound, and postmenopausal status were associated with ovarian cancer and formed the basis for the nomogram. The overall predictive accuracy of the model, as measured by AUC, was 0.91 (95% CI 0.87–0.94). DCA revealed a net benefit for using this model for predicting ovarian cancer presence compared to a strategy of treat all or treat none.

Conclusion
We confirmed the value of plasma fibrinogen as a strong predictor for ovarian cancer in a large cohort of patients with adnexal masses. We developed a highly accurate multivariable model to help in the clinical decision-making regarding the presence of ovarian cancer. This model provided net benefit for a wide range of threshold probabilities. External validation is needed before a recommendation for its use in routine practice can be given.


Introduction
In western countries, epithelial ovarian cancer (EOC) is the second most common cancer of the female reproductive system and accounts for as many deaths as all other gynaecological cancers combined [1]. Early detection and treatment in high-volume centres by gynaecologic oncology surgeons are known to improve survival [2,3]. While adnexal masses are common sonographic findings in women of all ages, approximately 75% of tumours are found to be benign [4,5]. Identifying those with a high risk of having ovarian cancer is a daily clinical challenge. Especially in young women, in whom ovarian cancer is rare, the risk of malignancy has to be balanced against the wish to preserve fertility [6]. It is therefore of major importance to accurately differentiate between benign and malignant adnexal masses in order to avoid unnecessary surgical procedures and to deliver optimal care to those who are likely to harbour an ovarian cancer.
Prediction models have been generated to help with this risk stratification [7,8]. However, none has received widespread acceptance and use in routine clinical practice. Ultrasound features suggesting malignancy, such as ascites, increased vascularization, solid components, tumour size, papillary projections, and irregular cyst walls, are the basis for ultrasound-based algorithms (e.g. the International Ovarian Tumour Analysis (IOTA) models) [4,9,10]. Biomarkers, such as the tumour marker CA-125, have been validated and incorporated into such algorithms [6,11,12,13]. Unfortunately, serum CA-125 levels are frequently normal in borderline ovarian tumours as well as in early stage invasive ovarian cancer, and can be falsely elevated in benign conditions, particularly in premenopausal women [14][15][16]. To complement CA-125, models combining different biomarkers have been developed. The Risk of Ovarian Malignancy Algorithm (ROMA), incorporating measurements of CA-125 and the human epididymal secretory protein 4 (HE-4), performed particularly well in premenopausal women in a prospective, multicentre trial, leading to its approval by the U.S. Food and Drug Administration (FDA) [17]. However, other evaluations of the ROMA have reported conflicting results, casting doubt on the benefit of adding HE-4 to CA-125 [18]. Another FDA-approved biomarker-based diagnostic test utilizes a combination of five biomarkers and is available under the trade name OVA1 (Vermillion, Inc). Again, data on the test's ability to outperform CA-125 measurements alone are conflicting [13,19].
Fibrinogen is a key protein in the coagulation pathway and one of the major acute phase proteins, strongly linked to inflammation and stress. It has been found to have prognostic value for various malignant tumours including ovarian cancer [20][21][22][23][24][25][26][27]. We have previously reported that pre-operative plasma fibrinogen is independently associated with ovarian cancer presence in patients with adnexal masses [25]. That study was limited by its statistical design and by the failure to adjust for ultrasound findings. For a biomarker to change clinical decision-making, it needs to improve the predictive accuracy for an event beyond that achieved by a multivariable analysis and add a net benefit in decision analysis across a range of probabilities considered relevant [28][29][30].
The aim of the present study was to validate these findings in a larger cohort, to find an optimal cut-off value for fibrinogen, and to develop a nomogram as a risk-prediction model for malignant ovarian tumours in patients with adnexal masses of unknown nature (benign or malignant).

Materials and methods

Patients and clinical management
Overall, 3,234 patients underwent surgery for suspected adnexal masses at a single tertiary care centre between 2000 and 2012. Of these, pre-operative plasma fibrinogen levels were available in 1,754 patients. After exclusion of patients with missing pre-operative CA-125 values or missing pre-operative ultrasound details, 906 patients remained eligible for inclusion in the present study. Patients with inflammatory processes, pregnancy-related adnexal masses, and those with non-epithelial pathology on final histological result had been excluded beforehand. Clinical and pathological data were retrospectively extracted from a prospectively maintained database. The study was approved by the institutional review board (Project # 1062/2015). Patient data were de-identified and handled in accordance with ethical standards of good scientific practice.
Patients were treated, as previously described [25], by laparoscopic or open ovarian cystectomy or salpingo-oophorectomy. In patients with EOC or borderline tumour of the ovary (BOT), surgery was extended to ensure adequate staging and, when necessary, complete resection of all visible tumour. If primary cytoreductive surgery was not feasible, patients with EOC were treated with neoadjuvant chemotherapy and interval debulking surgery.
Pre-operative assessment included blood tests, a physical examination, a transvaginal ultrasound, and, in case of suspected malignancy, further imaging such as computed tomography (CT) and/or magnetic resonance imaging (MRI). Postmenopausal status was defined as > 1 year of amenorrhoea or, in the case of previous hysterectomy, > 50 years of age. If the individual examiner considered criteria suspicious for malignancy to be present on ultrasound examination, ultrasound was classified as "presence of malignancy criteria (M-criteria) on ultrasound". For CA-125 we used the established threshold of > 35.0 U/mL [8].

Fibrinogen measurements
Plasma fibrinogen levels were determined by the Clauss method [31] using clotting reagents. According to the manufacturer, the intra-assay variability was 3.5%. Plasma fibrinogen levels between 180 and 400 mg/dL were defined as normal by our laboratory. Yet, we aimed to determine an optimal cut-off for prediction of ovarian cancer within the cohort of the present study as described below.

Statistics
Categorical variables are presented as numbers and proportions, continuous variables as medians (interquartile range [IQR]). Group differences in categorical and continuous variables were analysed using chi-square, Kruskal-Wallis and Mann-Whitney U tests. The optimal cut-off value for plasma fibrinogen was estimated by receiver operating characteristic (ROC) curve analysis. Potential predictors of ovarian cancer (OC, including both BOT and EOC) were analysed by uni- and multivariate logistic regression, including menopausal status, the presence of M-criteria on ultrasound, serum CA-125 levels > 35 U/mL, and elevated plasma fibrinogen according to the determined optimal cut-off value. Estimates are given as odds ratios (OR) with 95% confidence intervals (CI). The multivariate model combined all significant variables of the univariate analysis.
The regression coefficients of all significant variables in multivariate analysis were used to generate a nomogram for calculating the patient-specific probabilities of OC. A split-sample internal validation procedure was performed to establish that the model performed adequately in patients other than those whose data generated the model. To this end, the entire cohort was randomly divided into two sub-cohorts of equal size and with a similar distribution of variables, forming the test and the validation sample.
Values for each of the model's covariates were mapped to points on a scale ranging from 0-10, with total points obtained for each model's covariate mapped to the probability of OC associated with that combination of covariate values [32]. The predictive accuracy of the model was assessed by its discrimination and calibration. Discrimination was measured by the area under the receiver operating characteristic curve (AUC). The AUC measures the model's ability to discriminate between patients with or without OC. An AUC of 0.5 indicates that the model provides no predictive discrimination, in other words, the model's value would be like tossing a coin, while a value of 1.0 indicates perfect discrimination between patients with or without OC [33]. Calibration, which compares predicted with actual pathology result, was evaluated by calibration curves for both the test and the validation sample [32].
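The interpretation of the AUC given above (0.5 for a coin toss, 1.0 for perfect discrimination) can be illustrated with a minimal sketch. It computes the AUC as the probability that a randomly chosen patient with OC receives a higher predicted risk than a randomly chosen patient without OC, counting ties as one half. The function name and the toy risk values are illustrative assumptions, not data or code from this study.

```python
def auc_rank(risks_with_oc, risks_without_oc):
    """AUC as the fraction of (case, control) pairs in which the case
    has the higher predicted risk; tied pairs count as 0.5."""
    wins = 0.0
    for case_risk in risks_with_oc:
        for control_risk in risks_without_oc:
            if case_risk > control_risk:
                wins += 1.0
            elif case_risk == control_risk:
                wins += 0.5
    return wins / (len(risks_with_oc) * len(risks_without_oc))


# Made-up predicted risks, for illustration only.
perfect = auc_rank([0.9, 0.8], [0.2, 0.1])    # every case outranks every control
coin_toss = auc_rank([0.5, 0.5], [0.5, 0.5])  # model cannot separate the groups
```

In this toy example `perfect` evaluates to 1.0 and `coin_toss` to 0.5, matching the two reference points described in the text.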
A decision curve analysis (DCA) was applied to explore the clinical value of our newly derived model by quantifying its net benefit over a realistic range of threshold probabilities. In this study, the threshold probability represented the predicted risk of OC at or above which a patient would be selected for surgery [28,29].
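As a minimal sketch of the quantity behind a decision curve, the net benefit at a given threshold probability is commonly defined as the proportion of true positives minus the proportion of false positives weighted by the odds of the threshold. The code below assumes this standard definition; the function names and the counts in the example are hypothetical and are not taken from the study's STATA analysis.

```python
def net_benefit(true_pos, false_pos, n_patients, threshold):
    """Net benefit of treating patients whose predicted risk is at or
    above `threshold` (standard decision-curve definition)."""
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return true_pos / n_patients - (false_pos / n_patients) * weight


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of the 'treat all' strategy; 'treat none' is 0 by definition."""
    weight = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * weight


# Hypothetical counts: 20 true positives and 10 false positives
# among 100 patients, evaluated at a 20% threshold probability.
example_nb = net_benefit(20, 10, 100, 0.20)
```

A decision curve is obtained by evaluating these functions over a range of thresholds and plotting the model against the "treat all" and "treat none" strategies.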
All statistical tests were two sided, with significance set at a P-value <0.05. Analyses were conducted with SPSS 24.0 (SPSS 24.0.0, SPSS Inc., Chicago, IL, USA) and STATA 14.1 (Stata Corp. LP, College Station, TX, USA).

Fibrinogen as a predictive marker for ovarian cancer
Sensitivity and specificity for the detection of OC using our laboratory's upper limit of normal for plasma fibrinogen of 400 mg/dl were 55.2% and 78.5%, respectively. However, as the normal range does not necessarily reflect the association between fibrinogen and malignant growth, we aimed to determine an optimal cut-off value for fibrinogen for the detection of ovarian malignancy in patients with adnexal masses. ROC curve analysis revealed an AUC of 0.74 (95% confidence interval [CI] 0.70-0.78) for fibrinogen to detect OC. In comparison, the ROC curve analyses for CA-125, the presence of M-criteria on ultrasound, and menopausal status revealed AUCs of 0.88 (95% CI 0.85-0.90), 0.73 (95% CI 0.69-0.77), and 0.69 (95% CI 0.65-0.73) for detecting OC, respectively. As missing a malignant ovarian tumour could have serious, potentially fatal consequences, we aimed to keep the false-negative rate low, thereby maximizing sensitivity. In accordance with utility-based decision theory [34], and based on ROC curve analysis, we chose a plasma fibrinogen level of 342 mg/dl as the optimal cut-off value, reaching a sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for detecting OC of 78.0%, 58.3%, 40.4%, and 87.9%, respectively. This cut-off value was used for further analyses, as it reflected the best balance between NPV and PPV for OC. In comparison, sensitivity, specificity, PPV, and NPV for serum CA-125 > 35 kU/L were 83.8%, 73.3%, 53.3%, and 92.6%, and for the presence of M-criteria on ultrasound 73.0%, 73.1%, 49.6%, and 88.2%, respectively.
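The predictive values quoted above are linked to sensitivity, specificity, and prevalence through Bayes' theorem. The following sketch assumes the cohort prevalence of 241 OC cases among 906 patients reported in the abstract; the function and variable names are illustrative and not part of the original analysis.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Fibrinogen cut-off of 342 mg/dl: sensitivity 78.0%, specificity 58.3%.
prevalence = 241 / 906
ppv, npv = predictive_values(0.780, 0.583, prevalence)
```

With these inputs the sketch gives a PPV of about 0.40 and an NPV of about 0.88, in line with the reported 40.4% and 87.9%. Small differences are expected because the inputs are rounded percentages.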
In addition, we wanted to assess the added effect of elevated fibrinogen (> 342 mg/dl) to elevated CA-125 (> 35 kU/L) to detect ovarian malignancy in our cohort. A Spearman's correlation was run to assess the relationship between serum CA-125 and fibrinogen levels. There was only a weak correlation between the two biomarkers, which was statistically significant (Spearman's rho, r_s = 0.299; p < 0.001). This indicates that only about 8.9% of the variance of one of the above biomarkers was explained by the other. The correlation between serum CA-125 and fibrinogen levels is shown in the scatter plot of S1 Fig. Next, we compared the distributions of ovarian malignancy and benign ovarian tumours between the groups with both markers low, with only one of either fibrinogen or CA-125 elevated, or with both biomarkers elevated. In 22 (9.1%) of 241 patients with ovarian malignancy, only fibrinogen was elevated while CA-125 was within the normal range. Distributions of OC within the above-named groups and the respective NPV, PPV, and number needed to treat (NNT) to detect one ovarian malignancy are shown in Table 2.
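The 8.9% figure is the squared correlation coefficient. As a quick check (strictly speaking, for Spearman's coefficient this describes the shared variance of the ranks rather than of the raw values):

```latex
\[
r_s^2 = 0.299^2 \approx 0.0894 \approx 8.9\%
\]
```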

Risk-predicting model and decision curve analysis
In both uni- and multivariate logistic regression analyses, plasma fibrinogen > 342 mg/dl, CA-125 > 35 kU/L, postmenopausal status (or age > 50 years), and the presence of M-criteria on ultrasound were independently associated with a higher risk for the presence of OC. These four variables formed the basis for the nomogram. Results of uni- and multivariate logistic regression analyses are shown in Table 3. The differential benefit in predictive accuracy of adding fibrinogen to a base model that includes CA-125, menopausal status, and evaluation of M-criteria on ultrasound was approximately one percent. This would translate into 1 in 100 patients receiving a more accurate prediction of ovarian cancer, which helps to tailor their treatment. The different ROC curves for fibrinogen, CA-125, the model with, and the model without the addition of fibrinogen are shown in Fig 1. Fig 2 shows the nomogram for predicting OC in patients with adnexal masses based on menopausal status, presence of M-criteria on ultrasound, elevated CA-125 (> 35 U/ml), and elevated plasma fibrinogen (> 342 mg/dl). Test and validation samples were well matched on all the characteristics investigated (S1 Table). The overall predictive accuracy of the model, as measured by AUC, was 0.91 (95% CI 0.87–0.94).
The nomogram is used by locating the score assigned to each predictive factor on the topmost "score" scale. Then a sum is calculated across all patient characteristics to obtain the "total score", which is eventually converted into the desired probability of OC. Of note, a higher total score corresponds to a higher probability of OC. For instance, consider a premenopausal patient with an adnexal mass, presence of M-criteria on ultrasound, a CA-125 of 60 kU/L, and a plasma fibrinogen level of 300 mg/dl. By locating each of the characteristics on the corresponding scale and drawing a vertical straight line to the "score" scale, we obtain scores of around 0, 7, 10, and 0, respectively, adding up to a total score of 17. Locating this value on the "total score" scale and drawing a line to the "probability of malignancy" scale, a probability of 40% for OC can be estimated. On the other hand, if this same patient had a plasma fibrinogen level of 400 mg/dl, this would add around 5 points to the score of 17, leading to a total score of around 23. The estimated probability for OC would thereby change to 70%.
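The mechanics of such a nomogram can be sketched as follows: each predictor's points are proportional to its logistic regression coefficient, scaled so that the strongest predictor receives 10 points, and the probability of OC follows from the logistic function of the summed coefficients. The coefficients and intercept below are HYPOTHETICAL placeholders chosen for readability; they are not the values in Table 3, so the resulting probabilities will not match the nomogram readings given in the example above.

```python
import math

# HYPOTHETICAL log-odds coefficients, for illustration only (not from Table 3).
COEFS = {
    "ca125_elevated": 2.0,
    "m_criteria": 1.4,
    "fibrinogen_elevated": 1.0,
    "postmenopausal": 0.8,
}
INTERCEPT = -3.0  # hypothetical as well


def nomogram_points(coefs):
    """Scale coefficients so that the largest one corresponds to 10 points."""
    largest = max(coefs.values())
    return {name: 10.0 * beta / largest for name, beta in coefs.items()}


def probability_of_oc(patient, coefs=COEFS, intercept=INTERCEPT):
    """Logistic model: patient maps predictor name -> True/False."""
    linear = intercept + sum(coefs[name] for name, present in patient.items() if present)
    return 1.0 / (1.0 + math.exp(-linear))


# Premenopausal patient with M-criteria and elevated CA-125, normal fibrinogen.
patient = {
    "ca125_elevated": True,
    "m_criteria": True,
    "fibrinogen_elevated": False,
    "postmenopausal": False,
}
points = nomogram_points(COEFS)
total_points = sum(points[name] for name, present in patient.items() if present)
p_normal_fibrinogen = probability_of_oc(patient)
p_elevated_fibrinogen = probability_of_oc({**patient, "fibrinogen_elevated": True})
```

The point of the sketch is the structure, not the numbers: adding an elevated fibrinogen raises both the total score and the predicted probability, which is what the paper nomogram does graphically.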
Fig 2. Nomogram to predict ovarian cancer in patients with adnexal masses.
To use the nomogram, locate the patient's variable on the corresponding axis, draw a line to the "score" axis, sum the scores, and draw a line from the "total score" axis to the "probability of malignancy" axis.

In the DCA (Fig 4), our model provided net benefit in predicting OC throughout nearly the entire range of threshold probabilities compared to treating all patients with surgery, or alternatively, treating no one. Only below a threshold probability of 5% for OC was there no difference between applying our model and treating all patients. On the other hand, above a threshold probability of 90% for OC, deciding for treatment based on our model had the same net benefit as treating no one. Net benefits and the percentages of interventions avoided for each respective threshold are listed in Table 4. For example, if a clinician decided for surgery in patients with adnexal masses starting at a threshold probability for OC of 20%, applying the present model would yield a net reduction of unnecessary surgeries of about 42.5 per 100 patients compared to treating all patients. In other words, at this probability threshold, deciding for surgery on the basis of our model is equivalent to a strategy that reduces the rate of surgery by 42.5% in patients without OC, without missing any OC. Moreover, if we perform surgery based on our prediction model, compared to treating none, the net consequence is equivalent to removing true-positive OC in 19 patients per 100 without treating any unaffected patients. As shown in Table 4, the net benefit of our model increases with increasing threshold probabilities.
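The two figures quoted for the 20% threshold (about 19 true positives per 100 versus treating none, and about 42.5 avoided surgeries per 100 versus treating all) can be related to each other with the usual decision-curve conversion. The sketch below is a plausibility check under stated assumptions: it takes the model's net benefit as 0.19 (the rounded value from the text) and assumes the overall cohort prevalence of 241/906 applies; the function names are illustrative.

```python
def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of operating on every patient."""
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def interventions_avoided_per_100(nb_model, nb_treat_all, threshold):
    """Net reduction in interventions per 100 patients, relative to
    treating all, without missing any additional cancers."""
    return 100.0 * (nb_model - nb_treat_all) / (threshold / (1.0 - threshold))


threshold = 0.20
prevalence = 241 / 906  # assumption: whole-cohort prevalence
nb_all = net_benefit_treat_all(prevalence, threshold)
avoided = interventions_avoided_per_100(0.19, nb_all, threshold)
```

Under these assumptions `avoided` comes out at roughly 43 per 100 patients, which is consistent with the reported 42.5 once the rounding of the net benefit to 19 per 100 is taken into account.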

Discussion
Adnexal masses are commonly found in women of all ages, and the decision whether or not to explore them surgically often poses a clinical challenge. Surgical intervention carries inherent risks as well as the possibility of removing an ovary. Accurate prediction of the risk of harbouring an OC could help avoid overtreatment and could facilitate timely referral for surgical intervention in those at high risk for ovarian malignancy.
We confirmed the value of pre-operative plasma fibrinogen as a strong predictor for OC in a large cohort of patients with adnexal masses. Pre-operative fibrinogen was associated with the presence of OC in uni- and multivariate analyses that adjusted for the effects of menopausal status, ultrasound-based suspicion for malignancy, and elevated plasma CA-125. Based on these factors, we built a highly accurate nomogram (AUC 0.91) to help guide clinicians to tailor treatment in women who present with an adnexal mass. To assess whether this tool is worth using, we performed a DCA. The curve generated by DCA revealed that over-treatment could be reduced by the use of the proposed nomogram, compared to treating all or no patients, for a wide range of threshold probabilities.
Apart from being one of the key proteins in the formation of blood clots and in inflammatory processes, plasma fibrinogen has been found to be elevated in patients with a poor prognosis in various malignant tumours including OC [20][21][22][23][24][25][26][27]. Moreover, our study group has previously investigated the predictive role of plasma fibrinogen in distinguishing between benign and malignant ovarian tumours [25]. The results of the present study confirm, in a large cohort of patients, that elevated plasma fibrinogen is independently associated with an increased risk for OC in patients with suspected adnexal masses. Indeed, as sole predictive marker for OC, plasma fibrinogen proved more accurate than the presence of criteria suspicious for malignancy on ultrasound (M-criteria) or a postmenopausal status, while less accurate than plasma CA-125. The ROC curves for fibrinogen to detect OC were similar in both the present and the previously published study (AUC 0.74 vs. AUC 0.78, respectively). The slight variation between the AUC values of the current and the previous study is probably due to the now larger patient sample. In addition, we established a new cut-off value for plasma fibrinogen by ROC curve analysis with the aim of maximizing sensitivity while keeping it balanced with an acceptable specificity.
In contrast to our previous logistic regression model [25], where we included patients' age, serum CA-125 (< vs. ≥ 35 kU/L) and plasma fibrinogen (with the laboratory's normal range as cut-off), we used our newly determined cut-off value for fibrinogen, replaced patients' age by menopausal status, and added the presence of M-criteria on ultrasound to the present model. Elevated plasma fibrinogen levels remained independently associated with an increased risk for OC in patients with adnexal masses even after adjusting for the effects of these established risk factors.
Various scoring systems have been investigated to pre-operatively estimate the risk for malignancy in patients with adnexal masses, two of which have been approved by the FDA. The ROMA incorporates measurements of CA-125 and HE4 along with the menopausal status of the patient. The score was originally evaluated in a prospective, multicentre, blinded clinical trial in 471 patients with adnexal masses [17]. The results of this study demonstrated a highly favourable sensitivity and specificity to detect OC of 93.8% and 74.9%, respectively. Since then, several studies have reported partly divergent results. A prospective study in 389 patients with adnexal masses, for example, reported no benefit of ROMA in comparison to the use of CA-125 alone for the detection of ovarian malignancy (ROC-AUC ROMA = 0.898 vs. ROC-AUC CA-125 = 0.877) [18]. The variance between the studies has been ascribed to differences in the composition of the patient cohorts [35]. The second test approved by the FDA, the OVA1™ Ovarian Triage Test, combines a panel of five biomarkers for ovarian cancer (CA-125, transthyretin, apolipoprotein A1, β2-microglobulin, and transferrin) identified through serum proteomics using SELDI-TOF-MS. This multivariate index assay was demonstrated to perform better than CA-125 in detecting OC in a clinical trial in 524 patients diagnosed with an adnexal mass, 161 of whom were found to have ovarian malignancy. The OVA1 test thereby provided a sensitivity and specificity of 93% and 43%, respectively, with a PPV of 42% and an NPV of 93% [13]. However, in another study evaluating sera from 1,069 patients included in the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial, the OVA1 test was not found to improve the detection of ovarian malignancy compared to CA-125 measurements alone [19]. Compared to the results of the ROMA and OVA1 test, the accuracy of our model was relatively similar. Obviously, a comparison of the results of our study with those of prospective clinical trials lacks validity. Yet, a possible advantage of our model is the measurement of plasma fibrinogen, which is readily available and cheap to perform, while costs for the FDA-approved tests are estimated at USD 60-130 for the ROMA and USD 600-650 for the OVA1 test [35].
The authors of a recent meta-analysis, evaluating numerous scoring systems for adnexal masses, summarized their findings by strongly recommending the incorporation of ultrasound-based prediction models into pre-operative characterization of adnexal pathology [7]. The IOTA study group has done substantial work by developing and validating diagnostic models for adnexal masses based on standardised ultrasound examination protocols and definitions [9,10,36,37,38,39]. A meta-analysis comparing various ultrasound-based prediction models for OC suggested that the IOTA group's models Logistic Regression (LR) 2 and Simple Rules (SR) have the strongest test performance [40]. While transvaginal ultrasound is the key tool used in daily clinical practice for evaluating adnexal masses, a general criticism of ultrasound as a diagnostic test for distinguishing between benign and malignant ovarian tumours is that it is subjective and that its performance depends strongly on the experience and skills of the respective examiner. Due to lack of training and/or shortage of time, risk assessment in daily clinical practice is often based only on a subjective evaluation of ultrasound and, eventually, on pre-operative CA-125 measurement.
A nomogram is a prediction tool that incorporates various risk factors in an attempt to quantify the individualized probability of an outcome using a continuous risk scale. The graphic depiction of the probability of a particular outcome on a continuous scale, usually 0-100%, provides a user-friendly interface that does not require computer software for interpretation [30]. The model evaluated within the present study revealed a strong test performance with an AUC of 0.91. The simplicity of its design and the small number of variables make it easy to use in daily clinical practice. As CA-125 was the strongest predictive marker in our cohort, we wanted to assess the added value of fibrinogen. By performing a Spearman correlation we could demonstrate that the two biomarkers were only weakly correlated. This emphasizes a potential benefit of combining the two biomarkers, as they do not seem to be mere surrogates of each other. Furthermore, we compared distributions of OC between groups of patients defined by elevated levels of fibrinogen alone, CA-125 alone, neither, or both. We found that approximately 9% of patients with OC had elevated fibrinogen levels without elevation of CA-125. Based on serological biomarkers alone, OC would possibly have been missed in these patients. Consequently, the PPV and NPV for detection of OC based on CA-125 and fibrinogen only were highest if both biomarkers were elevated. To assess the added value of fibrinogen to the model as a whole (including CA-125, fibrinogen, M-criteria on ultrasound, and menopausal status), we compared the ROC curves of the model with and without the addition of fibrinogen. The addition of fibrinogen to the model led to an increase in accuracy to detect OC of approximately one percent. This might seem to be only a minor improvement. Yet, fibrinogen is a cheap serological marker that is routinely measured in most patients prior to a surgical intervention. Therefore, improving the accuracy of prediction of ovarian malignancy in one out of a hundred patients who present with adnexal masses justifies the addition of fibrinogen to the model.
While the AUC indicates that our nomogram provides strong predictive accuracy, it does not incorporate information on consequences and, therefore, cannot tell us whether the model is worth using at all. To examine the potential clinical impact of our predictive model, we performed DCA, a statistical method to estimate the clinical consequences of using predictive models [28,29]. The curve estimated by DCA for our nomogram demonstrated that the use of our model to predict OC in patients with adnexal masses, and consequently to help guide clinicians and patients in deciding whether or not to perform surgery, provided a net benefit relative to the two strategies of treating all or, alternatively, no one. The net benefit was present for threshold probabilities of OC between 5% and 90% and increased in direct proportion to the threshold probability. The key advantage of DCA is the option of varying the threshold probability over an appropriate range. This is of importance, as patients may reasonably disagree about the appropriate threshold for deciding for surgery or even oophorectomy for an adnexal mass, depending on their age, their wish to preserve fertility and their general health status.
Strengths of the present study include the uniform single-institution approach to care and the relatively large sample size, which enabled us to implement a split-sample approach for validation of statistical results. However, there are some limitations that deserve to be mentioned. Within this study, data from a prospectively maintained database were retrospectively analysed. Therefore, the study has the shortcomings characteristic of a retrospective design, such as patient selection and incomplete data acquisition. Moreover, the model was based only on patients who were selected for surgery. Therefore, we cannot be certain that the test performance of our model would remain unchanged if applied in a cohort of patients of whom some would opt for expectant management. All patients were referred to and treated at a tertiary care centre specialized in gynaecologic oncology. Hence, the patient cohort of the present study might not be representative of the general population of women diagnosed with an adnexal mass at a local gynaecologist's clinic. In addition, due to the retrospective design and the relatively long study period, ultrasound criteria used to determine suspicion for malignancy within this study were not standardized. However, in all patients, ultrasound was done by highly experienced examiners at our institution, a tertiary referral centre for suspicious adnexal masses, and can therefore be regarded as sound.
In conclusion, the current clinical management of patients with adnexal masses detected by ultrasound results in a high rate of unnecessary surgical interventions [41]. Therefore, more accurate tools are needed to pre-operatively estimate the risk of OC in these patients. The nomogram described in the present study may guide the decision-making process towards the most suitable surgical treatment options or expectant management strategies. DCA of the nomogram shows clinical benefit both for circumstances asking for a more hesitant attitude and for those allowing a more liberal approach towards surgical treatment of an adnexal mass. However, in order to prove generalizability, the current model requires external validation in a multi-institutional cohort. Further studies evaluating such predictive tools are needed, especially to elucidate the benefit of combining our model with standardised ultrasound-based scores.