Evaluating diagnostic and early detection biomarkers requires comparing serum protein concentrations among biosamples ascertained from subjects with and without cancer. Efforts are generally made to standardize blood processing and storage conditions for cases and controls, but blood sample collection conditions cannot be completely controlled. For example, blood samples from cases are often obtained from persons aware of their diagnoses, and collected after fasting or in surgery, whereas blood samples from some controls may be obtained in different conditions, such as a clinic visit. By measuring the effects of differences in collection conditions on three different markers, we investigated the potential of these effects to bias validation studies.
Methodology and Principal Findings
We analyzed serum concentrations of three previously studied putative ovarian cancer serum biomarkers–CA 125, Prolactin and MIF–in healthy women, women with ovarian cancer undergoing gynecologic surgery, women undergoing surgery for benign ovary pathology, and women undergoing surgery with pathologically normal ovaries. For women undergoing surgery, a blood sample was collected either in the clinic 1 to 39 days prior to surgery, or on the day of surgery after anesthesia was administered but prior to the surgical procedure, or both. We found that one marker, prolactin, was dramatically affected by collection conditions, while CA 125 and MIF were unaffected. Prolactin levels were not different between case and control groups after accounting for the conditions of sample collection, suggesting that sample ascertainment could explain some or all of the previously reported results about its potential as a biomarker for ovarian cancer.
Citation: Thorpe JD, Duan X, Forrest R, Lowe K, Brown L, Segal E, et al. (2007) Effects of Blood Collection Conditions on Ovarian Cancer Serum Markers. PLoS ONE 2(12): e1281. https://doi.org/10.1371/journal.pone.0001281
Academic Editor: Alice Rumbold, Menzies School of Health Research, Australia
Received: October 17, 2007; Accepted: November 9, 2007; Published: December 5, 2007
Copyright: © 2007 Thorpe et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: We are grateful for generous funding from the Pacific Ovarian Cancer Research Consortium (POCRC)/SPORE in Ovarian Cancer (P50 CA83636, N.U.), the Department of Defense/CDMRP (DAMD17-02-1-0691, N.U.), and the Canary Foundation.
Competing interests: Elliot Segal is the President and CEO of Onco Detectors International LLC, which manufactures the MIF ELISA assay used in this study. No other competing interests exist.
We hypothesize that even with identical sample processing and storage protocols the environment and conditions of sample collection can affect the levels of biomarkers, and that these potential biases should be anticipated in biomarker validation study design. Specifically, the environment surrounding diagnosis and collection of specimens from cases, such as surgical preparation, may affect blood chemistry in a way that introduces systematic changes that may be mistakenly attributed to the disease state. We demonstrate these effects by evaluating conditions of blood collection in one established and two novel ovarian cancer serum markers: CA 125, Prolactin, and Macrophage Migration Inhibitory Factor (MIF). We show that CA 125 and MIF behave as previously reported but that prolactin's performance is strongly affected by biases in sample ascertainment.
Cancer early detection biomarker validation studies are designed to determine which proteins can distinguish between healthy people and those with cancer. In contrast, a diagnostic marker is intended to distinguish between people with cancer and those with benign conditions. To have an impact on cancer mortality, a marker must show abnormal levels in the blood of cases compared to their appropriate controls, and for early detection purposes its levels must rise early enough in the disease process to identify the disease at an early and more treatable state. Evaluating a protein in pre-clinical specimens collected well before suspicion or diagnosis of cancer would be ideal for early detection studies, whereas samples obtained at clinical presentation of disease are most relevant for diagnostic markers. However, because pre-clinical specimens are seldom available, especially for rare diseases, first-phase early detection validation studies often seek to determine whether a marker can distinguish persons with symptomatic disease from healthy controls prior to further investment.
The primary intent of our biomarker validation study is to ascertain to what extent the classification performance of a biomarker can be attributed to disease-associated response rather than to ascertainment biases in sample collection. It is common to construct case and control groups that are matched on sample collection protocols, storage duration, subject age, and other epidemiological information, in order to reduce potential biases related to these factors. Less emphasis has been placed on using multiple sources of control or case groups in order to detect potential biases, or on using procedures that may adjust for biases such as conditions of sample collection. In this manuscript we demonstrate the potential value of conducting biomarker validation studies using multiple sources of well-annotated case and control groups. We demonstrate that prolactin is highly sensitive to the conditions of collection: after adjusting for the conditions of collection, the marker is no longer considered a viable candidate. CA 125 and MIF are shown not to be highly susceptible to these conditions.
We selected three markers–CA 125, Prolactin and MIF–to evaluate in a highly annotated set of case and control specimens.
CA 125 is a mucin-like glycoprotein that has been shown to be elevated in most women with ovarian cancer (OC) compared to a healthy population. CA 125 has also been evaluated in preclinical serum specimens, and each study suggests that CA 125 is a predictive marker that becomes increasingly powerful with proximity to diagnosis. However, CA 125 is also elevated in several benign conditions and may also be a marker of inflammation. Due to insufficient sensitivity and specificity, CA 125 is not used clinically as a stand-alone screening test. Falling CA 125 levels after treatment are used to confirm response to specific treatments, and rising CA 125 levels signal recurrence. CA 125 is a ligand of mesothelin, which may play a role in the metastasis of OC to the peritoneum.
MIF is a proinflammatory cytokine that has been identified as a candidate early detection marker for OC, although analysis of its performance as a biomarker for early stage ovarian cancer suggested that it does not exhibit higher sensitivity or specificity than CA 125. Inhibition of the anti-inflammatory properties of glucocorticoids is an important effect of MIF. MIF may also mediate some of the stimulatory effects of inflammation on cancer progression. Evidence of MIF's role in the regulation of tumor-suppressor genes such as p53 and in angiogenesis points to a potential link between chronic inflammation and the development of cancer.
Prolactin has been identified as a candidate early detection marker for ovarian cancer with reports of impressively high sensitivity (>90%) and specificity (>98%). Elevated levels of circulating prolactin (hyperprolactinemia) have long been associated with pituitary tumors, but more recently prolactin has been reported in association with a variety of additional cancers, including breast, prostate, and colon carcinoma.
Study population and serum specimen collection
Serum samples were collected by the Pacific Ovarian Cancer Research Consortium for use in biomarker validation experiments. The samples used in this study were collected at Swedish Medical Center or Virginia Mason Hospital (Seattle, WA, USA) between July 1, 2004 and June 30, 2006. Participants were recruited from the following populations: apparently healthy women attending regular breast cancer screening exams (healthy controls), women undergoing gynecologic surgery for a variety of conditions but with normal ovarian pathology (surgical controls), women without malignancy but with benign ovarian disease (benign controls), and women diagnosed with ovarian cancer, fallopian tube cancer, or primary peritoneal invasive cancer. Identical specimen processing protocols were used for all groups.
A sample of subjects from each of these conditions was selected for biomarker validation studies. Patients with prior oophorectomy or diagnosis of ovarian cancer were excluded from the study population. Cases included 50 consecutively recruited patients with ovarian (n = 45), fallopian tube (n = 1), and peritoneal cancer (n = 4). Control groups included healthy controls (n = 36), surgical controls (n = 14), and benign controls (n = 30). The validation study was powered to detect a marker with 30% sensitivity at 95% specificity, or better. Demographics of the patients included in this study are described in Supplementary Tables S1 and S2.
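The power target above is expressed as sensitivity at a fixed specificity. As a minimal sketch of how such a quantity might be computed from marker values, the following picks the lowest cutoff among the observed control values that keeps at least the requested specificity, and reports the fraction of cases above it. The function name, the cutoff rule, and the example values are assumptions made for illustration; they are not taken from the study.

```python
def sensitivity_at_specificity(case_values, control_values, specificity=0.95):
    """Return (sensitivity, cutoff) at the lowest cutoff meeting the specificity.

    A value is called positive when it is strictly greater than the cutoff.
    Candidate cutoffs are the observed control values, tried in ascending order.
    """
    n_controls = len(control_values)
    for cutoff in sorted(control_values):
        # Controls at or below the cutoff are correctly called negative.
        true_negatives = sum(1 for v in control_values if v <= cutoff)
        if true_negatives / n_controls >= specificity:
            positives = sum(1 for v in case_values if v > cutoff)
            return positives / len(case_values), cutoff
    raise ValueError("no cutoff achieves the requested specificity")


# Hypothetical marker values, for illustration only.
controls = list(range(1, 21))
cases = [5, 18, 19.5, 25, 30, 2, 21, 3, 4, 6]
sens, cutoff = sensitivity_at_specificity(cases, controls, specificity=0.95)
print(f"sensitivity = {sens:.2f} at cutoff {cutoff}")
```

Other conventions for choosing the cutoff (for example an interpolated percentile of the control distribution) would give slightly different values in small samples.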
The healthy, surgical and benign controls used in this study were selected from larger control populations (n = 346, 63, and 38 respectively) to match the cases on age, race, family history of ovarian and breast cancer, and blood collection date. We used propensity score matching to balance the overall distribution of the groups. Briefly, a propensity score was estimated by predicting case status using logistic regression on each of the variables of interest. After first selecting the case group, individual controls were selected that most closely matched a randomly identified member of the case group on the assigned propensity score until pre-specified numbers for each control group had been selected.
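The selection step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: propensity scores are taken as already estimated, ties are broken by index, and the function name `select_matched_controls`, the scores, and the seed are hypothetical.

```python
import random


def select_matched_controls(case_scores, control_scores, n_select, seed=0):
    """Greedy nearest-neighbour selection on a precomputed propensity score.

    Repeatedly draws a case at random, then takes the not-yet-selected control
    whose score is closest to that case's score. Returns the indices of the
    selected controls in the order they were chosen.
    """
    if n_select > len(control_scores):
        raise ValueError("cannot select more controls than are available")
    rng = random.Random(seed)
    available = set(range(len(control_scores)))
    selected = []
    while len(selected) < n_select:
        target = rng.choice(case_scores)  # randomly identified case
        best = min(available, key=lambda j: (abs(control_scores[j] - target), j))
        selected.append(best)
        available.remove(best)  # each control can be selected only once
    return selected


# Hypothetical propensity scores, for illustration only.
case_scores = [0.62, 0.55, 0.71, 0.48]
control_scores = [0.10, 0.50, 0.60, 0.70, 0.95, 0.45, 0.20]
chosen = select_matched_controls(case_scores, control_scores, n_select=3)
print(sorted(chosen))
```

Drawing cases at random, rather than cycling through them in order, means a case may be drawn more than once or not at all; the aim is balance in the overall score distribution rather than strict one-to-one pairing.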
Participants in the surgical control, benign control and case populations donated serum specimens either at a pre-surgical appointment 1 to 39 days prior to surgery or on the day of surgery after administration of anesthesia but before surgical treatment or chemotherapy. To maximize the power to detect differences in marker levels due to conditions of collection, we included specimens collected both on the day of surgery and at the pre-surgical appointment from the same patient (n = 30) whenever possible. Participants in the healthy control population donated blood at a regular mammography screening appointment.
Prolactin and MIF Assays.
Serum levels of prolactin and MIF were measured by ELISA using kits acquired from Diagnostic Systems Laboratories (Webster, TX) and Onco Detectors International LLC (Bethesda, MD) respectively. Assays were performed according to manufacturer's instructions. The concentrations of human prolactin and MIF were determined using a linear standard curve that was constructed by plotting the mean absorbance against the known concentration for each reference standard. See Text S1 for details.
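As a rough sketch of the standard-curve step, the code below fits a least-squares line to mean absorbance versus known concentration and inverts it to read off a sample concentration. The reference-standard values, units, and function names are hypothetical; the kits' actual standards and any manufacturer-specified fitting details are not given in the text.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def concentration_from_absorbance(absorbance, intercept, slope):
    """Invert the standard curve to convert an absorbance to a concentration."""
    return (absorbance - intercept) / slope


# Hypothetical standards: known concentration -> replicate absorbance readings.
standards = {
    0.0: [0.05, 0.07],
    10.0: [0.26, 0.24],
    20.0: [0.44, 0.46],
    40.0: [0.86, 0.84],
}
conc = sorted(standards)
mean_abs = [sum(standards[c]) / len(standards[c]) for c in conc]
intercept, slope = fit_line(conc, mean_abs)
sample_conc = concentration_from_absorbance(0.55, intercept, slope)
print(f"estimated concentration: {sample_conc:.1f}")
```

Readings that fall outside the range of the standards would be extrapolations and are usually handled separately, for example by diluting and re-assaying.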
CA 125 Assay.
Serum levels of CA 125 were measured by bead-based immunoassays as previously described, using anti-CA 125 mouse monoclonal antibodies X306 (capture) and X52 (detection) acquired from Research Diagnostics, Inc (RDI, Flanders, NJ). Readings from the immunoassay were normalized and then z-scores were calculated by centering and scaling observations so that healthy controls have mean 0 and variance 1. See Text S1 for details.
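The centering and scaling step can be illustrated with a short sketch. It assumes the sample standard deviation (n - 1 denominator) of the healthy controls is used for scaling, which the text does not specify; the readings and function name are hypothetical.

```python
from statistics import mean, stdev


def zscores_relative_to_controls(values, control_values):
    """Centre and scale values so the control group has mean 0 and variance 1."""
    mu = mean(control_values)
    sd = stdev(control_values)  # sample standard deviation (assumption)
    return [(v - mu) / sd for v in values]


# Hypothetical normalized readings, for illustration only.
healthy = [1.8, 2.1, 2.4, 1.9, 2.3]
cases = [2.2, 3.9, 5.1]
healthy_z = zscores_relative_to_controls(healthy, healthy)
case_z = zscores_relative_to_controls(cases, healthy)
print([round(z, 2) for z in case_z])
```

On this scale a case value is read directly as the number of control standard deviations above or below the healthy-control mean.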
Specimens were randomized onto two plates with 80 specimens each, and laboratory personnel were blinded to case status at all times.
Receiver operating characteristic (ROC) curves were used to determine whether serum marker concentrations discriminated between cases and healthy controls. The area under each ROC curve (AUC) was calculated, and significance for marker discrimination (AUC different from 0.5) was determined using the Mann-Whitney U statistic. ROC curves for healthy control samples and case samples collected either prior to surgery or on the day of surgery were compared for each marker using the method described by Metz et al.
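The link between the AUC and the Mann-Whitney U statistic can be made concrete with a small sketch: the empirical AUC equals U divided by the number of case-control pairs, that is, the proportion of pairs in which the case has the higher marker value, with ties counted as one half. The function name and data are hypothetical, and the sketch covers only the AUC estimate, not the significance test or the comparison of partially paired ROC curves.

```python
def auc_mann_whitney(case_values, control_values):
    """Empirical AUC as the Mann-Whitney U statistic over n_cases * n_controls."""
    u = 0.0
    for c in case_values:
        for h in control_values:
            if c > h:
                u += 1.0  # case ranks above control
            elif c == h:
                u += 0.5  # ties count as one half
    return u / (len(case_values) * len(control_values))


# Hypothetical marker values, for illustration only.
example_auc = auc_mann_whitney([2.2, 3.9, 5.1, 1.7], [1.8, 2.1, 2.4, 1.9, 2.3])
print(f"AUC = {example_auc:.2f}")
```

An AUC of 0.5 corresponds to no discrimination, which is why the test for marker discrimination is framed as a test of AUC different from 0.5.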
To evaluate whether marker levels differed between case and control groups after adjusting for conditions of blood collection, we fitted multiple linear regression models to each marker as the dependent variable with indicator variables for each case/control population and an indicator variable for conditions of blood sample collection (clinic visit or in surgery) as independent variables. The regression model for the ith woman at time t was:
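A form consistent with this description, written with symbol names that are assumptions introduced here rather than the authors' notation, is:

```latex
Y_{it} = \beta_0
       + \beta_1\,\mathrm{SurgicalControl}_i
       + \beta_2\,\mathrm{BenignControl}_i
       + \beta_3\,\mathrm{Case}_i
       + \beta_4\,\mathrm{Surgery}_{it}
       + \varepsilon_{it}
```

Here $Y_{it}$ is the marker level for woman $i$ at collection $t$, the three group terms are 0/1 indicators of her case/control population, $\mathrm{Surgery}_{it}$ is 1 for a specimen collected in surgery and 0 for one collected at a clinic visit, and $\varepsilon_{it}$ is the error term. Under this form, $\beta_3$ would capture the difference between cases and healthy controls after adjustment for collection conditions, and $\beta_4$ the effect of collection in surgery.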
The reference group in each model is the healthy control group. This model can potentially separate the components of variance due to conditions of sample collection and presence of malignancy. In particular, for markers that elevate due to the presence of ovarian cancer and are also affected by the conditions of blood collection, each effect can be estimated from the model parameters. Regressions were performed using Generalized Estimating Equations (GEE) methods to avoid bias in estimates of standard errors because marker levels were measured twice for 30 women in the study.
P-values for differences between partially correlated ROC curves were calculated with the ROCKIT software package using the bivariate test. All other calculations were performed using the R statistical programming language.
Marker levels from each case/control group collected in surgery and at the pre-surgical clinic visit are shown in Figure 1 and summarized in Table 1. ROC analysis showed that CA 125 and MIF concentrations discriminate between healthy controls and cases collected either at surgery or 1 to 39 days prior to surgery (figure 2a,b; p<0.05 for each marker and condition). Moreover, the AUCs were not significantly different between the two collection conditions (figure 2a,b; p = 0.297 and 0.416 respectively).
Figure 1. Dotted lines connect surgical and pre-surgical marker levels measured within the same women under both surgical and non-surgical conditions.
Figure 2. Case specimens were obtained either at surgery (surgical comparison; dashed line) or 1 to 39 days prior to surgery (pre-surgical comparison; solid line). The pre-surgical comparison suggests that prolactin levels do not discriminate between women with and without cancer in the clinic setting. * indicates AUC different from 0.5 at alpha = 0.05 significance level (Mann-Whitney U test).
Prolactin levels were highly elevated in the case specimens collected at surgery (figure 1c) and prolactin levels discriminated between case specimens collected at surgery and healthy controls with high sensitivity and specificity (figure 2c, dotted line). However, this difference disappeared when we compared case specimens collected 1 to 39 days prior to surgery to the healthy controls (figure 2c, solid line). The AUC for discriminating between cases and controls was significantly lower in specimens collected in the short interval prior to surgery than for the specimens obtained at surgery (figure 2c, p for difference in AUC < 0.0005). Moreover, serum prolactin levels did not discriminate between healthy controls and case specimens collected 1 to 39 days prior to surgery (figure 2c, solid line, AUC = 0.497).
We used multiple linear regression models to examine whether differences in marker levels were associated with case status and/or with conditions of blood sample collection. In the regression models, CA 125 and MIF concentrations were not significantly affected by the conditions of blood collection (table 2, p = 0.60 and 0.71 respectively) and were elevated in the cases relative to the healthy controls (table 2, p<0.005 for each marker). Prolactin levels, however, were significantly increased in serum samples collected at surgery (table 2, p<0.005) and after adjusting for conditions of blood collection, prolactin was not elevated in cases relative to healthy controls (table 2, p = 0.69). These data suggest that the differences observed with prolactin can be attributed entirely to blood collection conditions, with no residual signal associated with malignancy.
The approach of using commercially available assays to validate candidate biomarkers is very promising. However, results can be misleading if conditions of the blood sample collection for cases and controls are not standardized or otherwise accounted for. We show here that serum prolactin levels are strongly influenced by the conditions of blood collection and that prolactin does not discriminate between cancer and non-cancer patients in serum specimens collected similarly in a clinic setting. In contrast, CA 125 and MIF were not affected by the conditions of blood collection; both markers discriminated between cases and controls irrespective of whether serum specimens were collected at surgery or in a short interval prior to surgery.
This finding is consistent with previous reports that prolactin levels elevate during surgery and post-operatively in female patients undergoing surgery with halothane (general) anesthesia. Prolactin levels are also elevated in rats undergoing general anesthesia with pentobarbital, regardless of surgery. In our study, specimens collected on the day of surgery were obtained after general anesthesia was administered but before any incisions were made. Serum prolactin levels at surgery may have been affected by anesthesia or by other conditions of surgery such as stress.
In multiple linear regression models, differences in CA 125 and MIF levels were associated with case status but not with the conditions of sample ascertainment. For prolactin, the reverse was true, suggesting that prolactin levels are affected by the conditions of surgery and that prolactin may not be a marker of ovarian cancer. These multivariate analyses complemented the ROC analyses by adjusting for the conditions of blood collection, thus allowing for the possibility that a marker signals malignancy despite being affected by the conditions of blood collection. Adjustment for collection conditions in the analysis is useful more generally when blood samples collected under identical conditions are not available from every participant in a study.
The use of multiple sources of control specimens collected under various conditions may alert researchers to potential biases. We have demonstrated that permitting collection conditions to vary in cases and controls, while annotating them correctly, may alert researchers to potential problems. Whenever it is not feasible to obtain multiple collections from cases (both within and outside of surgery), surgical controls can be used as a screen for the possible effects of collection condition. For example, it can be seen in figure 1c that prolactin levels are higher in the control groups where samples were collected at surgery than in healthy controls, again suggesting that elevated prolactin levels may not be specific to malignancy.
The limited availability of pre-clinical specimens from ovarian cancer patients presents a significant challenge to researchers trying to discover or validate novel biomarkers for early detection. The majority of specimens from cancer patients that are available for research are not collected from women, or by clinicians, who are blind to the impending diagnosis. Our results illustrate that biases between case and control populations can lead to false positive experimental results and that controlling for conditions of blood collection can reduce false discovery and false validation in biomarker experiments. It is important to detect, and whenever possible to correct for, biases in conditions of blood collection when attempting to discover and validate novel biomarkers.
Table S1. Summary of patient demographics by case status (0.03 MB DOC)
Table S2. Summary of ovarian cancers by stage and histology (0.03 MB DOC)
Conceived and designed the experiments: NU MM GA JT. Performed the experiments: BN RF XD. Analyzed the data: NU MM GA JT KL. Contributed reagents/materials/analysis tools: ES. Wrote the paper: JT LB. Other: Made considerable contributions in editing of the manuscript: NU. Made considerable contributions in editing of the manuscript, especially in the laboratory methods section: BN RF. Made considerable contributions in editing of the manuscript, especially in the statistical methods section: KL. Wrote portions of the introduction and helped edit the remainder of the manuscript: LB. Wrote portions of the laboratory methods section: XD.
- 1. Pepe MS, Etzioni R, Feng Z, Potter JD, Thompson ML, et al. (2001) Phases of biomarker development for early detection of cancer. J Natl Cancer Inst 93: 1054–1061.
- 2. Pepe M (2005) Evaluating technologies for classification and prediction in medicine. Stat Med 24: 3687–3696.
- 3. Rosenthal AN, Menon U, Jacobs IJ (2006) Screening for ovarian cancer. Clin Obstet Gynecol 49: 433–447.
- 4. Pauler DK, Menon U, McIntosh M, Symecko HL, Skates SJ, et al. (2001) Factors Influencing Serum CA125II Levels in Healthy Postmenopausal Women. Cancer Epidemiol Biomarkers Prev 10: 489–493.
- 5. Skates SJ, Menon U, MacDonald N, Rosenthal AN, Oram DH, et al. (2003) Calculation of the Risk of Ovarian Cancer From Serial CA-125 Values for Preclinical Detection in Postmenopausal Women. J Clin Oncol 21: 206s–210.
- 6. Bjorge T, Lie A, Hovig E, Gislefoss R, Hansen S, et al. (2004) BRCA1 mutations in ovarian cancer and borderline tumours in Norway: a nested case-control study. Br J Cancer 91: 1829–1834.
- 7. Daoud E, Bodor G (1991) CA-125 concentrations in malignant and nonmalignant disease. Clin Chem 37: 1968–1974.
- 8. Rustin GJ, Nelstrop AE, McClean P, Brady MF, McGuire WP, et al. (1996) Defining response of ovarian carcinoma to initial chemotherapy according to serum CA 125. J Clin Oncol 14: 1545–1551.
- 9. Niloff JM, Knapp RC, Lavin PT, Malkasian GD, Berek JS, et al. (1986) The CA 125 assay as a predictor of clinical recurrence in epithelial ovarian cancer. Am J Obstet Gynecol 155: 56–60.
- 10. Gubbels JA, Belisle J, Onda M, Rancourt C, Migneault M, et al. (2006) Mesothelin-MUC16 binding is a high affinity, N-glycan dependent interaction that facilitates peritoneal metastasis of ovarian tumors. Mol Cancer 5: 50.
- 11. Rump A, Morikawa Y, Tanaka M, Minami S, Umesaki N, et al. (2004) Binding of ovarian cancer antigen CA125/MUC16 to mesothelin mediates cell adhesion. J Biol Chem 279: 9190–9198.
- 12. Mor G, Visintin I, Lai Y, Zhao H, Schwartz P, et al. (2005) Serum protein markers for early detection of ovarian cancer. Proc Natl Acad Sci U S A 102: 7677–7682.
- 13. Agarwal R, Whang DH, Alvero AB, Visintin I, Lai Y, et al. (2007) Macrophage migration inhibitory factor expression in ovarian cancer. Am J Obstet Gynecol 196: 348 e341–345.
- 14. Bucala R, Donnelly SC (2007) Macrophage migration inhibitory factor: a probable link between inflammation and cancer. Immunity 26: 281–285.
- 15. Lolis E (2001) Glucocorticoid counter regulation: macrophage migration inhibitory factor as a target for drug discovery. Curr Opin Pharmacol 1: 662–668.
- 16. Mitchell RA, Liao H, Chesney J, Fingerle-Rowson G, Baugh J, et al. (2002) Macrophage migration inhibitory factor (MIF) sustains macrophage proinflammatory function by inhibiting p53: regulatory role in the innate immune response. Proc Natl Acad Sci U S A 99: 345–350.
- 17. Hudson JD, Shoaibi MA, Maestro R, Carnero A, Hannon GJ, et al. (1999) A proinflammatory cytokine inhibits p53 tumor suppressor activity. J Exp Med 190: 1375–1382.
- 18. Chesney J, Metz C, Bacher M, Peng T, Meinhardt A, et al. (1999) An essential role for macrophage migration inhibitory factor (MIF) in angiogenesis and the growth of a murine lymphoma. Mol Med 5: 181–191.
- 19. Hira E, Ono T, Dhar DK, El-Assal ON, Hishikawa Y, et al. (2005) Overexpression of macrophage migration inhibitory factor induces angiogenesis and deteriorates prognosis after radical resection for hepatocellular carcinoma. Cancer 103: 588–598.
- 20. Freeman ME, Kanyicska B, Lerant A, Nagy G (2000) Prolactin: structure, function, and regulation of secretion. Physiol Rev 80: 1523–1631.
- 21. Mujagic Z, Mujagic H (2004) Importance of serum prolactin determination in metastatic breast cancer patients. Croat Med J 45: 176–180.
- 22. Mujagic Z, Mujagic H, Prnjavorac B (2005) Circulating levels of prolactin in breast cancer patients. Med Arh 59: 33–35.
- 23. Tworoger SS, Eliassen AH, Rosner B, Sluss P, Hankinson SE (2004) Plasma Prolactin Concentrations and Risk of Postmenopausal Breast Cancer. Cancer Res 64: 6814–6819.
- 24. Nevalainen MT, Valve EM, Ingleton PM, Nurmi M, Martikainen PM, et al. (1997) Prolactin and Prolactin Receptors Are Expressed and Functioning in Human Prostate. J Clin Invest 99: 618–627.
- 25. Indinnimeo M, Cicchini C, Memeo L, Stazi A, Ghini C, et al. (2001) Plasma and tissue prolactin detection in colon carcinoma. Oncol Rep 8: 1351–1353.
- 26. Andersen MR, Smith R, Meischke H, Bowen D, Urban N (2003) Breast cancer worry and mammography use by women with and without a family history in a population-based sample. Cancer Epidemiol Biomarkers Prev 12: 314–320.
- 27. Ho D, Imai K, King G, Stuart E (2006) MatchIt: Nonparametric Preprocessing for Parametric Causal Inference. R package version 2.2-7 ed.
- 28. Scholler N, Crawford M, Sato A, Drescher CW, O'Briant KC, et al. (2006) Bead-Based ELISA for Validation of Ovarian Cancer Early Detection Markers. Clin Cancer Res 12: 2117–2124.
- 29. Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: Visualizing the performance of scoring classifiers. R package version 1.0-1, ed.
- 30. Metz CE, Herman BA, Roe CA (1998) Statistical comparison of two ROC-curve estimates obtained from partially-paired datasets. Med Decis Making 18: 110–121.
- 31. Metz C (1998) ROCKIT 1.1B2. Beta Version ed. Chicago: Department of Radiology, University of Chicago.
- 32. R Development Core Team (2006) R: A language and environment for statistical computing.
- 33. Hagen C, Brandt MR, Kehlet H (1980) Prolactin, LH, FSH, GH and cortisol response to surgery and the effect of epidural analgesia. Acta Endocrinol (Copenh) 94: 151–154.
- 34. Donnerer J, Lembeck F (1990) Different control of the adrenocorticotropin-corticosterone response and of prolactin secretion during cold stress, anesthesia, surgery, and nicotine injection in the rat: involvement of capsaicin-sensitive sensory neurons. Endocrinology 126: 921–926.