Conceived and designed the experiments: BH TY RB ZF KP KM. Performed the experiments: TY KP KM. Analyzed the data: LL ZF TY BH KM BF. Contributed reagents/materials/analysis tools: RB MA DB DS ZF KK HZ AM. Wrote the paper: TY BH.
The authors have declared that no competing interests exist.
The CA 19-9 assay detects a carbohydrate antigen on multiple protein carriers, some of which may be preferential carriers of the antigen in cancer. We tested the hypothesis that the measurement of the CA 19-9 antigen on individual proteins could improve performance over the standard CA 19-9 assay. We used antibody arrays to measure the levels of the CA 19-9 antigen on multiple proteins in serum or plasma samples from patients with pancreatic adenocarcinoma or pancreatitis. Sample sets from three different institutions were examined, comprising 531 individual samples. The measurement of the CA 19-9 antigen on any individual protein did not improve upon the performance of the standard CA 19-9 assay (82% sensitivity at 75% specificity for early-stage cancer), owing to diversity among patients in their CA 19-9 protein carriers. However, a subset of cancer patients with no elevation in the standard CA 19-9 assay showed elevations of the CA 19-9 antigen specifically on the proteins MUC5AC or MUC16 in all sample sets. By combining measurements of the standard CA 19-9 assay with detection of CA 19-9 on MUC5AC and MUC16, the sensitivity of cancer detection was improved relative to CA 19-9 alone in each sample set, achieving 67–80% sensitivity at 98% specificity. This finding demonstrates the value of measuring glycans on specific proteins for improving biomarker performance. Diagnostic tests with improved sensitivity for detecting pancreatic cancer could have important applications for improving the treatment and management of patients suffering from this disease.
Several factors contribute to the extremely poor prognosis associated with pancreatic cancer, including the resistance of the disease to available therapeutic options, its tendency to metastasize at small primary tumor sizes, and its induction of cachexia
The CA 19-9 serum marker is elevated in the majority of pancreatic cancer patients but does not achieve the performance required for either early detection or diagnosis, due to both false positive and false negative readings
The nature of the CA 19-9 antigen suggests a strategy for potentially improving biomarker performance. The CA 19-9 antigen is a carbohydrate structure called sialyl LewisA (part of the Lewis family of blood group antigens) with the sequence Neu5Acα2,3Galβ1,3(Fucα1,4)GlcNAc. Sialyl LewisA is synthesized by glycosyltransferases that sequentially link the monosaccharide precursors onto both N-linked and O-linked glycans. Sialyl LewisA is not found at a high level in normal tissues, but it is found in embryonic tissue
It is possible that the carrier proteins of the CA 19-9 antigen are different between disease states, as suggested earlier
We used antibody arrays to measure the level of the CA 19-9 antigen on specific proteins in multiple samples. Serum and plasma samples were incubated on antibody arrays, and the arrays were probed with the CA 19-9 antibody (
a) High-throughput sample processing and array-based sandwich assays for CA 19-9 detection. Forty-eight identical arrays are printed on one microscope slide, segregated by hydrophobic wax boundaries (left). A set of serum or plasma samples is incubated on the arrays in random order, and the arrays for the entire sample set are probed with the CA 19-9 detection antibody (right). b) Molecular detail. Total CA 19-9 is measured at the CA 19-9 capture antibody (left), and CA 19-9 on specific proteins is measured at the individual antibodies against those proteins (right). c) Representative raw image data from each of the sample groups. Triplicates of each antibody were randomly positioned on the array, as indicated for selected antibodies.
In order to determine which antibodies should be used to profile CA 19-9 levels over many patients, we profiled a pilot set of 12 serum samples (6 from pancreatic cancer patients and 6 from pancreatitis patients) using arrays containing 58 different antibodies (
Based on the above result, subsequent experiments were performed using arrays targeting CA 19-9 and the mucin proteins MUC1, MUC5AC, and MUC16 (see
Set # | Set provider | Early-stage cancer (Stage I, II) | Undetermined stage cancer | Late-stage cancer (Stage III, IV) | Pancreatitis | Healthy | Total |
1 | University of Pittsburgh (UP) | 54 | 13 | 58 | 51 | 54 | |
2 | Evanston Northwestern Healthcare (ENH) | 60 | 9 | 63 | 36 | 52 | 531 |
3 | University of Michigan (UM) | 28 | 15 | 38 |
The first goal of the analysis was to determine whether the detection of the CA 19-9 antigen on any individual protein performed as well as or better than the standard CA 19-9 assay (referred to as total CA 19-9). Each of the proteins MUC1, MUC5AC, and MUC16 showed significantly higher levels in the cancer patients than in the pancreatitis patients, both for early and late stage cancers (
The fluorescence values for the total CA 19-9 (top), CA 19-9 on MUC1 (second row), CA 19-9 on MUC16 (third row), and CA 19-9 on MUC5AC (fourth row) are shown for each sample group. The left column compares samples from pancreatitis patients to samples from early-stage pancreatic cancer patients, and the right column compares pancreatitis to late-stage cancer. The sensitivity and specificity at the threshold indicated by the dashed line are given.
We next investigated the relationships between total CA 19-9 and CA 19-9 on individual proteins to determine whether elevations occur independently from one another. If non-overlapping patients are elevated in separate markers, the markers could be used together to yield improved performance. This potential was supported by the lack of significant correlation between total CA 19-9 and CA 19-9 on individual proteins or between the individual proteins (not shown).
The primary images from selected samples provided insights into the diversity between samples in the carrier proteins that display the CA 19-9 antigen (
Raw antibody images are shown for patient samples representing diverse marker patterns. Data from sample set 3 (replicate 1) are presented. A cancer sample (labeled ‘True positive’) and pancreatitis sample (labeled ‘False positive’) that were high in total CA 19-9 (above a 75% specificity threshold) are in the top left, and pancreatitis samples that were low in total CA 19-9 (‘True negatives’) are in the bottom left. Cancer samples that were low in total CA 19-9 are grouped by relatively high or low signal at one of the mucins in the top right and bottom right, respectively. The sample identifier is given within each array. In the subgroup picked up by the panel (top-right), the antibody showing elevation in a given sample is listed adjacent to each array. The corresponding antibody spots are underlined in white. Two arrays for sample LC3607 are shown, one detected with BPL (rightmost column, row 2), and the other detected with CA19-9 (rightmost column, row 3). All other arrays were detected with CA19-9. The bottom panels show maps of antibodies targeting MUC16 (left), MUC5AC (middle), and MUC1 (right).
The possibility of detecting other glycans to complement the CA 19-9 antigen was suggested by the primary images (
Because no single protein is the dominant cancer-specific carrier of the CA 19-9 antigen, the detection of CA 19-9 on any of these individual proteins does not out-perform total CA 19-9. However, for individual patients, the detection of the CA 19-9 antigen on the predominant cancer-associated carrier for that patient may give improved discrimination of benign from malignant disease, relative to the total CA 19-9 assay. A panel of such markers, in which each member of the panel detects a subgroup of patients elevated in a certain carrier protein, could thus yield improved performance.
The above observations led to the investigation of whether CA 19-9 on individual proteins could complement total CA 19-9 measurements for improved biomarker performance. The relationship between the measurements of total CA 19-9 and CA 19-9 on certain individual proteins showed this possibility (
Each scatter plot compares the values for total CA 19-9 (x axis) to the values for CA 19-9 on MUC16 or MUC5AC. Each point is an individual sample. Samples from Set 1 are presented at top, and samples from Set 2 are presented in the bottom panels. The dashed lines indicate representative thresholds for each marker. The sensitivity and specificity given in each graph represent the performance at those thresholds if a sample exceeding either threshold is called a “case.” The red arrows indicate the samples that are not elevated in total CA 19-9 but are elevated in CA 19-9 on an individual protein. Each ROC curve shows the performance of CA 19-9 alone and the combination of CA 19-9 with the indicated marker. If a sample was elevated in either marker, it was called a “case.” The asterisk indicates the performance at the thresholds in the scatter plots.
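The "either marker" calling rule described in this caption can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the tuple layout, and the sample values are all hypothetical, and a sample is assumed to be called a "case" when it strictly exceeds either threshold.

```python
def or_rule_performance(cases, controls, t_total, t_carrier):
    """Sensitivity and specificity when a sample exceeding EITHER threshold
    is called a case. Each sample is a (total_ca199, carrier_ca199) tuple.
    All names here are illustrative assumptions, not from the paper."""
    def called(sample):
        total, carrier = sample
        return total > t_total or carrier > t_carrier

    true_pos = sum(called(s) for s in cases)
    true_neg = sum(not called(s) for s in controls)
    return true_pos / len(cases), true_neg / len(controls)


# Hypothetical fluorescence values (arbitrary units), for illustration only.
cancer = [(10.0, 1.0), (2.0, 8.0), (1.0, 1.0)]
pancreatitis = [(1.0, 1.0), (3.0, 2.0), (6.0, 1.0)]

# Total CA 19-9 alone: the carrier threshold is set so high it never fires.
alone = or_rule_performance(cancer, pancreatitis, 5.0, float("inf"))
# Total CA 19-9 combined with CA 19-9 on one carrier protein.
combined = or_rule_performance(cancer, pancreatitis, 5.0, 5.0)
print(alone, combined)
```

In this toy data the second cancer sample is low in total CA 19-9 but high on the carrier, which is the situation the red arrows mark in the scatter plots.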
We next asked whether MUC5AC and MUC16 could be used together with total CA 19-9 to give additional improvement in discriminating cases from controls. The three markers were combined by defining a “case” as having an elevation in at least one of the three markers and a “control” as being low in all three markers. For such a combination rule, the thresholds for each marker need to be individually set to give the best combined performance. We scanned through the possible combinations of thresholds for the three markers that would give a minimum specificity of 98% (two false positives), which was chosen to reveal cancer-specific patterns. A set of thresholds was achieved in which most patients were elevated in total CA 19-9 and another, smaller group was elevated in either CA 19-9-MUC5AC or CA 19-9-MUC16 (
Each column represents data from a patient sample and each row represents a marker, with the bottom row indicating the patient classification. A threshold was set for each marker, and a yellow square indicates the sample was above the threshold for that marker, and black indicates below the threshold. In the final row, a yellow square indicates the sample was elevated in any of the three markers and classified as a “case.” The true positive (TP) cancer cases that were elevated in CA 19-9 are indicated by ‘TP, CA 19-9’, and the true positive cases elevated only in the other markers are indicated by ‘TP, Panel.’ The false negative (FN) cancer cases are indicated by ‘FN,’ the false positive (FP) control cases that were elevated in a marker are indicated by ‘FP,’ and the true negative (TN) control cases that were low in all markers are indicated by ‘TN.’ Data from Sample Set 1 is presented at top, and data from Sample Set 2 is below.
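The threshold search for the three-marker panel can be sketched as a brute-force scan. This is an assumption-laden illustration: the paper does not specify the search grid or the tie-breaking rule, so here the candidate thresholds are simply the observed values of each marker, the first best combination found is kept, and all function names and data are hypothetical.

```python
from itertools import product


def is_case(sample, thresholds):
    """A sample is called a case if any marker exceeds its threshold."""
    return any(value > t for value, t in zip(sample, thresholds))


def scan_thresholds(cases, controls, min_specificity=0.98):
    """Return (sensitivity, specificity, thresholds) with the highest
    sensitivity among threshold combinations meeting min_specificity.
    Candidate thresholds are the observed values (an assumption)."""
    n_markers = len(cases[0])
    candidates = [
        sorted({s[m] for s in cases + controls}) for m in range(n_markers)
    ]
    best = None
    for thresholds in product(*candidates):
        spec = sum(not is_case(s, thresholds) for s in controls) / len(controls)
        if spec < min_specificity:
            continue
        sens = sum(is_case(s, thresholds) for s in cases) / len(cases)
        if best is None or sens > best[0]:
            best = (sens, spec, thresholds)
    return best


# Hypothetical values: (total CA 19-9, CA 19-9 on MUC5AC, CA 19-9 on MUC16).
cases = [(9, 1, 1), (1, 7, 1), (1, 1, 6), (1, 1, 1)]
controls = [(2, 2, 2), (3, 1, 1), (1, 3, 2)]
best = scan_thresholds(cases, controls, min_specificity=1.0)
print(best)
```

The exhaustive scan grows with the cube of the number of candidate values, so for sample sets of a few hundred a coarser grid would likely be needed in practice.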
Sample set | Group | Marker | Specificity | Sensitivity | Accuracy |
Set 1 | All samples | CA 19-9 alone | 98% (103/105) | 63% (79/125) | 79% |
Set 1 | All samples | Panel | 98% (103/105) | 74% (93/125) | 85% |
Set 1 | Early stage | CA 19-9 alone | 98% (103/105) | 54% (29/54) | 83% |
Set 1 | Early stage | Panel | 98% (103/105) | 59% (32/54) | 85% |
Set 1 | Late stage | CA 19-9 alone | 98% (103/105) | 70% (50/71) | 86% |
Set 1 | Late stage | Panel | 98% (103/105) | 86% (61/71) | 93% |
Set 2 | All samples | CA 19-9 alone | 98% (86/88) | 56% (74/132) | 73% |
Set 2 | All samples | Panel | 98% (86/88) | 67% (88/132) | 79% |
Set 2 | Early stage | CA 19-9 alone | 98% (86/88) | 42% (25/60) | 75% |
Set 2 | Early stage | Panel | 98% (86/88) | 52% (31/60) | 79% |
Set 2 | Late stage | CA 19-9 alone | 98% (86/88) | 68% (49/72) | 84% |
Set 2 | Late stage | Panel | 98% (86/88) | 79% (57/72) | 89% |
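The percentages in the table above follow from the counts given in parentheses. The sketch below reproduces the Set 1 "All samples" rows, assuming that accuracy is the fraction of all samples classified correctly, a definition the reported values appear consistent with but which is not stated explicitly. The function name is illustrative.

```python
def performance(true_neg, n_controls, true_pos, n_cases):
    """Specificity, sensitivity, and accuracy from raw counts.
    Accuracy is assumed to be (TP + TN) / (cases + controls)."""
    specificity = true_neg / n_controls
    sensitivity = true_pos / n_cases
    accuracy = (true_neg + true_pos) / (n_controls + n_cases)
    return specificity, sensitivity, accuracy


# Counts taken from the table: Set 1, all samples.
ca199_alone = performance(103, 105, 79, 125)
panel = performance(103, 105, 93, 125)

for label, values in (("CA 19-9 alone", ca199_alone), ("Panel", panel)):
    print(label, [f"{100 * v:.0f}%" for v in values])
```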
Standard CA 19-9 had a lower sensitivity for early-stage cancer than late-stage cancer (
While only a relatively small subset of CA 19-9-low patients was picked up by the panel, the marker patterns were consistent between the independent sample sets. In both sets 1 and 2, about a third of the patients detected by the panel were elevated in only CA 19-9-MUC5AC, another third were elevated in only CA 19-9-MUC16, and another third were elevated in both. The consistency between the sample sets in the overall results supports the view that the various patterns of marker expression, high in all or high in individual members of the panel, represent biological subgroups that may be observed in the larger population.
The use of three independent sets also gave information about the relative merits of serum and plasma, since sample set 1 was plasma and sets 2 and 3 were serum. The same markers were found to be effective between sets 1, 2, and 3, with similar relationships to total CA 19-9. That result suggests that the relative levels between cases and controls are not greatly affected by the mode of preparation of the sample. In addition, the reproducibility of the measurements was similar between serum and plasma. Each data set included repeated series of dilutions of pooled samples. At a 10-fold dilution of the pool, representing concentrations similar to those of the individual samples, the coefficient of variation between the replicate measurements was 23% for the serum samples (from Set 2) and also 23% for the plasma samples (from Set 1) (data not shown). That finding suggests that the stability of the markers is not greatly affected by whether the samples are prepared as serum or plasma.
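For reference, the coefficient of variation quoted above is the standard deviation of the replicate measurements divided by their mean. The sketch below assumes the sample (n - 1) standard deviation, which the text does not specify, and uses made-up replicate values.

```python
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """CV = sample standard deviation / mean (sample SD is an assumption)."""
    return stdev(replicates) / mean(replicates)


# Hypothetical replicate fluorescence readings of a diluted pooled sample.
pool_replicates = [80.0, 100.0, 120.0]
cv = coefficient_of_variation(pool_replicates)
print(f"CV = {100 * cv:.0f}%")
```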
The need for improved blood markers for pancreatic cancer is great. Such markers would have important applications in the detection and diagnosis of the disease, leading to improved patient management and outcomes. The sub-optimal performance of the CA 19-9 assay may, in some cases, be due to the appearance of the CA 19-9 carbohydrate antigen on carrier proteins that are not specific to cancer. By detecting the antigen specifically on the proteins that are the predominant carriers in cancer, improved performance may result. We examined this possibility using antibody arrays with glycan detection, which provided a convenient approach to measuring the CA 19-9 antigen on multiple, individual proteins. We found that the mucins MUC1, MUC5AC, and MUC16 are indeed major cancer-associated carriers of CA 19-9, but because of the diversity among patients in the proteins that carry CA 19-9, the detection of CA 19-9 on any single protein did not out-perform total CA 19-9. However, for individual patients with low CA 19-9 in whom a predominant carrier was identified, selective discrimination from the pancreatitis controls was possible. A combination marker comprising total CA 19-9 plus CA 19-9 on selected proteins could yield improved sensitivity of cancer detection over total CA 19-9 alone. Similar results were observed in two independent sample sets from two different institutions. This work demonstrates the potential of improving detection accuracy using glycan measurements on individual proteins.
A new biomarker to more sensitively detect cancer relative to benign disease conditions could be significant in a variety of ways. A possible area of application would be to diagnose patients that have pancreatic abnormalities as discovered by CT scan. Several conditions in addition to malignancies produce abnormal pancreatic findings by CT
Future work in the development of a biomarker includes further validating and characterizing the improved sensitivity of the current marker panel and determining the panel's ability to meet the performance needs of specific clinical applications. The most effective validation will make use of samples that were collected in the clinical setting and patient population intended for eventual use, in this case patients with pancreatic abnormalities who are being considered for referral for further diagnostic workup. In addition, it will be important to develop clinical assays for these markers. Clinical assays would ensure lack of interference from potentially confounding factors and would provide the precision and control over variability that are required to fully assess marker performance.
Further biomarker discovery could be targeted to the subgroup not detected by the panel (
Improving the limit of detection of the analytical assay may also enhance the ability to detect the cancer patients. Some of the patients not detected in this study may have mucin proteins secreted into the blood but at very low levels, which might be detectable given a very sensitive assay. This point may be especially important for early-stage cancer patients, who are likely to have lower concentrations of tumor markers. Our data show that we detect a subset of early-stage cancer patients (
The subgroups identified in this work may represent biologically distinct subgroups of pancreatic cancer that have clinical implications. Studies of cancers of other organs have identified subcategories of disease defined by molecular characteristics
The approach to biomarker development demonstrated here may be useful in other biomarker applications. The detection of glycans on specific proteins may yield greater accuracy for a variety of disease states than by detecting just protein levels, as with standard immunoassays, or just the levels of a particular glycan on all proteins, as with the conventional CA 19-9 assay. The antibody-lectin sandwich array provides an ideal format for testing combinations of proteins and glycans for such investigations
In summary, an improvement over the conventional CA 19-9 assay may be achievable by detecting the CA 19-9 antigen on specific proteins rather than on all protein carriers. The identification of subgroups of patients based on CA 19-9 carrier status suggests biologically distinct entities of the disease that will be optimally detected by complementary markers. Using a combination of total CA 19-9 and CA 19-9 on individual proteins, the sensitivity of cancer detection was improved relative to CA 19-9 alone in two independent sample sets from two different institutions, achieving 67-80% sensitivity at 98% specificity. The expansion of this panel with additional glycans and protein carriers should further improve performance. Validation will be performed using blinded samples collected from the setting of the intended clinical application, in accordance with the developed standards for biomarker validation
All sample collection and research was conducted under protocols approved by the Institutional Review Boards at Evanston Northwestern Healthcare, the University of Michigan Medical School, the University of Pittsburgh School of Medicine, and the Van Andel Research Institute. Written, informed consent was obtained from all participants in the study.
Serum samples from Evanston Northwestern Healthcare and the University of Michigan Medical School and plasma samples (using EDTA as the anti-coagulant) from the University of Pittsburgh School of Medicine were collected from pancreatic cancer, pancreatitis and healthy subjects (
The antibodies and lectins were obtained from various sources (see
Approximately 170 pg (350 pl at 500 µg/ml or 700 pl at 250 µg/ml) of each antibody was spotted onto ultra-thin nitrocellulose-coated microscope slides (PATH slides, GenTel Biosciences). The slides used in replicates 1 and 2 of sample set 1 were printed at GenTel Biosciences (Madison, WI) with a non-contact microarrayer (sciFLEXARRAYER, Scienion); the slides for all other experiments were printed with a contact microarrayer (2470, Aushon Biosciences). Forty-eight identical arrays containing triplicates of all antibodies were printed on each slide. Hydrophobic borders were imprinted around each array using a stamping device (SlideImprinter, The Gel Company, San Francisco, CA).
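The spotted mass quoted above can be checked with a short unit conversion: 1 pl is 10^-9 ml and 1 µg is 10^6 pg, so picolitres multiplied by µg/ml and divided by 1000 gives picograms. The helper name below is illustrative only.

```python
def spotted_mass_pg(volume_pl, concentration_ug_per_ml):
    """Mass deposited per spot in picograms.
    pl * (ug/ml) = 1e-9 ml * 1e6 pg/ml = 1e-3 pg, hence the division."""
    return volume_pl * concentration_ug_per_ml / 1000


# Both printing conditions give the same mass, about 170 pg per spot.
print(spotted_mass_pg(350, 500))
print(spotted_mass_pg(700, 250))
```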
Microarray sandwich assays were performed to measure either the level of total CA19-9 or the glycan levels on the proteins captured by the immobilized antibodies (
The measurement of glycans using lectin detection on the captured proteins (
Pearson correlations, Student's t-tests, and receiver operating characteristic (ROC) analyses were calculated using Microsoft Excel. The scatter and box plots were created using OriginPro 8, and figure production was performed using Canvas X.
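The three statistics named above were computed in Excel by the authors; the following is only an illustrative sketch of the same quantities in plain Python, not a reproduction of their workflow. It assumes the equal-variance (pooled) form of the two-sample t statistic and computes the ROC area as the probability that a case value exceeds a control value, with ties counted as one half. All function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation coefficient between paired measurements."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def student_t(a, b):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def roc_auc(cases, controls):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical marker values, for illustration only.
cancer = [2.0, 4.0, 6.0]
pancreatitis = [1.0, 2.0, 3.0]
print(pearson_r(cancer, pancreatitis))
print(student_t(cancer, pancreatitis))
print(roc_auc(cancer, pancreatitis))
```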