Autoantibody Profiling for Lung Cancer Screening: Longitudinal Retrospective Analysis of CT Screening Cohorts

Recommendations for lung cancer screening present a tangible opportunity to integrate predictive blood-based assays with radiographic imaging. This study compares the performance of autoantibody markers from prior discovery in sample cohorts from two CT screening trials. Samples from 180 non-cancer controls and from 6 prevalence and 44 incidence cancer cases in the Mayo Lung Screening Trial were tested using a panel of six autoantibody markers to define a normal range and assign cutoff values for class prediction. A cutoff for minimal specificity and best achievable sensitivity was applied to 256 samples drawn annually for three years from 95 participants in the Kentucky Lung Screening Trial. Data revealed a discrepancy in quantile distribution between the two apparently comparable sample sets, which skewed the assay’s dynamic range towards specificity. This cutoff offered 43% specificity (102/237) in the control group and accurately classified 11/19 lung cancer samples (58%), which included 4/5 cancers at time of radiographic detection (80%) and 50% of occult cancers up to five years prior to diagnosis. An apparent ceiling in assay sensitivity is likely to limit the utility of this assay in a conventional screening paradigm. Pre-analytical bias introduced by sample age, handling or storage remains a practical concern during development, validation and implementation of autoantibody assays. This report does not draw conclusions about other logical applications for autoantibody profiling in lung cancer diagnosis and management, nor about its potential when combined with other biomarkers that might improve overall predictive accuracy.

A panel of six autoantibody markers was used to assay samples from the Mayo Clinic CT screening trial, to gather normal distribution values, and to generate a cutoff value that might be used to improve the efficiency of lung cancer screening. Established cutoff values were applied to 285 samples from 95 participants of a regional CT screening study in the 5th district of Kentucky (Appalachia). The primary objective of the study was to determine the ability of an autoantibody profile to detect lung cancers at the time of or before CT scan. The uniformity of sample collection and study entry criteria was an important standard for analysis within and between the two screening sample cohorts. Class prediction in sample sets composed predominantly of occult lung cancers (prior to radiographic detection) is a unique aspect of this analysis. Accurate classification of stage I screening-detected cancers was a secondary metric.
Samples were collected under protocols approved by accredited Institutional Review Boards (Mayo Clinic IRB and University of Kentucky IRB). All subjects provided written informed consent prior to any research procedures. This research was approved by respective IRBs and was conducted according to Institutional Review Board regulations and oversight.

Mayo cohort
The Mayo Lung Screening Trial performed five annual CTs on 1520 subjects with a minimum 20 pack-year smoking history, age 50-75, and no other malignancy within five years of study entry [16,17]. Cancer rates were 2.6% at 3 years, rising to 4% at 5 years of screening. A single blood sample was drawn at study entry. The sample cohort comprised 180 non-cancer controls, six stage I prevalence lung cancers, and 44 lung cancers diagnosed 12 to 60 months from blood draw [16,17].

Kentucky cohort
The Marty Driesler Lung Screening Project was a community-based CT screening study that accrued 254 at-risk subjects from Eastern Kentucky between 2005 and 2008 [18]. Eligibility criteria included age 55 to 75 years, a 30 pack-year history of smoking, and no other malignancy within five years of study entry. Cancer rate was 2.6%. All subjects provided written informed consent prior to any research procedures.
Since analysis of all available samples was cost-prohibitive, a sample set of two hundred fifty-six samples from ninety-five participants was constructed by an independent investigator and analyzed in a blinded fashion. The test cohort of nineteen lung cancer samples included five stage I screening-detected lung cancers (three prevalence, two incidence), and four lung cancers diagnosed clinically one to five years after the last serial screening CT and corresponding blood sample. One case of head and neck cancer was diagnosed during the screening period, and six other non-thoracic malignancies were diagnosed up to five years from the last lung cancer screening CT. All cancer cases are summarized in Table 1. One or more non-malignant pulmonary nodules were noted in 56% of the study cohort. Dominant non-malignant radiographic findings included emphysema, mediastinal adenopathy and granulomatous disease.

Assay composition and procedures
Marker discovery, measurement and statistical analysis have been described previously [7][8][9]. The marker panel comprised six individual tumor-associated autoantibodies that offered robust discrimination between cancer and non-cancer samples in prior analysis; these six also provided consistent performance as a combined measure in a single assay based on receiver operating characteristic area under the curve. T7-phage-expressed capture proteins were derived from cDNA tumor libraries [7][8][9]. These putative autoantibody markers corresponded to apurinic/apyrimidinic endonuclease-1 (APEX1), nucleolar and coiled-body phosphoprotein 1 (NOLC1), splicing factor 3a (SF3A3), paxillin (PXN), BAC clone R-580E16 (unknown protein product) and mitochondrial 16S ribosomal RNA (MT-RNR2) [7,8 and unpublished]. All phage-expressed capture proteins were covalently bound to Luminex microspheres for multiplex analysis using commercially available protocols. Autoantibody levels were quantified using biotinylated anti-human IgG and R-phycoerythrin-labeled streptavidin. The mean absolute fluorescence for each marker was calculated from triplicate measurements for each sample. No-sample controls included in each run consistently measured near zero.
A single absolute fluorescence value was generated for each sample using the sum from individual markers. A cutoff value of 640, corresponding to the lower quartile (setting specificity at 25%), would be expected to maximize capacity for detecting cancer at the earliest stages of disease while still providing an improved ratio of scans performed to cancers detected. That cutoff was applied to class prediction in the Kentucky CT screening cohort. Relevant points of data analysis included distribution in the at-risk population and comparability to the Mayo Clinic cohort, consistency of annual measures from individual subjects, and accurate classification of cancer samples at the time of and prior to radiographic detection.
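The composite measure described above can be illustrated with a minimal sketch. It assumes that a sample is called positive when its summed fluorescence strictly exceeds the cutoff; the function names, the dictionary layout and the triplicate readings are hypothetical and are not study data.

```python
from statistics import mean

# Marker labels as named in the text; the dictionary layout is an assumption.
MARKERS = ("APEX1", "NOLC1", "SF3A3", "PXN", "R-580E16", "MT-RNR2")
CUTOFF_FU = 640  # lower quartile of the Mayo non-cancer controls


def composite_score(triplicates):
    """Sum, over the six markers, of the mean of each marker's triplicate readings."""
    return sum(mean(triplicates[marker]) for marker in MARKERS)


def predict_class(score, cutoff=CUTOFF_FU):
    """Assumed rule: positive when the composite score is strictly above the cutoff."""
    return "positive" if score > cutoff else "negative"


# Made-up triplicate readings for one sample (illustration only).
sample = {
    "APEX1": [120, 130, 125],
    "NOLC1": [90, 95, 100],
    "SF3A3": [200, 210, 190],
    "PXN": [60, 70, 65],
    "R-580E16": [110, 100, 105],
    "MT-RNR2": [150, 160, 155],
}

score = composite_score(sample)
print(score, predict_class(score))
```

Summing the markers gives one value per sample, so a single threshold can be carried from the training cohort to the test cohort without refitting.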

Results
The additive sum of absolute fluorescence from six markers was used as an intuitive measure of overall autoantibody reactivity to provide a single value for each sample, define distribution in the at-risk population, and assign cutoffs for cancer prediction in an independent cohort. The median value across 180 non-cancer samples from the Mayo Clinic sample cohort was 1126 fluorescent units (FU), with 25%/75% quartile values of 640 and 2076 FU respectively; there was one extreme outlier. A cutoff of 640 FU offered 88% sensitivity across fifty cancer samples in the Mayo cohort, which included accurate classification of 6/6 established stage I cancers and 38/44 samples drawn one to five years prior to radiographic appearance. By comparison, the median value across 237 non-cancer samples from the Kentucky cohort was 726 FU, with 25%/75% quartile values of 461 and 1249 FU respectively, roughly one third lower than measured in the Mayo Clinic sample cohort. A contingency chart (Table 2) shows class prediction in the Kentucky cohort at the predetermined cutoff of 640 FU, and also lays bare the effect of inflated cutoff values on sensitivity and specificity that resulted from the discrepancy between the training and testing cohorts. The cutoff of 640 FU accurately classified 102/237 non-lung cancer samples (43%) and 11/19 cancer samples (58%), which included 4/5 stage I lung cancers (80%) and 7/14 occult cancer samples (50%) drawn one to five years prior to radiographic appearance. Class prediction and the temporal relationship of sample draw to cancer diagnosis are summarized in Table 1.
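The proportions reported in this paragraph can be rechecked from the stated counts. The short sketch below only restates that arithmetic; the variable names are illustrative, and all counts are the ones given in the text.

```python
def percent(numerator, denominator):
    """Proportion expressed as a percentage."""
    return 100 * numerator / denominator


# Mayo cohort at 640 FU: 6/6 stage I cancers plus 38/44 pre-diagnostic samples.
mayo_sensitivity = percent(6 + 38, 6 + 44)

# Kentucky cohort at 640 FU, counts as reported.
ky_specificity = percent(102, 237)
ky_sensitivity = percent(11, 19)
ky_stage_one_sensitivity = percent(4, 5)
ky_occult_sensitivity = percent(7, 14)
ky_false_positives = 237 - 102

# Relative difference between the cohort medians of the non-cancer samples.
median_drop = percent(1126 - 726, 1126)

print(f"Mayo sensitivity: {mayo_sensitivity:.0f}%")
print(f"Kentucky specificity: {ky_specificity:.0f}%")
print(f"Kentucky sensitivity: {ky_sensitivity:.0f}%")
print(f"Kentucky false positives: {ky_false_positives}")
print(f"Median lower by: {median_drop:.1f}%")
```

The median difference works out to about 35.5%, consistent with the description of the Kentucky values as roughly one third lower.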
Squamous and adenocarcinoma histologies were both represented among the true positives; there was nothing uniquely apparent about false negative samples. Other cancers accounted for 13/135 false positive measures (Table 1). Six of the seven independently diagnosed non-thoracic malignancies in the KY cohort measured positively in one or more annual samples. The single highest value was a subject lost to follow-up after prevalence screening who was diagnosed with extranodal marginal zone B-cell lymphoma (MALT) five years after enrollment. Benign intrathoracic findings were common to subjects with false positive and true negative measures. The majority of false positives represented persistent elevations across serial screening cycles. Among the 130 false positive samples (>640 FU) in subjects with at least two annual samples, only six (4.6%) were singular events within the series of two or more annual measures.
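One way to separate singular from persistent positive measures across serial samples is sketched below. This is an illustrative reading of the tally, not the study's documented procedure: it assumes the input holds only non-cancer subjects, treats a positive as "singular" when it is the only value above the cutoff in a series of two or more annual samples, and uses invented subject identifiers and values.

```python
CUTOFF_FU = 640


def count_positive_patterns(series_by_subject, cutoff=CUTOFF_FU):
    """Count positive samples that are isolated versus repeated within a subject.

    Subjects with fewer than two annual samples are skipped, mirroring the
    restriction to series of two or more annual measures.
    """
    singular = 0
    persistent = 0
    for scores in series_by_subject.values():
        if len(scores) < 2:
            continue
        positives = sum(score > cutoff for score in scores)
        if positives == 1:
            singular += 1
        elif positives > 1:
            persistent += positives
    return singular, persistent


# Invented annual composite scores in FU (illustration only).
example = {
    "subject_A": [900, 1100, 950],  # positive at every draw
    "subject_B": [500, 700, 480],   # one isolated positive
    "subject_C": [300, 350],        # serially negative
    "subject_D": [2000],            # single sample, excluded
}

singular, persistent = count_positive_patterns(example)
print(singular, persistent)

# Reported proportion of singular events among false positives: 6 of 130.
reported_singular_percent = round(100 * 6 / 130, 1)
print(reported_singular_percent)
```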

Discussion
Primary objectives were to confirm the principles and precepts of autoantibody profiling and to assess the potential of an autoantibody profile to increase the efficiency and diagnostic accuracy of screening CT. Samples from the Mayo Clinic CT screening trial were used to define the range and distribution of a composite measure within a screening population, and to assign a cutoff value that would allow maximum sensitivity for lung cancers at and below the detectable limits of CT scanning. Distribution measures and relative cutoffs for cancer detection were tested in an independent screening cohort from the 5th district of Kentucky. A cutoff set on the lower quartile of 180 non-cancer controls in the Mayo cohort provided reliable detection of established stage I cancers and capacity to detect a percentage of incidence cancers prior to radiographic appearance in both cohorts. Observed frequency of serially positive and serially negative values across annual repeats in the Kentucky screening cohort suggests that autoantibody levels have a specific biologic basis even when there is no clinically apparent significance to the measure. The assay does not appear specific for lung cancer, although the variety of non-thoracic malignancies precludes any conclusion about histologic specificity. Inflated cutoff values that resulted from the notable discrepancy in the quartile distributions between the two cohorts skewed the dynamic range towards specificity in the Kentucky cohort. Although demographics, differences in eligibility criteria of the two studies and numerous independent clinical variables could account for this discrepancy, neither cohort is adequately sized for multivariable stratification. Conversely, observed differences in two independent but uniformly collected, moderately large and relatively comparable sample sets point strongly to sample age, processing, handling and/or storage as a source of preclinical error.
Specifically, distribution analysis and assignment of cutoff values based on archived samples from two high-risk cohorts seem likely to have identified a biological effect that might not have been recognized with alternate study designs. Despite the presumption that autoantibodies are resilient biomarkers, there is a paucity of data on the consistency of autoantibody measures under various storage conditions and durations. Although limited, the literature indicates that serum antibody levels increase in cryopreserved samples over years of storage, possibly related to antigen-antibody complex dissociation and protein degradation [19,20]. Importantly, the current data show how the validation process can be encumbered by variables unique to archived sample sets, which must be considered when transitioning from laboratory-based analysis to implementation in population-based applications.
Even with allowance for quantifiable preclinical error and the effect of inflated cutoff values on predictive accuracy in the validation set, the data discourage more advanced validation. The appeal of detecting occult disease with lead-time advantage over CT scanning is tempered by excessive false negative rates, which certainly restrict this assay's utility in selecting individuals who most warrant serial imaging [5]. Interpretation of positive measures is further confounded by the apparent lack of specificity for thoracic malignancy.
Provisional assessment of the small number of radiographically detectable cancers in post hoc analysis approximates that of autoantibody profiles independently validated by other groups testing for established cancers [21][22][23]. If by extension we assume the best achievable sensitivity for stage I cancer is 80%, with a corresponding specificity of 40%, expanding our analysis to sample sets with a larger number of established cancers does not seem warranted. Also similar to other assays in the literature, a provisional sensitivity of 40% for established disease corresponds to specificity >90% [21][22][23][24]. Adjusting the cutoff for high specificity seems only to further deviate from a conventional screening paradigm. If used to further stratify cases by probability of cancer, however, a cutoff that favors specificity could mitigate inter-reader variability and reduce the number of false negative readings on screening CT scans [25,26]. A highly specific assay might also help discriminate benign from malignant nodules identified during screening, even though predictive value will be compromised by the promiscuity of the assay for both occult and radiographically apparent disease [27]. In summary, this report does not draw conclusions about future utility of this approach, but this validation study does not seem to support use of this assay as a primary population-based screening tool. Combining additional investigation with knowledge of this assay's performance may identify other logical areas for autoantibody profiling in lung cancer diagnosis and management.