The technical reliability and biotemporal stability of cerebrospinal fluid biomarkers for profiling multiple pathophysiologies in Alzheimer’s disease

Objective: Alzheimer’s disease (AD) is a complex neurodegenerative disease driven by multiple interacting pathophysiological processes that ultimately result in synaptic loss, neuronal death, and dementia. We implemented a fit-for-purpose modeled approach to qualify a broad selection of commercially available immunoassays and evaluate the biotemporal stability of analytes across five pathophysiological domains of interest in AD, including core amyloid-β (Aβ) and tau AD biomarkers, neurodegeneration, inflammation/immune modulation, neurovascular injury, and metabolism/oxidative stress.

Methods: Paired baseline and eight-week CSFs from twenty participants in a clinical drug trial for mild cognitive impairment (MCI) or mild dementia due to AD were used to evaluate sensitivity, intra-assay precision, inter-assay replicability, and eight-week biotemporal stability for sixty unique analytes measured with commercially available single- and multi-plex ELISA assays. Coefficients of variation (CV) were calculated, and intraclass correlation and Wilcoxon signed rank tests were applied.

Results: We identified 32 biomarker candidates with good to excellent performance characteristics according to assay technical performance and CSF analyte biotemporal stability cut-off criteria. These included: 1) the core AD biomarkers Aβ1–42, Aβ1–40, Aβ1–38, and total tau; 2) non-Aβ, non-tau neurodegeneration markers NfL and FABP3; 3) inflammation/immune modulation markers IL-6, IL-7, IL-8, IL-12/23p40, IL-15, IL-16, MCP-1, MDC, MIP-1β, and YKL-40; 4) neurovascular markers Flt-1, ICAM-1, MMP-1, MMP-2, MMP-3, MMP-10, PlGF, VCAM-1, VEGF, VEGF-C, and VEGF-D; and 5) metabolism/oxidative stress markers 24-OHC, adiponectin, leptin, soluble insulin receptor, and 8-OHdG.

Conclusions: Assays for these CSF analytes demonstrate consistent sensitivity, reliability, and biotemporal stability for use in a multiple pathophysiological CSF biomarker panel to profile AD. Their qualification enables further investigation for use in AD diagnosis, staging and progression, disease mechanism profiling, and clinical trials.


Introduction
Alzheimer's disease (AD) and related disorders (including Lewy body dementias, frontotemporal lobar degenerations, and vascular cognitive impairment and dementias) are complex neurodegenerative diseases driven by vicious pathophysiological cycles of protein misfolding [1], inflammation [2,3], neurovascular dysfunction [4], oxidative injury [5,6], and disruption of metabolic pathways [7,8]. To varying degrees, these promote excitotoxicity [9], autophagy [10], apoptosis [11], and necrosis [12], with resultant synaptic loss and neuronal death [13,14]. Progressive dementia of insidious onset is the clinical manifestation of these processes that evolve over years to decades prior to expression of clinical symptoms [15]. While the core amyloid-β (Aβ) and tau biomarkers of AD [16,17] are highly associated with the presence of signature plaque and tangle pathological hallmarks of AD in the brain [16], they do not assess other fundamental biochemical aspects of the disease, such as non-amyloid and non-tau neurodegeneration or metabolic, immune, and neurovascular dysfunction [18][19][20][21]. A panel of robust biomarkers that directly reflect these concomitant pathophysiologies may prove a better indicator of disease diagnosis [22,23], subtypes, staging, and activity, as well as treatment-specific target engagement.
As clinical trials continue to move towards pre-symptomatic individuals and disease prevention, it is critical to establish biofluid markers that may sensitively detect biochemical changes associated with disease onset and the intersecting pathophysiologies that drive disease progression [24,25]. A panel of analytes that represent multiple facets of AD pathology and pathophysiology, as opposed to a single marker, may offer increased specificity and sensitivity in diagnosis [22,23], but more importantly, may also profile AD subtypes to enable precision medicine. Current clinical trials for AD focus on disease modification with investigational new agents that target general neuroprotection, neuroinflammation, metabolic and oxidative dysfunction, and neurovascular injury. Determining how active each of these pathophysiologies is in a given patient with pathway-oriented biomarkers may help guide choice of therapy, and would also be important for demonstrating and monitoring the effects of pharmacodynamic target engagement.
A key consideration for the development of any biomarker is the reproducibility of results within and across times and laboratories. While the core AD biomarkers have been extensively evaluated in terms of precision, pre-analytical factors affecting measures, and utility in clinical research and care [21,26–29], validations for newer candidate cerebrospinal fluid (CSF) biomarkers are still in an early phase. Of particular importance for clinical trials is confidence in the stability of biomarkers over short-term repeat collections from the same individual, independent of disease-related pathophysiological changes [30,31]. Highly dynamic analytes whose levels fluctuate widely from day to day as a result of diet, restless sleep, or other biorhythms or environmental influences, would not prove reliable in a clinical trial where a limited number of samples are collected over long periods of time. Like technical precision of an assay, this "biotemporal stability" of a biomarker needs to be considered in sample size determinations and study design so the measurements reflect true disease progression or drug effect as opposed to random noise. Low baseline variability over brief periods of time, as has been established with the core AD biomarkers [32][33][34][35][36], increases the value of a CSF biomarker as a routine measurement in a clinical setting, suggesting that it may be sensitive enough to detect disease-related differences, disease progression over longer periods of time, or biochemical changes in response to intervention [31]. While there are a few reports describing short-term intra-individual variation of AD-relevant biomarkers in serum [31,37] and over longer intervals (annual) in CSF [35], stability over shorter intervals in CSF is rarely investigated or reported. Opportunities to conduct such analyses are scarce, as multiple lumbar punctures within short timeframes are seldom performed, even in the context of clinical research.
We implemented a fit-for-purpose modeled approach for evaluating and qualifying a broad selection of commercially available immunoassays for exploratory use with CSF. The assays represent five key domains of AD pathophysiology: 1) core AD amyloid-β and total tau biomarkers, 2) non-Aβ and non-tau neurodegeneration, 3) inflammation and immune modulation, 4) neurovascular markers, and 5) metabolism and oxidative stress. We identified 32 biomarker candidates with excellent performance characteristics by assessing technical assay reliability and biotemporal intra-individual variation of each biomarker in CSFs collected at two timepoints over an eight-week interval from individuals with mild cognitive impairment (MCI) or mild dementia due to AD. These candidate analytes may be used to develop a broad, practical biomarker panel that simultaneously portrays the diverse pathophysiological processes involved in AD and related disorders. These rigorously validated analytes should perform consistently and reliably in profiling the complex pathophysiology of AD and monitoring changes during disease progression and intervention, ultimately enabling clinical trials and allowing personalized treatment.

Study participants
CSFs were obtained at baseline and after 8 weeks as part of a pilot randomized placebo-controlled clinical trial investigating the effects of the drug metformin (versus placebo) in MCI/AD (NCT01965756) [38]. All subjects provided written informed consent for participation and use of CSF in future research in Human Subjects Institutional Review Board (IRB)-approved protocols at the University of Pennsylvania. All subjects had a clinical diagnosis of amnestic MCI or mild dementia due to AD. Demographic and clinical characteristics are presented in Table 1. [...] [41] total < 6 to exclude concomitant depression, Modified Hachinski Ischemic Scale [42] score < 4 to exclude subjects with potential vascular etiology to their cognitive complaints, fasting blood glucose < 110 or HgbA1c < 6.0 to exclude subjects with diabetes or prediabetes, and at least one positive biomarker consistent with AD (i.e. previous Aβ CSF, fluorodeoxyglucose or Aβ positron emission tomography, or volumetric MRI). Individuals on an acetylcholinesterase inhibitor were required to be on a stable dose for at least 2 months prior to screening. Based on prerequisites for the intervention under investigation, potential subjects were ineligible if they had past or current diabetes or renal disease, evidence of infarcts, focal intracranial lesions or other neurodegenerative conditions, or unstable medical or psychiatric illness. A total of 20 participants were enrolled: 9 women and 11 men, with a mean age of 70.1 years. APOE ε4 carrier status was inferred following the conclusion of the study using a validated immunoassay (K4699, BioVision) to measure Apolipoprotein E (ApoE) ε4 in plasma according to the manufacturer's instructions.

Cerebrospinal fluid collection and measurement overview
CSF samples were collected by lumbar puncture (LP) at baseline and again at 8 weeks with adherence to the ADNI protocol (http://www.adni-info.org/). The procedure was performed between 8 a.m. and 10 a.m. in all participants to minimize effects of diurnal variation. Using 24-gauge Sprotte® atraumatic needles, 20 mL of CSF was collected into polypropylene syringes according to standard procedure. Within 30 minutes of collection, CSF was aliquoted into 0.5 mL polypropylene tubes, bar-coded, frozen, and stored at -80 °C for subsequent analysis.
CSFs were tested in duplicate in each of three experimental blocks to assess the intra- and inter-plate reliability of candidate assays. Repeat CSFs from 9 subjects in the 8-week placebo arm were used in paired assays and tested in each block to determine short-term intra-individual variation. After initial data inspection, samples from one participant in the metformin group were disregarded due to hemolytic contamination, which resulted in artificially elevated analyte concentrations. Assays were conducted in the Arnold Lab at the Massachusetts General Hospital Institute for Neurodegenerative Diseases (MIND). In addition, amyloid-β peptide 1–42 (Aβ1–42), total tau (tTau), and phospho-tau (pTau) data were available from prior testing in these samples at the ADNI Biomarker Core / Shaw Lab at the University of Pennsylvania [38].

Biochemical procedures
The performance characteristics of 60 unique potential biomarker analytes were evaluated in twenty single or multi-plexed panel kits (Table 2). CSF concentrations of 54 analytes were examined using the Meso-Scale Discovery (MSD) platform with commercially available simplex and multi-plex electrochemiluminescent (ECL) immunoassays (Meso-Scale Diagnostics, LLC, Rockville, MD). Along with replicating the core AD biomarkers Aβ1–42 and tTau, we tested other potential analytes in the pathophysiological domains of neurodegeneration, metabolism and oxidative damage, neurovascular injury, and inflammation/immune modulation. Fourteen MSD assay panels for biomarkers of interest were chosen based on a number of factors including a priori pathophysiological relevance, assay availability, previous assay use in the literature, specific reported features of concentration sensitivity and range, previous use in CSF (if any), and previous findings in AD. An additional 6 colorimetric ELISA kits were used to measure markers not available through MSD. All kits were purchased in bulk to minimize lot-to-lot variability. Assays were performed according to manufacturer's specifications and samples were diluted per assay requirements. [...] (MRD) prior to final analysis (designated with a † in Table 2). For many of these assays, CSF samples were measured neat to allow concentrations to fall within a detectable range.

The plate design scheme for sample wells and reliability measures is shown in Fig 1. Each plate contained the entire cohort of CSF samples, and all paired CSF samples per subject were included on the same assay in adjacent wells. This design ensured that plate-to-plate variability would not compromise biotemporal analysis of analyte concentrations between paired repeat-collection CSFs. Each CSF sample was assayed in duplicate per plate, and each plate experiment was replicated three times. A different 0.5 mL aliquot of sample was used for each assay replicate. Serially diluted standard curves and spiked controls were included on every plate and measured in duplicate.

Neuroinflammatory and vascular injury assays
Numerical data obtained on the MSD platform were generated using the MSD Discovery Workbench 4.0 software. For colorimetric ELISA kits, absorbance values were collected using a microplate reader (Victor 2 Multi-Label Microplate Reader, Perkin Elmer-Wallac), and sample concentrations were calculated manually against the standard curve using four-parameter logistic (4PL) regression.
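As an illustration of how a concentration can be read off a four-parameter logistic standard curve, the following is a minimal Python sketch. The parameterization (a = response at zero concentration, d = response at infinite concentration, c = inflection point, b = slope) is a common convention and is an assumption here; the fitted parameter values shown are hypothetical, and the source does not describe the fitting procedure itself.

```python
def four_pl(conc, a, b, c, d):
    """Predicted response for a concentration under the 4PL model (assumed form)."""
    return d + (a - d) / (1.0 + (conc / c) ** b)


def inverse_four_pl(response, a, b, c, d):
    """Back-calculate the concentration that yields a given response."""
    low, high = min(a, d), max(a, d)
    if not low < response < high:
        # Responses outside the asymptotes cannot be interpolated.
        raise ValueError("response lies outside the range of the standard curve")
    return c * ((a - d) / (response - d) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters for one plate's standard curve.
params = dict(a=0.05, b=1.2, c=100.0, d=2.5)
absorbance = four_pl(50.0, **params)
print(round(inverse_four_pl(absorbance, **params), 6))
```

A response at or beyond either asymptote raises an error rather than returning an extrapolated value, which mirrors the usual practice of not reporting concentrations outside the standard curve.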
A number of assays examined here had not been reported previously in CSF. Consistent detection of an analyte in plasma does not ensure detection in CSF, as the protein may not be generated in the CNS, its levels may be altered by blood brain barrier (BBB) selectivity, and/or its detection may be compromised by matrix effects. To verify whether each assay had sufficient sensitivity to detect an analyte(s) in CSF, we compared sample concentrations of analytes to the assay's lower limit of detection (LLOD), specific to each individual plate and analyte. The LLOD was calculated by the Discovery Workbench 4.0 software as 2.5 standard deviations above the background signal. For colorimetric assays, the published LLOD was used as the assay's sensitivity threshold. A biomarker assay was considered to be satisfactorily sensitive if it was consistently measured above the assay LLOD in > 80% of samples across all plates.
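The sensitivity criterion above can be sketched as follows. This is a minimal illustration, assuming each measurement is stored as a (concentration, plate-specific LLOD) pair and that a missing signal is recorded as None; the function names and data layout are hypothetical and are not taken from the study's analysis pipeline.

```python
def fraction_above_llod(measurements):
    """Fraction of measurements whose concentration exceeds the plate LLOD.

    measurements: iterable of (concentration, plate_llod); a concentration
    of None stands for a well with no detectable signal (an assumption).
    """
    measurements = list(measurements)
    detected = sum(
        1 for conc, llod in measurements if conc is not None and conc > llod
    )
    return detected / len(measurements)


def is_sensitive(measurements, threshold=0.80):
    """Apply the > 80% detection criterion described in the text."""
    return fraction_above_llod(measurements) > threshold


# Illustrative values only: 59 of 70 measurements above a plate LLOD of 1.0.
example = [(2.0, 1.0)] * 59 + [(0.5, 1.0)] * 11
print(round(100 * fraction_above_llod(example), 2), is_sensitive(example))
```

Because the LLOD is carried with each measurement, the same function works when the limit differs from plate to plate, as it does for the MSD assays.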
For use in any comparative study, it is critical that repeat measures are technically reliable. Moreover, to detect change with treatment in the context of typically slow disease progression, CSF levels should be biologically consistent and not fluctuate widely over short periods of time. An experimental scheme was designed to assess all forms of assay stability for analytes that passed LLOD criteria: intra-assay, inter-assay, and biotemporal stability (Fig 1).
Intra-assay reliability was assessed by calculating the median coefficient of variation (CV), or relative standard deviation, for duplicate sample concentrations within each of the 3 plates. Inter-assay reliability was assessed by calculating CV values for duplicate samples across three experimental blocks and tested statistically using intra-class correlation coefficients (ICCs) [45,46]. ICC estimates and 95% Confidence Intervals (CIs) were calculated using SPSS Statistics 24.0 based on ICC(3,3), a mean-rating, absolute agreement, 2-way mixed-effects model (recommended in [47]). Mean normalization was first applied to each plate to correct for day-to-day technical variation, before calculation of the CV. Analytes with both intra- and inter-assay CVs below 15% were considered to have acceptable technical reliability. Data were natural-logarithm transformed prior to ICC analysis. ICCs were interpreted as follows: ICCs > 0.75 indicated strong reliability, ICCs between 0.5 and 0.75 reflected moderate to good reliability, and ICCs < 0.5 indicated poor reliability [48].
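The CV calculations can be illustrated with a short Python sketch. Two points are assumptions rather than details given in the text: the CV uses the sample standard deviation, and mean normalization rescales each plate so that its mean equals the grand mean across plates. All function names are hypothetical.

```python
from statistics import mean, median, stdev


def cv_percent(values):
    """Coefficient of variation (%), using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def median_duplicate_cv(duplicates):
    """Median CV (%) over (replicate_1, replicate_2) pairs from one plate."""
    return median([cv_percent(pair) for pair in duplicates])


def mean_normalize(plates):
    """Rescale each plate so its mean equals the grand mean (assumed scheme)."""
    grand = mean([value for plate in plates for value in plate])
    return [[value * grand / mean(plate) for value in plate] for plate in plates]


def passes_cv_cutoff(intra_cv, inter_cv, cutoff=15.0):
    """Both intra- and inter-assay CVs must fall below the 15% cut-off."""
    return intra_cv < cutoff and inter_cv < cutoff


# Illustrative duplicate concentrations from a single plate.
plate = [(9.0, 11.0), (20.0, 20.4), (5.0, 5.2)]
print(round(median_duplicate_cv(plate), 2))
```

With only two replicates per well pair, the CV is sensitive to single outlying wells, which is one reason a median rather than a mean across samples is a reasonable summary.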
Samples from subjects in the placebo group were used for the intra-individual biotemporal variation assessment of each analyte that passed the technical precision criteria. Paired CSFs collected at baseline and after the first treatment block (8 weeks apart) were used to calculate CV values representing the biotemporal stability of analytes. The mean concentration for each sample across the three experimental blocks was used for analysis. ICCs were not calculated for biotemporal variation as the sample size was not sufficient to satisfy test conditions. Because analyte distributions could not be assumed to be normal, a Wilcoxon Signed-Ranks Test was used to assess whether there were significant differences between CSF collection timepoints. Finally, an analyte with a calculated biotemporal CV below 15% was considered to have low baseline variability.
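A minimal sketch of the biotemporal assessment follows: a within-subject CV for each baseline/8-week pair, summarized by the median, and an exact two-tailed Wilcoxon signed-rank test obtained by enumerating all sign assignments (feasible for the small placebo group). The handling of zero differences (dropped) and ties (average ranks) is an assumption, as are the function names; the study's own calculations may differ in these details.

```python
from itertools import product
from statistics import mean, median, stdev


def biotemporal_cv(baseline, followup):
    """Median within-subject CV (%) across paired baseline/follow-up values."""
    return median(
        [100.0 * stdev([b, f]) / mean([b, f]) for b, f in zip(baseline, followup)]
    )


def wilcoxon_exact(baseline, followup):
    """Exact two-tailed Wilcoxon signed-rank p-value for paired samples."""
    diffs = [f - b for b, f in zip(baseline, followup) if f != b]
    n = len(diffs)
    if n == 0:
        return 1.0
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    expected = sum(ranks) / 2
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = abs(w_plus - expected)
    # Count sign assignments at least as extreme as the observed statistic.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - expected) >= observed - 1e-12:
            extreme += 1
    return extreme / 2 ** n


# Illustrative paired concentrations for nine subjects.
base = [10.0, 12.0, 9.5, 11.0, 10.5, 13.0, 8.0, 9.0, 10.0]
week8 = [10.4, 11.5, 9.9, 11.2, 10.1, 13.6, 8.3, 8.8, 10.7]
print(round(biotemporal_cv(base, week8), 2), wilcoxon_exact(base, week8))
```

Full enumeration grows as 2^n, so this approach suits only small paired samples such as the one described here.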

Assay sensitivity for analytes in CSF
A total of 35 CSF samples collected from twenty study participants were included in this analysis. Five samples were unavailable due to an unsuccessful lumbar puncture or subject refusal. Assay sensitivity was assessed for twenty single-plex and multi-plex assays used in this study ( Table 2). Percentages of sample measurements with concentrations falling above the LLOD were calculated using concentrations from all sample replicates across the three experimental blocks. In total, 105 measurements per biomarker were taken into consideration to determine whether the target protein could be reliably detected and measured in CSF (S1 Table and S1 Fig).
Four of the ten metabolic markers were measurable in all CSF samples: 24S-hydroxycholesterol (24-OHC), adiponectin, soluble insulin receptor (sIR), and 8-hydroxydeoxyguanosine (8-OHdG). Leptin was measurable in 84.29% (59/70) of included samples. Active glucagon-like peptide-1 (GLP-1), glucagon, and carboxymethyl lysine (CML) were not detectable in any CSF samples using the applied methods. Insulin was not considered measurable in CSF (only 54.29% of sample concentrations fell above the LLOD), but because of interest in this metabolic hormone in AD [7,8] we pursued it with three alternative commercial assays, though none yielded better results. The VGF nerve growth factor inducible (VGF) immunoassay was not reproducible in our hands and was discontinued (S1 Table).

Technical precision: Intra-assay reliability
Analyses of intra-assay reliability were performed only if biomarker candidates were measurable in > 80% of cases in the initial sensitivity screening. Analytes with marginal concentrations in CSF might be expected to exhibit poor precision and reliability due to their proximity to the LLOD [49]. Therefore, insulin, active GLP-1, glucagon, CML, MCP-4, IL-17A, IL-1α, TNF-β, IL-1β, IL-2, IL-4, IL-13, and Tie-2 were excluded from subsequent analyses.

Technical replicability: Inter-assay reliability
Inter-assay reliability was assessed using ICC(3,3), taking into account the mean concentrations of each sample (n = 35) measured across three experimental blocks. The overall inter-assay CV for each biomarker candidate was evaluated by calculating the CV between all technical replicates across the three experimental blocks after applying a mean-normalization adjustment to sample concentrations on repeated assays. Both analytical methods were used to stringently identify candidates with the best overall inter-assay replicability (i.e. exhibiting both ICC reliability > 0.5 and an inter-assay CV < 15%). Failure to meet performance criteria under either analysis served as an indication of poor inter-assay replicability.
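For readers who wish to reproduce an ICC of this kind outside SPSS, the following Python sketch computes a two-way, absolute-agreement, average-measures ICC from ANOVA mean squares, using the formula (MSR - MSE) / (MSR + (MSC - MSE) / n). That this matches the exact SPSS computation used in the study is an assumption, and the function names are hypothetical. Rows are samples and columns are experimental blocks.

```python
import math


def icc_absolute_agreement_mean(data):
    """Two-way, absolute-agreement, average-measures ICC (assumed formula).

    data: list of rows, one per sample, each holding one value per block.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    # Mean squares for samples (rows), blocks (columns), and residual error.
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    sse = sum(
        (data[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (msc - mse) / n)


def log_transform(data):
    """Natural-log transform applied before ICC analysis, as in the text."""
    return [[math.log(value) for value in row] for row in data]


def interpret_icc(icc):
    """Interpretation bands stated in the methods."""
    if icc > 0.75:
        return "strong"
    if icc >= 0.5:
        return "moderate to good"
    return "poor"


# Illustrative concentrations: four samples measured in three blocks.
blocks = [[10.0, 11.0, 10.5], [20.0, 21.5, 19.0], [5.0, 5.5, 5.2], [40.0, 38.0, 41.0]]
icc = icc_absolute_agreement_mean(log_transform(blocks))
print(round(icc, 3), interpret_icc(icc))
```

Unlike a consistency-type ICC, the absolute-agreement form is lowered by a systematic offset between blocks, which is why it is paired here with a separate CV computed after mean normalization.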
Raw data for all analytes included in the study are provided in S3 Table.

Discussion
We performed comprehensive fit-for-purpose testing of commercially available immunoassays in CSF and identified a broad selection of reliably measured, biotemporally stable markers that represent multiple pathophysiological domains of interest in AD. In examining both technical reliability and biotemporal stability in CSF (S2 Fig), we established baseline variability in analyte measures, a prerequisite for the design and interpretation of biomarker measures in studies of differential diagnosis, disease staging, and tracking of longitudinal change with disease progression or intervention response. Biotemporal stability is often overlooked as a measure of assay reliability, but is important as metabolic, vascular, inflammatory, and other markers may fluctuate or otherwise vary over time. Analyte levels might be affected by circadian or seasonal rhythms, diet, environmental stressors, and intercurrent health issues. Such factors may compromise the comparative utility of these biomarkers, underscoring the importance of determining intra-individual variation prior to their inclusion as biomarkers in longitudinal studies or clinical trials. Determining baseline fluctuation of analytes also has statistical implications for the interpretation of both cross-sectional and longitudinal data, and can inform study design and the sample size necessary to reveal significant differences in the primary outcomes of clinical trials. From our initial screening of 60 unique protein/molecular targets, we identified 32 promising markers that were measurable in CSF, passed our technical reliability performance criterion of a CV below 15%, and presented low intra-individual variation between repeat-collected samples. The panel of high-performing candidate analytes we identified represents important pathophysiological domains in AD (including neurodegeneration, inflammation/immune modulation, neurovascular injury, and metabolism and oxidative stress), allowing for multi-pathway profiling of disease state. Their performance criteria would qualify these assays in research, though better performance would be necessary for clinical use. In addition, the multiplexed nature of these assays requires that more testing be performed if they are to be used in long-term studies, when different "lots" of assay kits might be used over time. Multiplexed assays are more subject to lot-to-lot variability, cross-reactivity issues, and matrix effects than single-plex assays, and as a result should be assessed for spike recovery and parallelism on a lot-to-lot basis [50].

Table 3. Intra-individual biostability of analyte measurements between repeat-collected CSFs. Columns: Analyte; Mean baseline concentration (SD); Median biotemporal CV, % (IQR); Exact p-value (2-tailed).
Each of the core AD CSF biomarkers tested (tTau, Aβ1–38, Aβ1–40, and Aβ1–42) performed well in every measure, meeting the fit-for-purpose standards of sensitivity and stability for the MSD assays. These results were in agreement with published data evaluating the analytic performance of these proteins on this platform [51]. Of the non-core analytes, the non-Aβ and non-tau neurodegeneration markers NfL and FABP3 were also remarkably stable over a two-month interval, with comparable analytic performance to the classical AD markers. NfL is considered to be a marker of damage to large myelinated axons, and FABP3 is an abundant cytoplasmic protein that is thought to participate in the uptake, intracellular metabolism, and/or transport of long-chain fatty acids, playing a role in composition of lipid membranes [52][53][54]. NfL and FABP3 have both been reported to correlate with tTau [18,55] and/or Aβ [56]. However, neither is thought to be disease specific, but rather general markers of neurodegeneration [54,57–59]. They may be useful as complementary biomarkers for staging AD or perhaps quantifying the amount or degree of active neurodegeneration at the time of the sample.
Abnormalities in inflammatory cytokines, chemokines, and immune modulators have been reported in AD, but replications have been less consistent. Among the more robust has been the secreted glycoprotein YKL-40 [18,60–63], while data are less consistent on others such as monocyte chemokine MCP-1 [64,65] and proinflammatory cytokine IL-6 [66]. The value of these inflammatory analytes as biomarkers in the diagnosis, prognosis, or progression of AD/ADRD remains to be seen, but we can confirm the reliability and stability of their measurement for this purpose. In addition, we found that seven other inflammatory/immune molecules, not heretofore reported in CSF, also demonstrated good performance characteristics: the interleukins IL-7, IL-8, IL-12/23p40, IL-15, and IL-16, as well as MDC and MIP-1β.

The role of neurovascular injury has been of increasing interest in AD, but relevant biomarkers are not well established. Some of the more commonly reported include adhesion molecules ICAM-1 and VCAM-1 [67,68]. We identified eleven potential biomarkers (Table 2) of relevance to neurovascular injury that were reliably measurable in CSF and exhibited good biotemporal stability. Having established a method of reliably replicating the measurement of these markers will assist future studies investigating their roles in the pathogenesis of AD/ADRDs.
Many markers of metabolism and oxidative stress are reported to be present at very low concentrations in CSF compared to plasma or serum [69][70][71]. We were especially interested in measuring insulin levels in CSF, as insulin resistance is of great current interest in the field [7,72] and insulin levels have previously been reported to be altered in the CSF of AD patients [73]. Using MSD and three additional commercial assays for insulin, we found very low concentrations in CSF, at or below the lower limits of detection. Further work, perhaps with new, ultrasensitive single molecule array assays, will be needed to reliably measure this important metabolic hormone in CSF for AD studies.
Of the initial ten metabolism/oxidative stress biomarkers we surveyed, only five were reliably measurable in CSF, including sIR, adipokines adiponectin and leptin, the brain specific cholesterol metabolite 24-OHC, and oxidative stress related DNA modification 8-OHdG. Abnormalities in essential metabolic pathways, including trophic and metabolic signaling and regulation and cholesterol trafficking and turnover, have been reported in AD and associated with disease severity [74][75][76][77][78]. The relatively few reports prompt interest, and we hope the validation data we provide enable further study in this important area.
In conclusion, we have identified a pathophysiologically diverse set of CSF biomarkers that demonstrate consistent, reliable, and biotemporally stable quantification, establishing their potential for use in exploratory studies of AD. Subsequent work will assess the value of these high performing assays for inclusion in a practical biomarker panel for multi-dimensional molecular profiling in dementia as a tool for diagnosis and staging as well as assessing disease mechanisms, novel therapeutic target engagement of drugs or other biological interventions, and response in clinical trials. We conjecture that analyte profiles will differ among patients to varying degrees, thus allowing a personalized profile that may suggest a personalized treatment or prevention strategy.