An Evaluation of Commercial Fluorescent Bead-Based Luminex Cytokine Assays

The recent introduction of fluorescent bead-based technology, which allows the measurement of multiple analytes in a single 25–50 µl sample, has revolutionized the study of cytokine responses. However, such multiplex approaches may compromise the ability of these assays to measure actual cytokine levels accurately. This study evaluates the performance of three commercially available multiplex cytokine fluorescent bead-based immunoassays (Bio-Rad's Cytokine 17-plex kit; LINCO Inc's 29-plex kit; and RnD Systems' Fluorokine-Multi Analyte Profiling (MAP) base kits A and B). The LINCO Inc kit was found to be the most sensitive assay for measuring concentrations of multiple recombinant cytokines in samples that had been spiked with serial dilutions of the standard provided by the manufacturer, followed by the RnD Fluorokine-(MAP) and Bio-Rad 17-plex kits, in that order. A positive correlation was found between the levels of IFN-γ measured in antigen-stimulated whole blood culture supernatants by the LINCO Inc 29-plex, RnD Fluorokine-(MAP) and RnD Systems IFN-γ Quantikine ELISA kits across a panel of controls and stimulated samples. Researchers should take the limitations of such multiplexed assays into account when planning experiments; the most appropriate use for these tests may currently be as screening tools for the selection of promising markers for analysis by more sensitive techniques.


Introduction
Cytokines are important modulators of immune response pathways [7,18,19]. Cytokine expression profiling (CEP) has become a popular and established method for the identification and characterisation of disease-associated immune responses [4,6,8,9]. Previously, CEP was a laborious process requiring substantial sample volumes when multiple cytokines were under investigation. However, CEP methodology has been revolutionised by the recent introduction of fluorescent bead-based luminex technology, a capture/detection sandwich-type immunoassay allowing the measurement of up to 100 different analytes in a single 50 µl sample [16].
The reduced sample volume and time-saving advantages of the luminex system have made it an attractive method for large-scale cross-sectional, association or cohort studies which investigate the host immune response [1,13,15,16]. Khan et al. [11] comparatively assessed multiplex kits from LINCO Research, Bio-Rad Laboratories, RnD Systems and Biosource International and compared them to an enzyme-linked immunosorbent assay (ELISA). The comparison was based on the measurement of five cytokines in serum samples from healthy individuals who had been intravenously injected with endotoxin. They reported that the cytokine concentrations, as measured by the different kits, showed similar trends, although the absolute concentrations measured were different.
There are also a number of reports validating luminex systems. These studies often used kits with a narrow panel of cytokines. The present study not only has the advantage of combining a head-to-head comparison of different kits and assays on their ability to measure cytokine levels in blood samples, but is also the first independent study to comparatively assess the recovery of each cytokine by the commercially available Bio-Rad 17-plex, LINCO 29-plex and RnD Fluorokine-(MAP) base kits A and B (13 cytokines in total) with reference to instrument settings and calibration.

Study design
This study followed an integrated methodology, comparing three commercially available multiplex luminex kits (the Cytokine 17-plex kit by Bio-Rad; a 29-plex kit by LINCO Research; and the Fluorokine-Multi Analyte Profiling (MAP) kit by RnD Systems) as well as the RnD Systems IFN-γ Quantikine ELISA kit. We used the following two approaches: 1) Measurement of recombinant cytokines in serum and in unstimulated diluted whole blood culture supernatant samples, each spiked with serial dilutions of the multiplex standard provided by the luminex kit manufacturer, in order to calculate the recovery (accuracy) of the assay for each of the different cytokines; 2) Measurement of native, induced IFN-γ in whole blood culture supernatants generated in vitro by stimulating whole blood cultures with Mycobacterium tuberculosis (Mtb) antigens or Bacille Calmette Guerin (BCG).

Definitions
Recovery. Ratio of the observed amount of cytokine compared to the expected known amount of cytokine in a sample, expressed as a percentage. An acceptable recovery falls within the range of 70-130% (Bio-Rad Principles of Curve Fitting for Multiplex Sandwich Immunoassays, Rev B).
The following formula was used to calculate recovery:
\[
\text{Recovery (\%)} = \frac{\text{Observed concentration in spiked sample} - \text{Observed concentration in unspiked sample}}{\text{Expected concentration (amount of recombinant cytokine used to spike sample)}} \times 100
\]
The definition and calculations of recovery were obtained from the RnD Systems spike and recovery immunoassay sample validation protocol.
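As an illustration of the recovery calculation defined above, the following is a minimal Python sketch. The function names and the example readings are hypothetical and are not taken from the study; the 70-130% window is the acceptance range quoted in the definition of recovery.

```python
def percent_recovery(observed_spiked, observed_unspiked, expected_spike):
    """Percent recovery of a spiked analyte (all concentrations in pg/ml)."""
    if expected_spike <= 0:
        raise ValueError("expected_spike must be positive")
    # Subtract the endogenous (unspiked) signal before comparing to the spike.
    return (observed_spiked - observed_unspiked) / expected_spike * 100.0


def is_acceptable(recovery, low=70.0, high=130.0):
    """True when the recovery lies inside the 70-130% acceptance window."""
    return low <= recovery <= high


# Hypothetical readings: 12 pg/ml endogenous signal, 500 pg/ml spike.
recovery = percent_recovery(observed_spiked=462.0,
                            observed_unspiked=12.0,
                            expected_spike=500.0)
print(f"recovery = {recovery:.1f}%, acceptable = {is_acceptable(recovery)}")
```

Subtracting the unspiked reading matters for matrices such as whole blood culture supernatant, where some cytokines may already be present before spiking.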
Reading. Reported fluorescence of the sample.
Positive reading. Reported fluorescence of a sample that is above background fluorescence and corresponds to a positive concentration.
RP1. RP1 is the fluorescence channel used for assay quantification. A low RP1 target is recommended for quantifying a wide range of cytokines across a wide dynamic range of concentrations, whereas a high RP1 target is recommended for quantifying low concentrations of cytokines because it provides greater sensitivity.
5 PL (five-parameter logistic) regression curve. A standard curve built on a five-parameter logistic equation, which corrects for asymmetry in the curve shape.
Manufacturer 1. Bio-Rad
Manufacturer 2. LINCO Research
Manufacturer 3. RnD Systems

Manufacturer 1 assay
Experiment 1. The Bio-Rad human cytokine 17-plex assay was carried out according to the manufacturer's instructions, with a few exceptions as stipulated below. Briefly, a nine-point standard curve was generated by performing serial dilutions of the reconstituted normalised standard (lot # 5004060). The standard was not reconstituted with standard diluent, but rather with unstimulated whole blood assay supernatant, which was prepared by diluting whole blood one in ten with RPMI-1640 (GIBCO) and incubating at 37 °C, 5% CO2 for six days. This was done to ensure that the matrix used to generate the standard curve resembled that of the samples as closely as possible, as preliminary tests showed that this method was superior to dilution of standards in standard diluent (data not shown). In order to assess recovery, supernatant samples SN1, SN2 and SN3 were each spiked at three concentrations (ranging from 7 to 5461 pg/ml) with recombinant cytokine using the Bio-plex kit standard. The assays were run in duplicate, yielding six replicates per concentration in total. In order to keep the matrix of the spiked samples as similar as possible to the matrix of the standard curve, the volume of reconstituted standard used to spike the samples in all experiments was kept to a minimum and never exceeded 10 µl. A 50 µl volume of each sample, control or standard was added to a 96-well plate (provided with the kit) containing 50 µl of antibody-coated fluorescent beads. Biotinylated secondary antibody and streptavidin-PE were then added to the plate with alternate incubation and washing steps. After the last wash step, 125 µl of wash buffer was added to the wells, and the plate was incubated and then read on the Bio-plex array reader, using a 5 PL regression curve to plot the standard curve.
Samples and controls were read at both a high RP1 target setting (used to maximize assay sensitivity when the expected concentrations are below 3 200 pg/ml) and a low RP1 target setting (used for broad range concentrations) on the Bio-plex suspension array using a high throughput fluidics (HTF) system (cat # 171000005). Data were subsequently analysed using the Bio-plex Manager software, version 3.
Experiment 2. The standard curve was generated and the assay conducted as described above for the Bio-Rad human 17-plex assay experiment 1. Two sets of samples were used. The first set was generated using whole blood from a healthy laboratory donor, diluted one in ten with RPMI-1640 with glutamax and stimulated with different Mtb antigens (generously donated by Tom Ottenhoff, Leiden University), together with a phytohaemagglutinin (PHA)-stimulated positive control. Unstimulated culture supernatant served as a negative control. The second set of samples was generated from unstimulated whole blood culture as described above. Supernatants were harvested on day seven and spiked at five different concentrations with recombinant cytokine from the Bio-Rad standard (lot # 5004060). The concentrations at which samples were spiked were unique for each cytokine, with the lowest spike ranging from 2 to 43 pg/ml and the highest from 1191 to 8062 pg/ml. The results of these tests were used to calculate recovery.
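To make the 5 PL regression curve used for the standard curves more concrete, the sketch below writes out one common parameterisation of the five-parameter logistic function and its inverse, which converts a fluorescence reading back into a concentration. This is an illustrative assumption, not the Bio-plex Manager implementation; the parameter names (a, b, c, d, g) and the numerical values are hypothetical, and in practice the parameters would be fitted to the standards.

```python
def five_pl(x, a, b, c, d, g):
    """Five-parameter logistic response for concentration x.

    a: response at zero concentration, d: response at saturation,
    c: mid-range concentration, b: slope, g: asymmetry (g = 1 gives 4 PL).
    """
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def inverse_five_pl(y, a, b, c, d, g):
    """Concentration that yields response y on the same 5 PL curve."""
    if not min(a, d) < y < max(a, d):
        raise ValueError("response is outside the range of the curve")
    ratio = (a - d) / (y - d)
    return c * (ratio ** (1.0 / g) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters for one cytokine (fluorescence units).
params = dict(a=50.0, b=1.2, c=800.0, d=25000.0, g=0.8)

reading = five_pl(250.0, **params)          # forward: pg/ml -> fluorescence
estimate = inverse_five_pl(reading, **params)  # inverse: fluorescence -> pg/ml
print(f"fluorescence = {reading:.1f}, back-calculated = {estimate:.1f} pg/ml")
```

Readings at or beyond the asymptotes cannot be inverted, which is one way a sample ends up reported as out of range.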

Manufacturer 2 assay
Experiment 1. A human 29-plex LINCO assay (cat no HCYTO-60-K-PMX29) was done according to the manufacturer's instructions. Briefly, a standard curve ranging from 3.2 pg/ml to 10 000 pg/ml was generated by serial dilution of reconstituted standard. Two sets of samples were used, as described earlier, with the exception that for the second set of samples the LINCO Research standard (provided with the kit) was used to spike unstimulated whole blood culture at final concentrations of 5000, 1000, 500, 50 and 10 pg/ml. Additionally, to assess the reproducibility of the LINCO kit, five aliquots of the same PHA-stimulated whole blood supernatant were produced and each aliquot was run in one of five experiments on different days. Briefly, the filter plates were blocked by pipetting 200 µl of assay buffer into each well. After 10 minutes the assay buffer was discarded by vacuum aspiration and 25 µl of assay diluent was added to the wells designated for the samples, while 25 µl of RPMI-1640 with glutamax (GIBCO) was added to the wells designated for standards. According to the plate layout, 25 µl of either standard or sample was then added to the appropriate wells, after which 25 µl of antibody-coated fluorescent beads was added. Biotinylated secondary (detection) antibodies and streptavidin-PE were then added to the plate in turn, with alternate incubation and washing steps. Finally, 100 µl of sheath fluid was added to the wells and the plate was read immediately on the Bio-plex array reader, at high and low RP1 targets, using a 5 PL regression curve.
Experiment 2. Experiment 1 was repeated by measuring 21 cytokines in two sets of samples, including 171 Mtb antigen-stimulated culture supernatants and spiked unstimulated whole blood culture (as previously described). This time, however, samples were spiked at only two different concentrations. The plate was read at low RP1 targets as previously described.
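The serial dilution behind a standard curve such as the 3.2 to 10 000 pg/ml LINCO curve can be sketched as follows. The dilution factor and the number of points are not stated in the text; the five-fold, six-point series used here is an assumption chosen only because it reproduces the stated end points, and the function name is hypothetical.

```python
def serial_dilution(top, factor, points):
    """Concentrations of a serial dilution, starting from the top standard."""
    if factor <= 1 or points < 1:
        raise ValueError("factor must exceed 1 and points must be at least 1")
    return [top / factor ** step for step in range(points)]


# Assumed five-fold, six-point series from a 10 000 pg/ml top standard.
levels = serial_dilution(top=10000.0, factor=5, points=6)
for level in levels:
    print(f"{level:g} pg/ml")
```

The same helper covers other schemes; for example, a two-fold, seven-point series from 1000 pg/ml ends at about 15.6 pg/ml.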
Manufacturer 3 assay
Experiment 1. The assay was done according to the manufacturer's instructions. Briefly, the standard curves for the RnD Systems Fluorokine-(MAP) human base kits A (cat # LUH000) and B (cat # LUH001) were generated by reconstitution of standards in the standard diluent provided with the kit. Samples included the same set of antigen-stimulated whole blood culture supernatants used for the Bio-plex experiment 2 and LINCO 29-plex assays described earlier, as well as serum (diluted one in four) and whole blood supernatant spiked at six different concentrations with recombinant cytokine from the RnD Systems standard (Part # 895531, lot # 238222 and Part # 895546, lot # 238223 [base kit A] and Part # 892794, lot # 233020 [base kit B]). The six concentrations at which samples were spiked were unique for each cytokine, with the lowest spike ranging from 14 to 600 pg/ml and the highest from 950 to 19 000 pg/ml. An eight-point standard curve, with each cytokine spanning its own specific range, was generated and 50 µl of each standard and sample was added to a 96-well plate containing fluorescent antibody-coated beads. After alternate incubation and washing steps, detection and PE-labelled secondary antibodies were added and the plate was read on the Bio-plex array reader, at a low RP1 target, using a 5 PL regression curve.
Experiment 2. The assay was done according to the manufacturer's instructions. Serum (diluted one in four) and whole blood culture supernatant were spiked with seven different concentrations (ranging from 5 to 1700 pg/ml) of recombinant cytokine from the RnD Systems standard and measured as previously described.

RnD Systems Quantikine ELISA
The same Mtb antigen- and PHA-stimulated samples used for the Bio-Rad human 17-plex assay experiment 2, the LINCO human 29-plex assay and the RnD Systems Fluorokine-(MAP) experiment 1 assay were also assessed by ELISA. The ELISA was done using the RnD Systems IFN-γ Quantikine ELISA kit (cat # DIF50) according to the manufacturer's instructions. Briefly, lyophilised Quantikine standard was reconstituted in distilled water and serially diluted one in two in kit standard diluent to produce a seven-point standard curve ranging from 15.6 pg/ml to 1000 pg/ml. Thereafter, 100 µl of assay diluent was added to the designated wells in a 96-well polystyrene microplate (provided with the kit) coated with polyclonal antibody against IFN-γ, followed immediately by 100 µl of standard, sample or control. The standard curve, samples and controls were run in duplicate. The plate was incubated for two hours at room temperature and washed; thereafter, 200 µl of horseradish peroxidase (HRP)-conjugated IFN-γ antibody and then 200 µl of substrate solution were added to the wells, with an incubation period and washing step between the two additions.

Statistics: Manufacturer 1, 2, 3 assays and ELISA comparison (Study 2)
The correlation between the concentrations of cytokines as measured by the different immunoassays for the same sample was assessed by means of intra-class correlation coefficients and the Pearson product-moment correlation coefficient. The analysis was done using STATISTICA (version 7).
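The two agreement statistics named above can be sketched in plain Python as follows. This is not the STATISTICA procedure used in the study: the ICC form shown (two-way, single-measure, absolute agreement for two methods) is an assumption, and the paired concentrations are hypothetical. The example is built so that one method reads exactly three times higher than the other, which shows how a perfect Pearson correlation can coexist with a much lower ICC of agreement.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of paired values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def icc_agreement(x, y):
    """Two-way, single-measure, absolute-agreement ICC for two methods."""
    n, k = len(x), 2
    pairs = list(zip(x, y))
    grand = (sum(x) + sum(y)) / (n * k)
    row_means = [(a + b) / k for a, b in pairs]
    col_means = [sum(x) / n, sum(y) / n]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between samples
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between methods
    ss_total = sum((v - grand) ** 2 for pair in pairs for v in pair)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical paired IFN-gamma readings (pg/ml) for the same five samples.
elisa = [20.0, 150.0, 400.0, 800.0, 950.0]
multiplex = [3.0 * value for value in elisa]   # systematic three-fold offset

print(f"r = {pearson_r(elisa, multiplex):.2f}")
print(f"ICC = {icc_agreement(elisa, multiplex):.2f}")
```

Pearson's r only measures whether two methods rank and scale together, whereas an ICC of agreement also penalises systematic differences in the absolute values reported.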

Results
Manufacturer 1 assay Experiment 1. The recovery of the Bio-Rad human 17-plex was assessed using spiked whole blood culture supernatants from three healthy individuals and each of the supernatants was spiked at three different concentrations. Generally, a lack of accuracy was observed as illustrated in Table 1. At a high RP1 target, 21% of positive readings were in the recovery range of 70 to 130%, whereas only 12.4% were within that range when samples were read at a low RP1 target.
Experiment 2. In this study, recovery of the Bio-Rad human 17-plex was assessed for five different concentrations of individual cytokines. Fluorescence was read at both high and low RP1 targets, with 85 readings made for each RP1 target. A total of 65 readings out of 85 were positive for the low RP1 target, with 54% of these positive readings (41.2% of the total readings) falling within the acceptable recovery range of 70 to 130%. There were 62 positive readings out of 85 at the high RP1 target, with 61% of these (44.7% of the total readings) falling within the acceptable recovery range of 70 to 130%. The cytokines IFN-γ, TNF-α, IL-1β, IL-2, IL-4, IL-6, IL-7, IL-12p70, IL-13, IL-17, GM-CSF and MCP-1 were measured most accurately when samples were read at the high RP1 target, whereas measurements of IL-5, IL-10, G-CSF and MIP-1β showed better recoveries when samples were read at a low RP1 target. Interfering interactions in the samples presumably led to both falsely increased signal and signal inhibition for IL-8 detection, which resulted in out-of-range readings for four out of the five assessed concentrations.
The recoveries of cytokines included in the Bio-Rad human 17-plex panel are shown in Figure 1 (recoveries from two independent experiments). Detailed statistics on Bio-Rad human 17-plex recovery and variations are shown in Table 2.
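The results above report acceptable recoveries both as a percentage of positive readings and as a percentage of all readings. The sketch below shows how the two figures relate; the function name and the list of recoveries are hypothetical, with None standing for a reading that was not positive.

```python
def summarize_recoveries(recoveries, low=70.0, high=130.0):
    """Count positive and acceptable readings and express both percentages."""
    total = len(recoveries)
    positive = [r for r in recoveries if r is not None]
    acceptable = [r for r in positive if low <= r <= high]
    return {
        "total": total,
        "positive": len(positive),
        "acceptable": len(acceptable),
        "pct_of_positive": 100.0 * len(acceptable) / len(positive) if positive else 0.0,
        "pct_of_total": 100.0 * len(acceptable) / total if total else 0.0,
    }


# Hypothetical percent recoveries; None marks a non-positive reading.
readings = [95.0, 110.0, 66.0, 251.0, None, 82.0, None, 128.0]
summary = summarize_recoveries(readings)
print(summary)
```

Reporting both denominators keeps a kit from looking better simply because many of its readings failed to register as positive.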

Manufacturer 2 assay
Experiment 1. The recovery of the cytokines forming part of the LINCO Inc 29-plex panel was assessed at five different concentrations (5 000, 1 000, 500, 50 and 10 pg/ml), and read at high and low RP1 targets. The test showed an acceptable performance. A total of 145 readings were made at each RP1 target; 123 of these fell within the detection range when read at the high RP1 target, compared to 136 out of 145 when read at the low RP1 target. Of the positive readings, 78.4% (66.2% of the total readings) read at the high RP1 target had recoveries falling within the acceptable range of 70 to 130%, whereas approximately 70% of positive readings (65.7% of the total readings) made at a low RP1 target achieved this acceptable recovery. Measurements of IFN-γ, IL-1β, IL-4, IL-6, IL-7, IP-10, MCP-1 and G-CSF were found to be most accurate when the plate was read at a high RP1 target, with recoveries falling between 70 and 130%, whereas those for TNF-α, IL-1α, IL-1ra, IL-2, IL-5, IL-10, IL-12p40, IL-12p70, IL-13, Fractalkine, MIP-1α, MIP-1β, GM-CSF, TGF-α, sCD40L, VEGF, Eotaxin and EGF were most accurate when read at a low RP1 target. IL-15 and IL-17 showed similar recoveries at both high and low RP1 targets. The level of background signal was very high for IL-8, most probably because the whole blood supernatant used for the spiking experiment was not prediluted. The coefficients of variation between the measurements at high and low RP1 targets were less than 5% where applicable (that is, when both high and low RP1 targets showed positive readings).
Experiment 2. The assessment of the LINCO-plex kit was repeated, this time with 21 cytokines. The repeat experiment was done using whole blood supernatant from a healthy donor, spiked at two different concentrations. The recoveries of all the cytokines in the panel fell within the acceptable range of 70 to 130%, except for MCP-1 and IL-6, for which the recoveries were 66% and 251% respectively.
Figure 2 shows the recoveries of the LINCO-plex assay from the two independent experiments. Detailed statistics are shown in Table 3. The reproducibility of the LINCO kits was assessed by measuring cytokine concentrations in the same whole blood supernatants across five LINCO 29-plex kits. Data analysis showed that the coefficient of variation, standard deviation and standard error were within acceptable ranges for most of the cytokines (see Table 4 for more details).
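The reproducibility statistics mentioned above (coefficient of variation, standard deviation and standard error) can be computed for one cytokine across repeat runs as in the following sketch. The function name and the five replicate concentrations are hypothetical and are not taken from Table 4.

```python
import math
import statistics


def replicate_stats(values):
    """Mean, sample SD, standard error and percent CV of replicate readings."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)            # sample SD (n - 1 denominator)
    return {
        "mean": mean,
        "sd": sd,
        "sem": sd / math.sqrt(len(values)),
        "cv_percent": 100.0 * sd / mean,
    }


# Hypothetical concentrations (pg/ml) of one cytokine in five separate runs.
runs = [980.0, 1020.0, 1000.0, 990.0, 1010.0]
stats = replicate_stats(runs)
print(f"mean = {stats['mean']:.0f} pg/ml, CV = {stats['cv_percent']:.2f}%")
```

Because the CV is scaled by the mean, it allows cytokines measured at very different concentrations to be compared on the same footing.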

Manufacturer 3 assay
Experiment 1. The recoveries of 13 cytokines measured in whole blood culture supernatant and serum samples were assessed for six different concentrations using the RnD Systems Fluorokine-MAP assay. A total of 78 readings were made using whole blood culture supernatant and 67 using serum samples. All whole blood supernatant and serum sample readings were positive and within the standard curve range. A total of 67% of whole blood supernatant samples achieved recoveries within 70 to 130%, compared to approximately 56% of the serum samples.
Experiment 2. The recoveries of IFN-γ, TNF-α and IL-4 were assessed for seven different concentrations in whole blood culture supernatant and serum samples. In this experiment all whole blood supernatant readings were positive, whereas the detection limit in spiked serum samples was 44 pg/ml. About 50% of spiked whole blood supernatant samples achieved acceptable recovery (70 to 130%), compared to 75% of the serum samples that were within detection range. Details on cytokine recoveries from the two independent experiments are shown in Figure 3, and detailed statistics in Table 5.

RnD System's ELISA
The RnD Systems IFN-γ ELISA was used as a comparative test against which the different luminex kits were compared. Samples tested included antigen-stimulated samples with their negative and positive controls. As expected, the negative control showed a very low level of IFN-γ, whereas the positive control and antigen-stimulated samples showed higher levels of IFN-γ (Table 6).

Manufacturer 1, 2, 3 luminex assays and RnD Systems ELISA: an IFN-γ-based comparison
Poor correlations were observed between the Bio-Rad luminex assay and the other assays. The correlation between the Bio-Rad luminex kit and the RnD Systems ELISA measurements gave an intra-class correlation coefficient (ICC) of agreement of -0.01 and a Pearson product-moment correlation coefficient (r) of -0.09. The ICC of agreement and r between the LINCO luminex kits and the RnD Systems ELISA kits were 0.64 (ICC) and 0.75 (r) for the first test, and 0.75 (ICC) and 0.84 (r) for the second test, suggesting a positive correlation between the LINCO luminex kits and ELISA. The correlation between RnD Systems Fluorokine-MAP and RnD Systems ELISA was also shown to be positive, with an ICC of agreement of 0.1 and a Pearson correlation coefficient (r) of 0.99. The correlation between the different luminex kit measurements for the cytokines present in the three kit panels is shown in Table 7.

Discussion
This validation study evaluated three commercially available cytokine multiplex bead immunoassays from Bio-Rad, LINCO Inc and RnD Systems. The results suggest that, for the particular samples tested in this study, the LINCO Inc human 29-plex and the RnD Systems Fluorokine-MAP assays were the most accurate for the measurement of cytokine concentrations in whole blood culture supernatant and achieved good recovery ranges and reproducibility for most cytokines, whereas the performance of the Bio-Rad human 17-plex assay was suboptimal. The comparative study, including the Bio-Rad human 17-plex assay, LINCO 29-plex assay, RnD Systems Fluorokine-MAP assay and RnD Systems ELISA, was based on IFN-γ responses in antigen-stimulated whole blood culture supernatant. It was found that all assays were capable of differentiating the positive and negative controls. Moreover, they were able to detect the antigen-specific IFN-γ responses when applicable, with the exception of the Bio-Rad human 17-plex assay, where IFN-γ levels in two of the antigen-stimulated samples (ESAT-CFP-10 and Rv1115) went undetected.
Concentrations of IFN-γ measured by the LINCO 29-plex assay, RnD Systems Fluorokine-MAP assay and ELISA correlated, but results obtained using the Bio-plex assay correlated poorly with values obtained using the other three assays. Similar to the findings of DuPont et al. [5], a very strong correlation between the level of IFN-γ measured by ELISA and by the LINCO-plex kit in whole blood culture supernatant was found in the present study. Furthermore, correlation analysis of 13 cytokines across the LINCO-plex, RnD Systems Fluorokine-MAP and Bio-Rad 17-plex assays showed a positive correlation between the LINCO 29-plex and the RnD Systems Fluorokine-MAP assays for most of the cytokines, whereas the correlations of the Bio-Rad 17-plex assay with the LINCO 29-plex assay and the RnD Systems Fluorokine-MAP assay were frequently negative. This contrasts with a study by Khan et al. [11], who showed that levels of IL-6, IL-8 and TNF-α measured by the Bio-Rad Bio-plex assay had similar trends to the LINCO-plex and RnD Systems Fluorokine-MAP measurements.
Although recoveries should ideally not fall outside the acceptable range, measurements may be considered useable if the recovery remains constant across different sample types and dilutions. This was not the case when measuring cytokines using the Bio-Rad kits in our study. Therefore it would be impossible to compensate for any inaccuracy evident in any one sample matrix. Discrepancies observed using the Bio-plex kits may be partly explained by the presence of interfering proteins such as heterophilic antibodies [10,14]. De Jager et al. [3] have described methods to avoid heterophilic antibody interference in plasma and synovial fluid that improved the performance of the multiplex immunoassay. However, any manipulation of samples in clinical studies is not necessarily advantageous due to possible unforeseen effects on the results.
It may therefore be necessary to perform careful optimisation and validation of any commercial multiplex cytokine assay prior to large-scale clinical studies, as the quality controls supplied with the kits to measure standard curve integrity can only guarantee the accuracy or sensitivity of the assay if they are reconstituted and measured in the same matrix type as the samples investigated. Matrix effects appear to play a major role in assay performance, and the type of sample tested may therefore have serious effects on assay performance. It is clear that, at present, the theoretical capabilities of this new technology cannot be fully achieved in practice [12]. Researchers using these kits should include replicates of samples as well as negative and positive (low, medium and high) controls with known concentrations of the cytokines of interest to aid in the interpretation of results. Furthermore, controls should be included that reflect both the diluents used to reconstitute the standard supplied with the kit and the sample matrix tested, in order to account for possible matrix effects. This will allow the assessment of linearity and recovery and aid in the choice of the best standard curve regression and optimal calibration [2].
In conclusion, the most appropriate use for multiplex cytokine assays based on luminex technology currently is as a screening tool, for example for the selection of candidate markers characteristic of disease-associated immune responses. Promising candidates can then be validated using a method with higher accuracy and proven reliability, such as ELISA.

Table 7. Correlation between ELISA, LINCO 29-plex, Bio-Rad 17-plex and RnD Systems Fluorokine-MAP 13-plex assays.