Assay optimisation and technology transfer for multi-site immuno-monitoring in vaccine trials

Cellular immunological assays are important tools for monitoring responses to T-cell-inducing vaccine candidates. As these bioassays are often technically complex and require considerable experience, careful technology transfer between laboratories is critical if high-quality, reproducible data that allow comparison between sites are to be generated. The aim of this study, funded by the European Union Framework Program 7 TRANSVAC project, was to optimise Standard Operating Procedures and the technology transfer process to maximise the reproducibility of three bioassays for interferon-gamma responses: enzyme-linked immunosorbent assay (ELISA), ex-vivo enzyme-linked immunospot and intracellular cytokine staining. We found that the initial variability in results generated across three different laboratories was reduced following a combination of Standard Operating Procedure harmonisation and side-by-side training sessions in which assay operators performed each assay in the presence of an assay ‘lead’ operator. Mean inter-site coefficients of variation were reduced following this training session when compared with the pre-training values, most notably for the ELISA assay. There was a trend for increased inter-site variability at lower response magnitudes for the ELISA and intracellular cytokine staining assays. In conclusion, we recommend that on-site operator training is an essential component of the assay technology transfer process and, combined with harmonised Standard Operating Procedures, will improve the quality, reproducibility and comparability of data produced across different laboratories. These data may be helpful in ongoing discussions of the potential risk/benefit of centralised immunological assay strategies for large clinical trials versus decentralised units.


Introduction
Immune bioassays are essential tools with which to measure the immunogenicity of vaccines. Some of these are well established and can indicate the presence or absence of protection in vaccinated individuals where the assay detects a reliable correlate of protection such as neutralising antibody [1]. However, bioassays used to determine the immunogenicity of T-cell-inducing vaccines are often more complex and require a greater degree of operator expertise and experience. There are many T-cell effector mechanisms such as surface receptor upregulation, cytokine secretion, proliferative capacity, degranulation and cytotoxic capabilities that may be measured. Most T-cell assays require an antigen re-stimulation step, which is necessary to detect vaccine-specific responses that are recognised as often representing only a small fraction of the total T-cell compartment [2]. Some T-cell assays can simultaneously measure more than one parameter of interest such as responder cell phenotype together with cytokine secretion profile using flow cytometric assays or multiple secreted effector molecules using multiplex bead array or dual/triple colour enzyme-linked immunospot (ELISpot). Others only measure one effector function (e.g. single cytokine ELISpot or enzyme-linked immunosorbent assay (ELISA) for interferon-gamma (IFNγ)).
With a wide array of immunological bioassays available, as well as numerous potential modifications depending on the focus of a study, it is not surprising that a review of the literature reveals various bioassays in use, even for vaccines against a single pathogen [3][4][5][6][7][8][9][10]. There are advantages to a more consistent and co-ordinated approach to cellular immuno-monitoring within candidate vaccine trials, particularly to facilitate inter-site comparisons of the same and different candidate vaccines and to reduce the inherent variability in data generated by these complex assays. There is also an ongoing discussion of the merits and risks associated with centralising immune bioassays to one laboratory, versus the complexity of comparing data from multiple laboratory sites; this study may provide data to aid those discussions.
TRANSVAC was a European Union, Framework Program 7-funded consortium project with the aim of accelerating the development of promising vaccine candidates by developing, optimising and standardising state-of-the-art processes and facilities available to vaccine developers to bridge the gap between bench research and clinical assessment of novel vaccines [11]. We describe here our efforts to optimise and standardise one of these processes, namely the monitoring of vaccine immunogenicity using bioassays. We compared three cellular immunological assays in three participating laboratories: an ELISpot assay, an intracellular cytokine staining (ICS) assay and an ELISA assay, each designed to measure antigen-specific IFNγ. These assays are three of the most important and commonly used assays in pre-clinical studies and clinical trials of vaccines for tuberculosis, malaria and Human Immunodeficiency Virus [2][12][13][14]. The project focussed on standard operating procedure (SOP) transfer and assay establishment, followed by harmonisation, and assessment of performance and reproducibility between groups. Critically, common reference standards were shared among participating laboratories to enable optimisation to be better assessed, including IFNγ protein standard for ELISA and cryopreserved donor peripheral blood mononuclear cells (PBMC) for all experiments. This manuscript describes the role of SOP harmonisation and operator training in improving inter-site reproducibility of assay performance.

Study design
The main aim of the study was to compare the performance of three cellular immunological assays across three different laboratories. The objectives were first to assess the results obtained using the assays when SOPs were shared between participating laboratories in their existing format, and second to investigate what measures could be employed to improve the reproducibility of each assay. The study therefore comprised a series of three experimental "rounds". Operators performed round one assays in their own laboratory setting following a straightforward sharing of agreed assay SOPs and cryopreserved PBMC aliquots from healthy adult buffy coats. Round two consisted of side-by-side training sessions during which operators performed assays together in the lead laboratory for each assay to identify variability in technique and practice not captured by the SOPs. Round three assays took place following training sessions and involved operators performing assays individually in their own laboratories. In order to strengthen the assessment of improvements in assay performance, aliquots of the PBMC samples used in round one were used again in round three.

Donors
PBMC were chosen as the tissue for investigation as these are more easily cryopreserved and shared between groups. Heparinised peripheral blood samples were either obtained from healthy adult donors (for some inter-group "side-by-side" comparisons) or from local blood transfusion services as buffy coats (all other inter-group comparisons). PBMC from these samples were cryopreserved at one laboratory and equal numbers of vials distributed to the other two laboratories. The Ethics Committee of the London School of Hygiene and Tropical Medicine (ref. 5520) and the NHS Berkshire Research Ethics Committee (REC ref. 06/Q1602/146) gave ethical approval for the use of these samples. Informed, written consent was obtained from adults who donated blood directly to the study.
Sample processing
PBMC were isolated from blood samples and cryopreserved in aliquots. Briefly, following isolation from whole blood by density centrifugation, PBMC were counted and re-suspended in foetal bovine serum (FBS). Cells were then chilled for 30 minutes (min) on ice after which an equal volume of chilled FBS containing 20% dimethylsulfoxide was slowly added. Cells were distributed into Cryovials (Nunc) at 5 × 10⁶ cells per tube and frozen overnight at −80˚C in Mr. Frosty containers (Nalgene) before transfer to liquid nitrogen. Prior to use, an appropriate number of cryovials were defrosted by each operator. Vials were thawed in a 37˚C water bath until only a small bead of ice remained. The contents were then transferred to a pre-prepared centrifuge tube containing R10 assay medium (RPMI 1640; 10% FBS; 2 mM L-glutamine; 1% penicillin/streptomycin; 1% sodium pyruvate, all from Sigma), centrifuged then re-suspended in R10 containing 10 units per ml of Benzonase (Novagen). Cells were then rested at 37˚C for 2 hours (h) prior to use.

ELISA
IFNγ ELISAs were performed on supernatants generated in PBMC stimulation assays. PBMC prepared as described above were incubated in R10 medium at 2 × 10⁵ cells per well in U-bottomed, 96-well plates (Costar). Antigens were added to achieve a final volume of 200 μl per well and at concentrations of 10 μg/ml Mycobacterium tuberculosis purified protein derivative (PPD for in vitro use, batch RT50; Statens Serum Institute) and 5 μg/ml phytohaemagglutinin (PHA; Sigma). Medium alone was used as a negative control. After incubation for 72 h at 37˚C, supernatants were harvested and stored at −80˚C for ELISA analysis. ELISA plates were coated overnight at 4˚C with 2 μg/ml anti-IFNγ capture antibody (BD Biosciences), washed (phosphate-buffered saline (PBS) with 0.05% Tween 20) and blocked with PBS containing 10% FBS for 2 h at room temperature. After a further wash, 50 μl of PBMC assay supernatant, either undiluted or diluted to 1/3, 1/9 or 1/27, was added to wells in duplicate as was 50 μl of IFNγ protein standard (BD Biosciences) in doubling dilutions from 4000 pg/ml to 31.25 pg/ml and 50 μl of IFNγ positive control supernatant. Plates were incubated overnight at 4˚C, washed and probed with anti-IFNγ-biotin (BD Biosciences), Avidin-Peroxidase (Sigma, UK) and finally OPD Fast solution (Sigma, UK) for colour development. Plates were read at 490 nm. ELISA IFNγ concentrations were estimated from linear standard curves generated in Microsoft Excel by plotting optical density readings against standard concentrations. As described previously for this ELISA method [15], the upper and lower limits of detection corresponded to the highest and lowest standard curve values (4000 pg/ml and 32 pg/ml respectively) although in this manuscript concentrations above 4000 pg/ml have been obtained and reported following extrapolation from the standard curve. Values below the lower limit of detection were given the value of 15.5 pg/ml (half the lower limit of detection).
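The standard-curve read-out described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study itself used Microsoft Excel, the optical densities below are invented, and the function names are hypothetical. The standard concentrations, the 32 pg/ml lower limit and the 15.5 pg/ml substitution value are taken from the text.

```python
# Sketch of the ELISA read-out arithmetic; optical densities are made up.

TOP_STANDARD_PG_ML = 4000.0      # highest standard, from the text
LOWER_LIMIT_PG_ML = 32.0         # lower limit of detection, as stated
BELOW_LIMIT_VALUE_PG_ML = 15.5   # value assigned below the lower limit


def doubling_dilutions(top, n_points):
    """Concentrations obtained by halving `top` n_points - 1 times."""
    return [top / 2 ** i for i in range(n_points)]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def od_to_concentration(od, slope, intercept):
    """Read a concentration off the fitted line and apply the lower limit."""
    concentration = (od - intercept) / slope
    if concentration < LOWER_LIMIT_PG_ML:
        return BELOW_LIMIT_VALUE_PG_ML
    return concentration  # values above 4000 pg/ml are extrapolated


standards = doubling_dilutions(TOP_STANDARD_PG_ML, 8)  # 4000 ... 31.25 pg/ml
# Hypothetical, perfectly linear optical densities for illustration only.
standard_ods = [0.05 + 0.0005 * c for c in standards]
slope, intercept = fit_line(standards, standard_ods)
```

Eight doubling dilutions starting at 4000 pg/ml end at exactly 31.25 pg/ml, which matches the range given in the methods. How diluted supernatants were scaled back to the neat concentration is not specified in the text, so that step is left out of the sketch.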

ELISpot assay
ELISpot plates were pre-coated overnight at 4˚C with 15 μg/ml of anti-IFNγ coating antibody (Mabtech, Sweden) followed by washing and blocking for 2-5 h in R10 medium. PBMC were prepared as described above and finally re-suspended at 3.75 × 10⁶ cells per ml in R10. The blocking solution was removed and PBMC added to ELISpot wells at 3 × 10⁵ PBMC per well in 80 μl of R10. Stimuli and controls were prepared at appropriate concentrations in R10 and 20 μl of each stimulus added to wells in replicates of 6 (final concentrations: 20 μg/ml PPD; 1 pg/ml Staphylococcus Enterotoxin B (SEB); 6.25 μg/ml FEC peptides (pool of 32 peptides from Flu/EBV/CMV)). For samples from donors who on previous occasions had displayed strong responses that produced too many spots to count, PBMC were added at half the usual number (1.5 × 10⁵ per well) for certain stimuli to achieve countable spot numbers. ELISpot plates were incubated overnight at 37˚C then washed with PBS with 0.05% Tween 20. A biotinylated anti-IFNγ detection antibody (Mabtech, Sweden) was added to all wells at 1 μg/ml for 2 h at room temperature. Plates were washed again and streptavidin-alkaline phosphatase reagent (Mabtech, Sweden) added for 1 h at room temperature. After a final wash, plates were developed with NBT/BCIP reagent until spots were visible. The reaction was stopped by washing plates with tap water. After drying overnight, spots were enumerated using an automated ELISpot reader (AID version 5.0). The same ELISpot reader, software and count settings were used at all sites and ELISpot results were reported in spot-forming cells (SFC) per million PBMC.
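As a rough illustration of the plating arithmetic and the conversion to SFC per million PBMC, the following Python sketch may help. The function names are hypothetical, and the order of operations (averaging replicates, subtracting the negative-control mean, then scaling) is an assumption based on the data analysis description rather than a statement of the exact procedure used.

```python
from statistics import mean


def cells_per_well(cells_per_ml, volume_ul):
    """Number of cells delivered in `volume_ul` of a suspension."""
    return cells_per_ml * volume_ul / 1000.0


def sfc_per_million(stimulated_spots, background_spots, cells_in_well):
    """Background-corrected spot-forming cells per 10^6 PBMC.

    Replicate wells are averaged, the mean of the negative-control
    wells is subtracted, and the result is scaled to one million cells.
    Assumes control wells hold the same number of cells.
    """
    corrected = mean(stimulated_spots) - mean(background_spots)
    return corrected * 1_000_000 / cells_in_well


# 80 ul of a 3.75 x 10^6 cells/ml suspension gives 3 x 10^5 cells per well.
standard_well = cells_per_well(3.75e6, 80)
```

With six replicate wells averaging 30 spots over a zero background, a standard well corresponds to 100 SFC per million PBMC, while the same count in a half-density well (1.5 × 10⁵ cells) corresponds to 200.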
ICS assay
PBMC were prepared as described above, re-suspended in R10 medium at 10⁶ cells per ml and distributed into 5 ml FACS tubes (Invitrogen, UK) at 10⁶ cells per tube; one tube per stimulation condition. Stimuli were added to each tube as appropriate (R10 medium as a negative control; PPD at 20 μg/ml; SEB at 5 μg/ml; FEC peptides at 25 μg/ml) and samples were incubated at 37˚C for 2 h. After this time, 3 μl of brefeldin A (BFA, Sigma, UK; stock concentration 1 mg/ml) was added to all tubes to give a final concentration of 3 μg/ml, and tubes were incubated for a further 18 h (overnight) at 37˚C. Following stimulation, PBMC were washed in FACS buffer (PBS with 0.1% bovine serum albumin (Sigma) and 0.01% sodium azide (Sigma)) and stained with VIVID live/dead reagent (Molecular Probes) as well as with a surface stain cocktail of antibodies (anti-CD4-APC-Cy7 (Biolegend); anti-CD14-Pacific Blue (Invitrogen); anti-CD19-Pacific Blue (eBiosciences)). After further washing, PBMC were permeabilised with Cytofix/Cytoperm reagent (BD Biosciences) and stained with an intracellular antibody cocktail (anti-CD3-PerCP (Biolegend); anti-CD8-FITC (Biolegend); anti-IFNγ-PE (Caltag)) prior to a final wash and re-suspension in 1% paraformaldehyde. Cells were acquired within 24 h of staining.
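The brefeldin A dilution quoted above can be checked with a short calculation. The sketch below assumes a culture volume of 1 ml per tube (10⁶ cells at 10⁶ cells per ml) and neglects the volume of added stimulus, which the text does not state; the function name is hypothetical.

```python
def final_concentration_ug_per_ml(stock_mg_per_ml, added_ul, culture_ml):
    """Final concentration after adding a small volume of stock.

    1 mg/ml equals 1 ug/ul, so the mass added in ug is simply
    stock_mg_per_ml * added_ul.
    """
    added_ug = stock_mg_per_ml * added_ul
    total_ml = culture_ml + added_ul / 1000.0
    return added_ug / total_ml


# 3 ul of 1 mg/ml stock into an assumed 1 ml culture.
bfa_final = final_concentration_ug_per_ml(1.0, 3.0, 1.0)
```

Under these assumptions the result is just under 3 μg/ml, consistent with the stated final concentration.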

Data analysis
Flow cytometric analysis was performed using FlowJo software (Treestar). Gating was performed using a previously published gating strategy [16]. For all assays, averages were calculated from replicates of each condition and background measurements from negative controls were subtracted to give final data points using Microsoft Excel. Background data (negative control data measured in all assays, for each round and for each site) are presented as supplementary data (S1 Tables). Background-corrected data were plotted in Prism 7.0 (GraphPad) and Microsoft Excel. In order to allow site-to-site assay variability to be determined by coefficients of variation (CVs), mean and standard deviation of responses measured at all 3 sites were calculated in Microsoft Excel and are presented as summary statistics. Pre- and post-training inter-site CVs were compared using Mann-Whitney U tests. Logarithmic regression analysis to determine the relationship between inter-site CVs and magnitude of response measured for each assay was performed in Microsoft Excel.
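A minimal Python sketch of the background correction and inter-site CV calculation is given below. It assumes the sample standard deviation (n − 1 denominator), which the text does not specify, and the function names and example values are hypothetical.

```python
from statistics import mean, stdev


def background_corrected(replicates, negative_controls):
    """Mean of the replicate measurements minus the mean negative control."""
    return mean(replicates) - mean(negative_controls)


def inter_site_cv(site_values):
    """Coefficient of variation (%) across sites for one condition.

    Uses the sample standard deviation; whether the study used the
    sample or population form is an assumption here.
    """
    return stdev(site_values) / mean(site_values) * 100.0


# Hypothetical background-corrected responses from three sites.
example_cv = inter_site_cv([90.0, 100.0, 110.0])
```

One CV is produced per condition (donor and stimulant combination), so each assay and round yields a list of CVs that can then be compared between rounds.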

Results
Initial inter-site assay comparisons using shared protocols, reagents and stimuli
Following an initial review of different versions of assay SOPs used in each participating laboratory, we found that SOPs for the same assay varied considerably across the groups. Therefore, the optimal version of each SOP was selected and shared between sites for all further experiments. Reagents and stimuli were also ordered centrally and shared.
A common set of frozen PBMC from three donors was used to test immune responses at three sites. ELISA, ELISpot, CD4+ T-cell ICS and CD8+ T-cell ICS assay results are shown in Fig 1A-1D. Mean assay responses across sites and assay CV data are shown in Table 1. Despite the use of identical SOPs, there were still notable variations in the responses to the antigen stimuli PPD (for all assays) and FEC (for ELISpot and ICS assays). Sites were more consistent in their measurements of strong responses to the positive controls (SEB and PHA) although this varied across different assays. For example, the ELISpot assay demonstrated a greater variation in response to SEB. In general, there was more variation across sites when measured responses were weak. This was particularly noticeable in situations where an antigen stimulus was sub-optimal for the induction of a response, e.g. FEC-induced CD4+ T-cell and PPD-induced CD8+ T-cell responses.
[Caption: Three assays, three blood donors and three antigens/positive control stimulants were used in each laboratory site to compare measured immune responses.]

Side-by-side operator training eliminates minor variations in approach between sites
The next step taken to eliminate inter-site variation was an "operator training" session. Although operators at each site possessed considerable expertise and experience in performing the assays involved, it was agreed that variations in each operator's training, differences in SOP interpretation, "common practice" in a given laboratory and previous experience might lead to measurable differences when two operators' data were compared. To counter this, each assay was performed in a side-by-side arrangement where one operator acted as curator for a given assay and trained the other operators in the specific individual approach of that operator to further harmonise the technique of operators when acting alone. As shown in Fig 2 and Table 2, consistency between results produced by each operator improved considerably when assays were performed side-by-side in the same laboratory. As in round one pre-training assay comparisons, inter-site assay CVs were higher when the responses measured were weaker as seen with the more diluted supernatants for the ELISA assay (Fig 2A).
Side-by-side operator training improves future across-site consistency of data for individually performed assays
Following the side-by-side operator training sessions, each operator working individually at their own site repeated assays using aliquots of the same three donor PBMC described for round one experiments. Results of this round of experiments are shown in Fig 3 and Table 3.
When all the inter-site CVs for different conditions (donor and stimulant combinations) are compared to those from previous rounds, a reduction in the mean inter-site CV as compared to the pre-training data was achieved for all assays (Fig 4). For the ELISA assay, the post-training data maintained the level of inter-site comparability seen for the training data and this was a significant improvement on the pre-training data (p = 0.017). Although the ELISpot and ICS assays did not manage to maintain the level of inter-site comparability seen when assays were performed side-by-side, a significant improvement in inter-site comparability between pre-and post-training data was seen for the ELISpot assay (p = 0.039). The improvements in CD4+ T cell and CD8+ T cell ICS assays post-training did not reach significance.
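For readers who wish to reproduce this type of pre- versus post-training comparison, a simplified Mann-Whitney U calculation is sketched below in plain Python. It is not the software used in the study: the p-value comes from a normal approximation without tie or continuity correction, so it will differ somewhat from exact values, and a dedicated statistics package is preferable in practice. The function name and input values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic for x, approximate two-sided p-value).

    Ties receive mid-ranks. The p-value uses a normal approximation
    with no tie or continuity correction (a simplification).
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        mid_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


# Hypothetical CV lists: post-training values all lower than pre-training.
u_post, p_approx = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

The two U statistics for a pair of samples always sum to the product of the sample sizes, which provides a quick sanity check on any implementation.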
There was a trend for both the ELISA and ICS post-training datasets, although not for the ELISpot dataset, whereby the inter-laboratory CVs increased dramatically below a certain magnitude of measured response (Fig 5).
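The relationship between inter-site CV and response magnitude can be explored with a logarithmic fit of the form CV = a + b·ln(magnitude), which is the usual form of a logarithmic trendline. The Python sketch below is an assumption-laden illustration rather than the study's Excel analysis; the function name and data points are hypothetical, and magnitudes must be positive for the logarithm to be defined.

```python
import math


def fit_log_regression(magnitudes, cvs):
    """Least-squares fit of cv = a + b * ln(magnitude); returns (a, b)."""
    xs = [math.log(m) for m in magnitudes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cvs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cvs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data generated from cv = 80 - 10 * ln(magnitude).
example_magnitudes = [10.0, 100.0, 1000.0, 4000.0]
example_cvs = [80.0 - 10.0 * math.log(m) for m in example_magnitudes]
a, b = fit_log_regression(example_magnitudes, example_cvs)
```

A negative slope b corresponds to the pattern described above, with inter-site CVs rising as the measured response falls.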

TRANSVAC assay SOPs
Three detailed assay SOPs, produced as a result of the TRANSVAC project, are available online: http://www.transvac.org/SOPimmunoassays130627.pdf
[Caption: Operators from three sites performed three assays in a side-by-side fashion for the purposes of training.]

Discussion
Previous studies have addressed the issue of immune bioassay performance across multiple laboratories (including ELISpot, ICS and tetramer staining) and have highlighted the importance of removing protocol variations for improved inter-site comparability of data [17][18][19][20].
One of these studies also revealed that the number of years' experience a particular laboratory group has with an assay does not correlate with the ability of that laboratory to perform the assay better [17]. In agreement with that study, we also find here that guidelines and detailed protocols can only improve assay reproducibility across laboratories up to a point. We show that first-hand observation and protocol training may achieve further improvement, even within experienced laboratories. It is easy to envisage how a group with many years' experience of performing an assay might not appreciate the importance of external training of either long-standing or new personnel, when the necessary expertise exists "in-house". However, the combined efforts of this and the studies referred to above demonstrate that this process should be deemed essential in inter-laboratory harmonisation, if multiple sites are to produce comparable data, irrespective of each site's experience. Even though the process of inter-laboratory training improved the comparability of data, some assays harmonised more readily than others did. Performance of the IFNγ ELISA assay post-training was as consistent across sites as it had been when assays were performed side-by-side and this was a significant improvement on inter-site comparability prior to training. Although ELISpot assay comparability also improved significantly following training, the post-training data were not as consistent across sites as the data produced during side-by-side assays, despite using the same spot counting equipment, software and count settings. There was a trend towards improved comparability post-training for both CD4+ T cell and CD8+ T cell ICS assays but this did not reach statistical significance. The performance variability of the ICS assay was also apparent in the less pronounced reduction in inter-site variability at higher response magnitudes. Although both the IFNγ ELISA and ICS assays showed a significant trend towards improved inter-site comparability at higher measured responses, the effect was stronger in the former assay where inter-site CVs were below 10% at the highest measurable responses. Even at higher measurable responses, the ICS assay did not result in inter-site CVs of greater than 40%. One of the most likely sources of additional variability for ICS assays is the flow cytometer instrument used, as this was the only parameter it was not possible to harmonise across each site.
[Caption: Immune responses were tested using three assays, three blood donors and 3-4 antigens/positive control stimulants.]
We are not aware of any studies that take a systematic look at parallel technologies for a given biological measurement, identifying key parameters affecting concordance between and within laboratory groups. Others have described the in-depth qualification of individual assays similar to those we discuss in this manuscript [21,22]; however, this was not our aim here. Although the current study was relatively small, we have shown that parallel comparisons can reveal the critical aspects of each assay to focus on for better reproducibility. The use of standardised reagents, cells, reference materials and protocols, as well as staff expertise and training, is critically important to understand the underlying diversity in outcomes from a given assay.
Although this study utilised experimental reagents and tools linked to tuberculosis vaccine research, the assays in question are equally applicable to diseases such as malaria and HIV. However, a cost-benefit analysis will be necessary as to whether money is best spent transporting all samples to a centralised analysis laboratory or on implementing the measures described here to ensure assay comparability across multiple laboratories. A centralised approach would ensure consistency in antigen and reagent usage, especially if the site in question was responsible for producing reference batches of these items. For example, following the current study, there was an interruption in the supply of PPD from Statens Serum Institute; however, supplies of the same antigen from the UK National Institute for Biological Standards and Controls are available and induce equivalent assay responses.
Our conclusions, based on the present study, are that protocol optimisation and side-by-side training of operators may improve assay comparability across sites. However, the data suggest that below a certain threshold of response level, cellular assays become unavoidably variable in terms of responses measured. In addition, this effect appears to depend upon the assay in question. Assays that measure cellular responses directly, such as ICS, are more prone to variability. Assays such as the PBMC/IFNγ ELISA are less prone to such variations, probably as much of the assay takes a non-cellular detection approach (i.e. antibody pairs and a protein standard curve). These points should be borne in mind when selecting immuno-monitoring assays for use in vaccine clinical trials. The findings of this study may also inform the ongoing discussions on the strategies, investments and risks associated with centralising clinical immune studies to one lab versus decentralising and the potential for extracting GCP/GLP compliant data from multicentre studies.
Supporting information
S1 Tables. Background data as measured in unstimulated negative controls for ELISA, ELISpot and ICS assays run at each site. (DOCX)