A multicenter analytical performance evaluation of a multiplexed immunoarray for the simultaneous measurement of biomarkers of micronutrient deficiency, inflammation and malarial antigenemia

A lack of comparative data across laboratories is often a barrier to the uptake and adoption of new technologies. Furthermore, data generated by different immunoassay methods may be incomparable due to a lack of harmonization. In this multicenter study, we describe validation experiments conducted in a single lab and cross-lab comparisons of assay results to assess the performance characteristics of the Q-plex™ 7-plex Human Micronutrient Array (7-plex), an immunoassay that simultaneously quantifies seven biomarkers associated with micronutrient (MN) deficiencies, inflammation and malarial antigenemia using plasma or serum: alpha-1-acid glycoprotein, C-reactive protein, ferritin, histidine-rich protein 2, retinol binding protein 4, soluble transferrin receptor, and thyroglobulin. Validations included repeated testing (n = 20 separately prepared experiments on 10 assay plates) in a single lab to assess precision and linearity. Seven independent laboratories tested 76 identical heparin plasma samples collected from a cohort of pregnant women in Niger using the same 7-plex assay to assess differences in results across laboratories. In the analytical validation experiments, intra- and inter-assay coefficients of variation were acceptable at <6% and <15% respectively and assay linearity was 96% to 99% with the exception of ferritin, which had marginal performance in some tests. Cross-laboratory comparisons showed generally good agreement between laboratories in all analyte results for the panel of 76 plasma specimens, with Lin’s concordance correlation coefficient values averaging ≥0.8 for all analytes. Excluding plates that would fail routine quality control (QC) standards, the inter-assay variation was acceptable for all analytes except sTfR, which had an average inter-assay coefficient of variation of ≥20%. This initial cross-laboratory study demonstrates that the 7-plex test protocol can be implemented by users with some experience in immunoassay methods, and that prior familiarity with the multiplexed protocol is not essential.

Introduction
Micronutrient (MN) deficiencies include iron, vitamin A, and iodine among other essential elements and vitamins [1,2]. It is estimated that over 2 billion people worldwide are directly affected by an MN deficiency [3]. Children and pregnant women are particularly at risk due to an inadequate diet that fails to meet the greater micronutrient requirements necessary for fetal growth or childhood development [4]. Iron, iodine and vitamin A are three of the micronutrients of greatest public concern [1]. MN deficiency can adversely affect the physiology of diverse organ systems, impairing, for example, ocular, immunologic, and neurological function, often causing irreversible damage [4]. As such, the quality of life of those affected by MN deficiency is significantly reduced, making it critical to accurately assess the prevalence of micronutrient deficiency to allow the targeted implementation of micronutrient intervention programs among high-risk populations and assess intervention outcomes [5,6].
Data harmonization for MN deficiency surveillance is challenged by the use of different survey biomarkers and methods by different labs. For example, vitamin A is determined via serum retinol or retinol binding protein 4 (RBP4) which do not always correlate well with each other [7][8][9], while iodine is measured using urinary iodine, thyroglobulin (Tg) or thyroid hormones [10][11][12]. Furthermore, the quantitative data generated by enzyme linked immunosorbent assays (ELISAs) is impacted by a variety of factors including the sample type (e.g. dried blood spot [DBS], serum or plasma from venous or capillary blood) [13][14][15], variations in the antibodies, buffers and protocols used by commercially available ELISA kits [16,17], a lack of international reference materials for some key biomarkers (e.g. RBP4), and a lack of external quality assessment (EQA) materials to consistently qualify tests and user performance resulting in variation in measurements across laboratories [18]. Finally, in some cases, routinely used immunoassays have not been fully validated by the manufacturer or by users to confirm acceptable assay performance for their intended use [19].
These challenges, either individually or in combination, result in poorer quality datasets that make it difficult to accurately and consistently monitor MN deficiency distribution, prevalence, and severity, in particular, across different surveys using similar but not identical analytical methods. Ideally, MN deficiency data collected in large surveys such as the Demographic and Health Surveys (DHS), which conduct surveillance in different populations globally and at multiple time points, should be uniform to allow constructive comparisons across countries and/or survey waves. A primary purpose of micronutrient status assessments is to understand what populations are most vulnerable and to assess the impact of interventions.
For the accurate inter-region comparison of MN deficiency surveys or along a series of time points, the harmonization of absolute measurements generated via all the analytical methods used is essential. This can be realized, in part, by using inter-laboratory performance studies and evaluations of these methods to identify technologies that are relatively easy to perform and have sufficient accuracy and reproducibility to generate comparable datasets irrespective of where the testing is carried out. We have reported previously on a multiplex assay method developed to simplify population surveillance of micronutrient status by combining relevant biomarkers into a single test. In this study we report results of a full formal validation of the Q-plex 7-plex Human Micronutrient Array (hereafter the 7-plex), examining the reproducibility observed with multiple users across seven different laboratories in order to characterize measurement variability for biomarkers pertinent to MN deficiency surveillance, namely inflammatory biomarkers alpha-1-acid glycoprotein (AGP) and C-reactive protein (CRP); thyroglobulin (Tg, iodine); serum ferritin and soluble transferrin receptor (sTfR, both iron); retinol binding protein 4 (RBP4, vitamin A); and histidine-rich protein 2 (HRP2, Plasmodium falciparum malaria) [9,11,20,21]. We assessed the precision and performance of the 7-plex for use in population surveillance of MN deficiency [22][23][24].

7-plex array procedure
The panel samples and controls were thawed on the bench top at room temperature on the day of the assay. Samples were processed following the assay protocol. First, the lyophilized competitor mix provided with the kit was reconstituted in the sample diluent volume recommended in the product insert to produce a 1X strength competitor mix. Next, the lyophilized calibrator was reconstituted with the competitor mix volume indicated in the kit insert, then a series of 7 threefold dilutions was prepared to create an eight-point standard curve. A 15 μL volume of each sample or quality control (QC) was combined with 135 μL competitor mix to produce final dilutions of 1:10. A volume of 50 μL per well of prepared standards, controls and samples was added to the plates in duplicate wells and each plate was incubated at room temperature for 2 hours with shaking on a flatbed shaker at 500 revolutions per minute. All reactions were aspirated, and the wells were washed 3 times with the wash buffer provided with the kit. Next, 50 μL of detection mix was added to each well and the plate was incubated with shaking for 1 hour and then washed again as described above. Labeling was performed by adding 50 μL streptavidin horseradish peroxidase solution to each well and shaking for 20 minutes. After washing 6 times, the chemiluminescent substrate, a mixture of equal volumes of parts A and B, was added at 50 μL per well. Each plate was then immediately imaged with a 270-second exposure using a Quansys Q-View™ Imager LS (Quansys Biosciences).
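The dilution arithmetic in this procedure can be illustrated with a short sketch. The top calibrator concentration used here is a hypothetical placeholder in arbitrary units (actual values are lot-specific and are not given in the text), and the function names are introduced only for illustration.

```python
def serial_dilution(top_conc, fold=3.0, n_points=8):
    """Concentrations for an n-point standard curve: the reconstituted
    calibrator followed by (n_points - 1) successive `fold`-fold dilutions."""
    return [top_conc / fold ** i for i in range(n_points)]


def dilution_factor(sample_ul, diluent_ul):
    """Total volume divided by sample volume (15 uL + 135 uL gives 10)."""
    return (sample_ul + diluent_ul) / sample_ul


# Hypothetical top standard of 100 arbitrary units (assumption, not a kit value).
curve = serial_dilution(top_conc=100.0)
factor = dilution_factor(15, 135)

for point, conc in enumerate(curve, start=1):
    print(f"standard {point}: {conc:.4f}")
print(f"sample dilution factor: {factor:g}")
```

Seven threefold steps span a 3^7 = 2187-fold concentration range between the highest and lowest standards.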
Q-View Software (Quansys Biosciences) was used to overlay a plate map onto the locations of analyte spots in each well to quantify the chemiluminescent signal from each spot in units of pixel intensity. The software applies the calibrator concentration values to the pixel intensities for each spot in the standard curve wells and was set to automatically fit optimal 5 parameter logistic calibration curves for each analyte. The pixel intensities of the spots in each test well were then used to interpolate the concentration of each analyte relative to its calibrator curve. Once the plate image is overlaid with the analysis grid, all of the curve fitting and data reduction steps are automatically applied via the software. The upper and lower limits of quantification determined by Quansys for each kit lot were applied to exclude values beyond the concentration ranges that yield precise concentration estimates.
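For readers unfamiliar with five-parameter logistic (5PL) calibration, the following sketch shows one common parameterization of the curve and how a measured pixel intensity is interpolated back to a concentration. It is not the Q-View implementation: the software fits the parameters internally, and the parameter values below are hypothetical and chosen only to make the example run.

```python
def five_pl(x, a, b, c, d, g):
    """One common 5PL form: predicted signal at concentration x (x >= 0).

    a: signal at zero concentration, d: signal at infinite concentration,
    c: mid-range location, b: slope, g: asymmetry.
    """
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def inverse_five_pl(y, a, b, c, d, g):
    """Concentration whose predicted signal equals y.

    Only signals strictly between the asymptotes a and d can be
    interpolated; anything else raises ValueError.
    """
    if not min(a, d) < y < max(a, d):
        raise ValueError("signal lies outside the asymptotes of the curve")
    ratio = (a - d) / (y - d)
    return c * (ratio ** (1.0 / g) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters (assumptions for illustration only).
params = dict(a=50.0, b=1.2, c=30.0, d=60000.0, g=0.8)

signal = five_pl(10.0, **params)               # simulated pixel intensity
well_conc = inverse_five_pl(signal, **params)  # concentration in the well
sample_conc = well_conc * 10                   # corrected for the 1:10 dilution
print(round(well_conc, 6), round(sample_conc, 6))
```

Signals outside the asymptotes cannot be interpolated at all, and signals close to them give unstable estimates, which is one reason lot-specific limits of quantification are applied.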

Validation of 7-plex performance in a single lab
The intra- and inter-assay performance ranges of the 7-plex reported in earlier publications, along with other components of assay validation, were originally generated by the manufacturer of the assay, Quansys Biosciences [23]. As a follow-up to this, before the inter-laboratory evaluation was performed, a second validation of 7-plex performance was conducted independently in the PATH laboratory to confirm the original findings (see Fig 1).
Validation materials. The test panel used to qualify the 7-plex performance consisted of Liquichek (LK) Immunology Control Level 3 (Lot # 66363, Bio-Rad Laboratories Inc., Hercules, CA, USA), a pooled human serum-based matrix containing most of the analytes of interest. As the LK control has low concentrations of both sTfR and Tg and is negative for HRP2, a spiked version was also prepared by adding concentrated sTfR antigen (Fitzgerald, MA, USA), human Tg (BiosPacific Inc., Emeryville, CA, USA) and HRP2 (CTK Biotech, San Diego, CA, USA) to better reflect the quantitative range of the array. Additionally, seven human plasma samples previously determined to have the highest sTfR measurements (all HRP2 negative) were selected from our US donor panel for use in the validation experiments [22,23].
Validation experiments. A flow chart in Fig 1 highlights the sequential processes used to validate the 7-plex assays, construct blinded test panels, and finally carry out the inter-laboratory assessment. Ten 7-plex plates in total were employed to evaluate assay precision and linearity. Each plate was used for two identical experiments, with all controls and test sample dilutions prepared independently for each experiment, and with results derived from an independent standard curve for each half of a plate. Using a total of 20 replicate experiments, the LK and spiked Liquichek (SLK) were screened in triplicate as high (undiluted), medium (1:4 dilution) and low (1:10 dilution) concentrations. The absolute values derived from these samples were used to calculate the independence of volume (linearity) and precision of each assay across its linear dilution range. The seven human plasma specimens were run in duplicate at a 1:10 dilution. To assess real-use imprecision, including common potential sources of variability, the 20 experiments used two different 7-plex plate lots and two different lots of calibrator, with experiments performed by two users. All testing was performed at ambient temperature (approximately 22˚C) following the 7-plex protocol as described in the 7-plex array procedure section. The ten plates generated 60 unique data points for each biomarker in the LK and SLK dilutions (run in triplicate in 20 experiments) and 40 data points per biomarker in the 7-member plasma panel (run in duplicate in 20 experiments).
Validation statistical methods. Validation experiment results were analyzed to estimate intra- and inter-assay imprecision and linearity. A variance components model was used to calculate intra- and inter-assay coefficients of variation (CV) for the LK, SLK and plasma sample results from the 20 independent experiment batches run on ten assay plates [25]. This method separates within-plate variance from between-plate variance to more accurately reflect the sources of variability in imprecision estimates. CVs were calculated with results grouped by plate lot, by calibrator lot, and by user, as well as in aggregate. CV is used to simplify interpretation of estimates of assay variability, but because it is a ratio of standard deviation to mean concentration, it tends to overstate variation at lower concentrations. Linearity was estimated using three dilutions of both the LK and SLK and was calculated by dividing the concentration of a diluted sample by the concentration of the next higher dilution and multiplying by the dilution factor, with the result expressed as a percentage.
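The two calculations described above can be sketched as follows. The variance components estimate shown is a standard balanced one-way random-effects decomposition and is offered as an assumption about the general approach; the method cited in [25] may differ in detail (for example, in whether inter-assay CV is reported as the between-run component alone or as total variability). The linearity function assumes that "next higher dilution" means the next more concentrated preparation and that the dilution factor is the relative factor between the two preparations. All data values are invented.

```python
import math


def variance_component_cvs(runs):
    """Intra- and inter-assay %CV from a balanced one-way layout.

    runs: list of equally sized lists, one list of replicate results per run.
    Returns (intra_cv_percent, inter_cv_percent).
    """
    k, n = len(runs), len(runs[0])
    if k < 2 or n < 2 or any(len(r) != n for r in runs):
        raise ValueError("need at least 2 runs, each with the same n >= 2 replicates")
    run_means = [sum(r) / n for r in runs]
    grand_mean = sum(run_means) / k
    # Within-run mean square: pooled replicate variance.
    ms_within = sum((x - m) ** 2 for r, m in zip(runs, run_means) for x in r) / (k * (n - 1))
    # Between-run mean square, then the between-run variance component.
    ms_between = n * sum((m - grand_mean) ** 2 for m in run_means) / (k - 1)
    var_between = max(0.0, (ms_between - ms_within) / n)
    intra_cv = 100.0 * math.sqrt(ms_within) / grand_mean
    inter_cv = 100.0 * math.sqrt(var_between) / grand_mean
    return intra_cv, inter_cv


def linearity_percent(conc_diluted, conc_reference, relative_dilution):
    """Measured diluted concentration, scaled by the relative dilution,
    as a percentage of the more concentrated preparation."""
    return 100.0 * conc_diluted * relative_dilution / conc_reference


# Invented example data: two runs of duplicate measurements.
intra, inter = variance_component_cvs([[10.0, 12.0], [14.0, 16.0]])
# Invented example: 100 units undiluted, 24 units measured at 1:4.
lin = linearity_percent(24.0, 100.0, 4)
print(f"intra {intra:.1f}%  inter {inter:.1f}%  linearity {lin:.0f}%")
```

Because both CVs divide by the grand mean, the same absolute scatter yields a larger CV at low concentrations, which is the overstatement the text refers to.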

Inter-laboratory assessment
Donor panel. While the validation work was carried out with a mixture of serum and plasma samples, we have previously demonstrated comparable results between paired serum and heparinized plasma samples [22]; thus we were confident plasma samples would be appropriate for the inter-laboratory assessment. Plasma samples from a study of micronutrient status among pregnant women in Niger were used to generate test panels for the inter-laboratory assessment [23]. Samples were collected as part of a cross-sectional study embedded into the Niger Maternal Nutrition (NiMaNu) Project, which was registered with the U.S. National Institutes of Health (www.ClinicalTrials.gov; NCT01832688) [26]. The National Ethical Committee (Niger) and the Institutional Review Board of the University of California Davis (UC Davis; USA) provided ethical approval for the study protocol and the consent procedure. The local implementation was under the responsibility of Helen Keller International (Niamey, Niger), who followed relevant national regulations and laws applying to project implementation and foreign researchers. Written informed consent was obtained from all study participants.
A total of 18 rural health centers from 2 health districts in the Zinder Region were selected to participate in the NiMaNu project. In each community, pregnant women were randomly selected and invited to participate in the survey. They were eligible if they provided written informed consent, had resided in the village for at least six months, and had no plans to move within the coming two months. As part of the NiMaNu study, venous blood samples were collected and used to prepare heparinized plasma and DBS cards. PATH signed a material transfer agreement (MTA) with UC Davis and then each of the participating laboratories in this study signed an MTA with PATH prior to receipt of test materials.

Construction of blinded test panels.
The NiMaNu panel of 208 plasma samples was previously analyzed using the 7-plex [23]. As the 7-plex requires only 13.5 μL of plasma per test, multiple samples within this panel had a significant residual volume (>300 μL) of plasma. Using the original 7-plex data, seventy-eight samples with concentrations representing the full range for each analyte were chosen for sub-aliquoting to create 16+ identical panels consisting of 78 separate 20 μL plasma samples as follows: the frozen plasma was thawed on ice, spun briefly in a microfuge and pipetted into sterile screw cap tubes, which were then stored at -80˚C (Fig 1). Two of these samples were randomly chosen from the panel and all 19 of the aliquots prepared from these samples were assessed by the 7-plex to test for tube-to-tube variability that might have been introduced during sub-aliquoting. Both samples had an intra-assay CV < 10% for each analyte (S1 Table), confirming analyte uniformity across sample tubes. The original specimen identifiers of the remaining samples were replaced with sequential numbering from 1 to 76, effectively blinding the labs previously involved in studies that used specimens from this panel. The samples were stored at -80˚C until shipment to the partner laboratories.
In addition to the 76 member Niger heparin plasma panel samples, Quansys Biosciences prepared QC samples, named G and H, representing both high and low analyte values to be run on each plate (Fig 1). These quality controls were used to evaluate whether each plate used during this study would meet acceptance criteria ideally applied in the routine use of the kit. The controls were prepared by spiking serum with purified biomarkers as needed to reach the desired concentration of each biomarker [23]. Prior to distribution, the G and H controls were quantified by Quansys via a series of twenty independent test runs using the 7-plex to determine the expected values of all 7 biomarkers (S2 Table).
Laboratories. Seven distinct laboratories offered to be part of the inter-laboratory performance study, each providing data from at least one, and ideally two, laboratorians per facility. Laboratories at PATH, the University of Washington, Quansys, and UC Davis had previously collaborated to develop and verify the performance of the Human Micronutrient assay [22][23][24] (Fig 1). Other laboratories, including ones from the US Centers for Disease Control and Prevention (CDC, GA), Eurofins Craft Technologies, Inc. (NC), Binghamton University (SUNY), and the University of British Columbia have also been independently evaluating the performance of the Human Micronutrient assay [27][28][29][30]. Once each laboratory had signed the MTA to access the samples, two complete sets of 76 heparin plasma samples and two of the G and H quality control sets were shipped on dry ice via overnight courier. Recipients acknowledged the panels' integrity (frozen with dry ice still in packaging) upon arrival and stored them at -80˚C until assay. The manufacturer of the assay, Quansys, was excluded from the study in order to limit bias, as their technical staff are most familiar with the platform and they manufacture and market the Human Micronutrient assay kit. Prior to performing testing, all laboratories were offered a training webinar hosted by an experienced Q-plex user (E. Brindle) to ensure that each study laboratorian was familiar with the test protocol and data analysis methods. All of the array kits used in the inter-laboratory assessment exercise were from the same manufacturing lot. Each plate image was saved and reviewed by an expert user (E. Brindle) to confirm consistency in software settings used to fit calibration curves and report results (Fig 1).
Assessment of laboratory equipment and user capability to operate the Q-plex assay. To understand effects of user skills and experience and status of laboratory equipment on results, a questionnaire was distributed prior to testing to collect details from each laboratory. Each operator completed a questionnaire to determine their level of previous experience with the 7-plex, and experience with quantitative immunoassays (Fig 1). An inventory of equipment summarized maintenance histories for items necessary for use with the 7-plex, and specified the plate washing method. Experience and equipment status questionnaire results were summarized by assigning a scale value to each element, scoring each factor as follows: Lab operator experience (2 elements, 1 to 3 scale, with 3 as most experience), Quansys software experience (0 to 1 scale, 1 is experienced), automated plate washer availability (0 to 1 scale, 1 is available), and recency of calibration (2 elements, 1 to 3 scale with 3 as most recent). Scores were totaled to derive a summary score ranging from 0 (no experience, poor equipment status indicators) to 14 (extensive user experience, all equipment present and recently calibrated).
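The summary score described above is a simple sum of six items, which can be sketched as follows. The field names are hypothetical labels for the questionnaire elements, and the two calibration items in particular are not itemized in the text. Under the scales as literally stated, the highest attainable total is 14, matching the text, while the lowest attainable total would be 4; the text does not say how a score of 0 would arise.

```python
from dataclasses import dataclass


@dataclass
class Questionnaire:
    # Hypothetical field names; the scales follow the description in the text.
    seven_plex_experience: int      # 1 to 3, 3 = most experience
    immunoassay_experience: int     # 1 to 3, 3 = most experience
    software_experience: int        # 0 or 1, 1 = experienced
    plate_washer_available: int     # 0 or 1, 1 = available
    calibration_recency_item1: int  # 1 to 3, 3 = most recent
    calibration_recency_item2: int  # 1 to 3, 3 = most recent

    def summary_score(self):
        """Unweighted sum of all six questionnaire items."""
        return sum(vars(self).values())


best = Questionnaire(3, 3, 1, 1, 3, 3).summary_score()
lowest = Questionnaire(1, 1, 0, 0, 1, 1).summary_score()
print(best, lowest)
```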
Inter-laboratory statistical methods. Values below the lower limit of quantification (LLOQ) for each analyte were excluded from analyses. Results of the quality control samples run on every plate were evaluated to determine whether the plates would meet acceptance criteria that, for the purposes of this study, were intentionally less stringent than would generally be permitted, whereby at least one control result should have any 6 of the 7 analyte results falling within a 95% confidence interval calculated from all plates in the study. Because the intent of this exercise was to evaluate reproducibility, all plates were included in nearly all subsequent analyses. The effect of excluding data from plates that failed these acceptance criteria was considered separately. Inter-assay CVs were calculated to evaluate the performance between the 7 labs and intra-assay CVs were calculated to evaluate the performance within each of the 7 labs. Intra-assay CVs for duplicate wells of the test samples were averaged for each analyte on each plate, and then plate averages were aggregated across analytes to summarize intra-assay CV averages by lab and by operator. Inter-assay CVs were calculated across all plates (n = 12) for each sample (n = 76); inter-assay CVs were then averaged to summarize inter-assay CV for each analyte. Agreement between results across laboratories was assessed using Lin's concordance correlation coefficient (CCC) [31]. Results from assays conducted in the PATH and UW labs by the three operators with the most experience using the 7-plex were averaged to create a comparison set that was compared to each of the nine remaining assay batches from five labs. Lin's CCC was calculated using STATA version 15.1 (StataCorp, College Station, TX USA).
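As an illustration of the agreement statistic, the sketch below computes Lin's concordance correlation coefficient from its usual definition, using variances and covariance with an n denominator. It is not the Stata routine used in the study, and any small-sample conventions applied there are not reproduced. The data are invented.

```python
def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y)) ** 2),
    with variances and covariance computed using an n denominator.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Invented data: a constant offset of 1 unit between two sets of results.
reference = [1.0, 2.0, 3.0]
shifted = [2.0, 3.0, 4.0]
ccc_same = lins_ccc(reference, reference)
ccc_shifted = lins_ccc(reference, shifted)
print(ccc_same, ccc_shifted)
```

Unlike Pearson's correlation, which would equal 1 for both comparisons, the CCC penalizes the constant offset, which is why it is suited to asking whether laboratories report the same values rather than merely correlated ones.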

Validation of 7-plex performance in a single lab
All test data can be publicly accessed at Dataverse (https://dataverse.harvard.edu/dataverse/micronutrient_immunoarray). The data derived from 20 independent replicate experiments run on ten 7-plex assay plates in the PATH lab were used to evaluate the precision (intra- and inter-assay, n = 13 samples) for each assay (see Fig 1). Table 1 provides a summary of results from the validation sample with a value closest to the relevant cutoff concentration for each analyte. The intra-assay CV for each analyte was less than 5%, with the exception of ferritin (5.8%), and all inter-assay CVs were less than 15% (Table 1). These are the accepted maximum CVs for ELISAs and comparable to CVs observed previously in the manufacturer's evaluation of the 7-plex [23,32]. S3 Table includes all results, including those outside assay limits of quantification; average intra-assay CV was below 5% for all analytes. One plasma sample gave an intra-assay CV of 15.7% with the ferritin assay. However, the concentration in this particular sample was around the LLOQ, and as the generally acceptable threshold at this concentration is 20%, this was still considered acceptable [32]. Average inter-assay CV was ≤15%, with the CV for most samples below 10% for each analyte.
Tests of assay linearity showed no evidence of systematic non-parallelism across dilutions for any analyte. All biomarkers, apart from ferritin, had a linearity of 96% to 99%. Ferritin gave poorer linearity of 57% with SLK samples; however, it was noted that the Tg added to the spiked Liquichek was derived and concentrated from whole blood, so adding it also increased the concentration of ferritin to above the limit of quantification in the high dilution samples. In the unspiked Liquichek the linearity improved to 83%, though this was still substantially lower than the other linearity values observed. The pooled data presented in Table 1 and S3 Table demonstrate that different operators and/or plate lots did not impact performance. Overall, the results confirmed the previously reported findings and the assay was considered suitable for the subsequent inter-lab study [23].

Inter-laboratory assessment
Eleven operators in seven labs tested the full set of 76 plasma samples with the 7-plex (Table 2). In most cases, each laboratorian tested the entire panel of 76 plasma samples only once (requiring two assay plates per operator). In one laboratory, a single operator assayed the entire panel of 76 plasma samples twice (for a total of four assay plates). In four laboratories, two different users tested the complete panel. Overall, each specimen was tested in duplicate wells 12 times (i.e. 24 data points). All laboratories completed testing within 3 months of each other, and samples were kept frozen until the day of assay. While many of the partners had very limited experience with the Q-plex platform, their laboratorians did have variable levels of experience in performing other immunoassays. All laboratories had the required equipment, including multichannel pipettors and rotating plate shakers. All labs but two had an automated plate washer; the remaining labs used the manual plate washing protocol described by Quansys in the kit instructions for use. Each laboratory had a Q-View imager and analysis software necessary for reading the 7-plex plates and for processing the raw data into concentration values for each analyte. Scores were tallied with overall scores for each lab/laboratorian shown in Table 2. One lab had the maximum possible score of 14, indicating a highly experienced laboratorian with access to recently calibrated equipment, while the minimum observed score was 6, indicating a laboratorian who had limited experience with ELISAs, was not familiar with the Q-View software, and did not have access to a plate washer.
The G and H quality control samples were included in duplicate on every plate (Fig 1), with results summarized in Table 2. Multiple analyte results for both controls on one plate were outside the 95% confidence interval derived from all plates included in this study. Normally this would indicate a QC failure, meaning that the test results were not acceptable; however, as this experiment was intended to assess variability across users and laboratories, results for all plates were included in subsequent analyses summarizing the intra- and inter-assay CVs irrespective of whether they passed QC or not. Summaries of the G and H quality control sample measures, intra-assay CVs for all samples and inter-assay CVs for two quality control samples, are shown by laboratory and operator in Table 2. Values outside the limits of quantification were excluded. The maximum possible number of valid results for each lab and each operator varied because of the different numbers of plates tested. Numbers of valid results also differed because some samples had concentrations near the limits of the quantification range, calculated from mean and standard deviations for replicate standard wells on each plate. Some specimens were within range on some plates and out of range on others; thus, the number of out-of-range values is reflected in the numbers of valid results included in Table 2. For intra-assay CV, all well-to-well CVs were averaged regardless of analyte; the averages ranged from 3.1 to 8.8%, with two results above the 5% threshold coming from one laboratory. Inter-assay CVs calculated using the two quality control samples (G and H) also showed differences by laboratory.
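The relaxed plate acceptance rule used in this study (at least one control with any 6 of its 7 analyte results inside the 95% interval) can be sketched as follows. The interval limits and control values are invented placeholders, since the text does not give them or state how the intervals were computed, and the function names are introduced only for illustration.

```python
ANALYTES = ["AGP", "CRP", "ferritin", "HRP2", "RBP4", "sTfR", "Tg"]


def control_within_limits(results, limits, min_in_range=6):
    """True if at least `min_in_range` analyte results for one control
    fall inside their (lower, upper) limits."""
    in_range = sum(
        1 for analyte, value in results.items()
        if limits[analyte][0] <= value <= limits[analyte][1]
    )
    return in_range >= min_in_range


def plate_passes(controls, limits_by_control):
    """A plate passes if at least one of its controls is within limits."""
    return any(
        control_within_limits(controls[name], limits_by_control[name])
        for name in controls
    )


# Invented limits and results, in arbitrary units (assumptions).
limits = {name: (1.0, 2.0) for name in ANALYTES}
limits_by_control = {"G": limits, "H": limits}
in_range = {name: 1.5 for name in ANALYTES}
one_out = dict(in_range, Tg=9.0)
three_out = dict(in_range, Tg=9.0, CRP=0.1, AGP=5.0)

plate_ok = plate_passes({"G": one_out, "H": three_out}, limits_by_control)
plate_bad = plate_passes({"G": three_out, "H": three_out}, limits_by_control)
print(plate_ok, plate_bad)
```

A routine-use rule would typically be stricter, for example requiring both controls to pass, which amounts to replacing `any` with `all` in this sketch.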
Table 3 shows inter-assay CV by analyte for the panel of 76 heparin plasma samples run across all seven laboratories. HRP2 had the highest inter-assay CV (31.1%) but is the only analyte not intended to be interpreted quantitatively (i.e., it is a qualitative assay), and the test results were at the lower range of the calibration standard where the greatest variance is observed. Fig 2 shows the full distribution of results for each sample for every analyte, with the results plotted by the rank order of the mean concentration calculated using data from the plates tested in the PATH and UW laboratories (n = 3 plates per sample). The plots show the greatest scatter around these means at the lowest and highest concentrations. In general, inter-assay CVs were higher for those samples at the extremes of the analyte calibration ranges (S1 Fig). Mean and standard deviation (SD) results for each assay batch for the panel of 76 plasma samples and a measure of agreement between the results across labs assessed using Lin's concordance correlation coefficient are shown in Table 4. Rather than comparing each batch in a pairwise test against all other results sets for the sample panel, a predicate set of results was derived by averaging results from batches run by the three operators at the PATH and UW labs who had the most experience with the 7-plex assay. Lin's rho was generally high, most often r_c ≥ 0.9, and averaging r_c > 0.8 for all analytes. The ferritin assay had the highest concordance of all the analytes (r_c averaging 0.958), while the concordance was lowest for CRP (r_c averaging 0.820). Concordance was notably lower across several analytes for the two batches from lab 7. Removing those batches increased the average Lin's rho for all analytes except ferritin. This result is consistent with the higher intra- and inter-assay CVs from that lab (Table 2), possibly reflecting the impact of imprecision on the concordance estimates.

Table 3 note. Inter-assay CV calculated across all plates (n = 12) for each sample (n = 76); inter-assay CVs were then averaged for each analyte. CV calculations are shown with and without two plates with quality control specimen values outside the 95% confidence intervals (calculated from all plates included in this study) for multiple analytes. https://doi.org/10.1371/journal.pone.0259509.t003

Fig 2 legend. Results are plotted on Log10 Y axes to reveal proportional differences at lower concentrations and have been sorted by rank order of the mean concentration from 3 plates (2 PATH, 1 UW). Horizontal line, optimal 7-plex cutoff value; for HRP2, line represents approximate pixel intensity corresponding to the cutoff concentration. Cutoff values were determined by ROC curve analysis using results from the NiMaNu study as a gold standard, and using the cutoff thresholds applied in that study [23]. Black hash marks on y-axes indicate lot-specific upper and lower limits of quantification (see S4 Table); values have been adjusted to account for 1:10 sample dilution used for all samples. Results out of range are plotted as the limit values noted in S4 Table. AGP, α-1-acid glycoprotein; CRP, C-reactive protein; HRP2, histidine rich protein 2; RBP4, retinol binding protein 4; sTfR, soluble transferrin receptor; Tg, thyroglobulin. https://doi.org/10.1371/journal.pone.0259509.g002

Table 4. Mean, standard deviation, and Lin's concordance correlation coefficient for 76 heparin plasma samples tested in 7 labs.

Discussion and conclusions
This study evaluates the performance of a multiplex micronutrient surveillance tool that quantifies biomarkers of vitamin A, iron, and iodine deficiency, inflammation or infection, and malaria through validation experiments to estimate precision and linearity within a single lab, along with assessing inter-laboratory reproducibility. Our within-lab validation experiments repeat and expand upon previously reported assay performance evaluations [23]. The validations described here were conducted independently in PATH's laboratory and represent an expert user's experience of assay performance characteristics across 20 repeated experiments. The intra-assay CVs were good in this validation with only one biomarker, ferritin, being slightly out of range. For most analytes and most samples, the inter-assay CV was under 15% (Table 1 and S3 Table), which is an accepted maximum inter-assay CV for ELISAs and comparable to CVs observed previously for the 7-plex assay. Because the specimens we used for replication (e.g., commercially available control specimens with and without spiking with sTfR and Tg) included values at the low and high ends of the assay range, where estimated values can be less precise, higher CVs were to be expected. It was not possible to produce more concentrated versions with these biomarkers without significantly diluting the other analytes. While the plasma specimens were selected from our in-house panel for the validation study based upon the highest sTfR measurements, concentrations of sTfR and Tg in each sample were still lower than their respective concentrations in the SLK. This indicated that while the range used in the validation studies was lower than the range of quantification of the 7-plex assay, it still reflected the range found in most clinical samples. Tests of assay linearity showed no evidence of systematic non-parallelism across dilutions for any analyte.
Inter-assay CVs for biomarker measurements in the panel of 76 plasma specimens measured by 11 different operators in seven laboratories averaged 20.0%, with imprecision estimates highest in the semi-quantitative HRP2 assay (31.1%), and higher than the generally accepted range of error for sTfR (25.4%) and CRP (21.2%). Lin's CCC was generally high (rc ≥ 0.8 for 54 of 63 comparisons), but showed the same pattern observed in CVs, with two low (≤0.5 for both AGP and CRP) concordance results from one lab.
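For readers less familiar with the concordance statistic reported above, the following is a minimal sketch of how Lin's CCC can be computed for one analyte measured on the same specimens in two laboratories. It is an illustration only, not the analysis code used in this study: the function name `lins_ccc` and the example values are hypothetical, and the sketch uses population (1/n) moments.

```python
from statistics import fmean


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Illustrative helper (hypothetical name), using 1/n variances and covariance.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Unlike Pearson's r, the denominator penalizes a shift between the means.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical results for the same five specimens in two labs (not study data).
lab_a = [1.0, 2.0, 3.0, 4.0, 5.0]
lab_b = [1.5, 2.5, 3.5, 4.5, 5.5]
print(round(lins_ccc(lab_a, lab_b), 3))
```

In this toy example the two labs are perfectly correlated, yet the constant offset of 0.5 pulls the coefficient below 1, which is why the CCC is better suited than a simple correlation for judging agreement between laboratories.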
Some of the imprecision is attributable to including specimens at the very low or very high ends of the working assay ranges. For CRP in particular, most of the imprecision is due to variability at very low concentrations, all of which were below levels that indicate infection or inflammation (S1 Fig). Results also varied between individual laboratories, with one laboratory generally having higher inter-assay CVs (averaging 28.1%) for quality control samples than the six other labs (ranging from 5.6 to 14.1%). One significant difference between this laboratory and most others was the absence of an automated microtiter plate washer. It is possible that manual washing compromises assay precision; further testing is needed to confirm this. When the two plates from this laboratory and one other plate with quality control results outside a 95% confidence interval are excluded, the average inter-assay CV decreases from 20.0% to 17.2%, and it decreases further to 16.2% when the semi-quantitative HRP2 assay is also excluded.
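The effect of excluding plates that fail quality control can be illustrated with a short sketch. This is a simplified example under stated assumptions, not the method used in this study: it computes the CV as sample SD divided by the mean rather than with the Rodbard variance components model, and the function names, data layout, control acceptance limits, and values are all hypothetical.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation in percent: sample SD / mean * 100."""
    return stdev(values) / mean(values) * 100


def inter_assay_cv(plates, control_limits=None):
    """Average per-specimen CV across plates.

    If control_limits=(low, high) is given, plates whose control result
    falls outside those limits are dropped before the CV is computed.
    """
    if control_limits is not None:
        low, high = control_limits
        plates = [p for p in plates if low <= p["control"] <= high]
    if len(plates) < 2:
        raise ValueError("at least two plates are needed to estimate a CV")
    specimen_ids = plates[0]["results"].keys()
    return mean(
        percent_cv([p["results"][s] for p in plates]) for s in specimen_ids
    )


# Hypothetical plates: one control value and two specimen results per plate.
plates = [
    {"control": 10.0, "results": {"S1": 10.0, "S2": 20.0}},
    {"control": 10.5, "results": {"S1": 11.0, "S2": 22.0}},
    {"control": 9.5, "results": {"S1": 9.0, "S2": 18.0}},
    {"control": 14.0, "results": {"S1": 14.0, "S2": 28.0}},  # fails QC
]
cv_all = inter_assay_cv(plates)
cv_qc = inter_assay_cv(plates, control_limits=(9.0, 11.0))
print(f"all plates: {cv_all:.1f}%, QC-passing plates: {cv_qc:.1f}%")
```

In this toy data set, a single plate with an out-of-range control roughly doubles the apparent inter-assay CV, which mirrors the direction of the change reported above.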
Because the 7-plex assay method was designed for use as a surveillance tool in LMIC and academic research facilities, the needs and challenges for assay performance are different from those associated with clinical laboratories. Assay methods used for these purposes have in the past been selected in an ad hoc fashion, as practical and financial constraints differ from those encountered in clinical laboratories. Laboratories in these settings are not routinely testing a steady number of samples as clinical laboratories do. Instead, research laboratories are engaged sporadically to assay large numbers of specimens over a short time frame. Maintaining consistency under that sporadic workflow is a particular challenge, even within laboratories. For micronutrient status surveillance, monitoring across time and space is necessary for assessment of progress, but this presents laboratory challenges. A single method that measures key indicators of nutritional status as a single tool, rather than a collection of assays from various sources assembled for individual surveys, offers an opportunity to greatly improve comparability across sets of data. These benefits are realized only if the assay results are reproducible across laboratories. The results here suggest that the 7-plex can provide generally reproducible results across laboratories, with imprecision only slightly above the range acceptable for results generated within a single laboratory. They also highlight the need to include internal quality control specimens on every plate and, where precise estimates are needed for values at the physiological extremes, to repeat testing at adjusted dilutions for specimens with concentrations at the margins of the assay range.
Work is ongoing to improve the performance of the 7-plex assay and includes improvements to the ferritin, RBP4, and sTfR assays. A challenge inherent to multiplex assay methods is simultaneously optimizing the assay performance for both high- and low-abundance proteins. Forthcoming improvements to assay sensitivity for ferritin will allow the recommended sample dilution to change from 1:10 to 1:40. This change is expected to improve precision for RBP4 assay values by bringing them closer to the middle of the reportable assay range, addressing a previous criticism of the 7-plex [27,28]. Similarly, changes to the sTfR assay to improve precision are underway. A new version of the 7-plex assay with these improvements is expected within the coming year.
The project team also recognizes a need for evaluations comparable to those described here to be conducted in a field study in low- and middle-income countries (LMICs). In this setting, working protocols can be developed and validated with country partners to properly implement the 7-plex in support of nutrition research and interventions.
Our multiplex tool has the potential to reduce the labor, supplies, and sample volumes traditionally required for MN screening. By demonstrating comparable performance across multiple laboratories and users, we believe this assay can be a key tool for identifying populations with key micronutrient deficiencies, as well as a monitoring tool following any subsequent interventions, generating high-quality, reproducible data. The Quansys system is a low-cost technology that could be easily implemented in laboratories in LMICs where ELISAs are routinely used, and it can be applied to multiple other biomarkers and sample types beyond the 7-plex described in this work [33–36].
Supporting information
S1 Table. Assay precision for two specimens across 19 replicate aliquots. Within and between tube variation in measures from 19 aliquots for each of two samples selected at random from aliquots prepared for distribution to partner labs. CVs were calculated using the Rodbard variance components model. CV for HRP2 was not calculated because results were above the upper limit of detection. CV, coefficient of variation; AGP, α-1-acid glycoprotein; CRP, C-reactive protein; HRP2, histidine rich protein 2; N/A, not available; RBP4, retinol binding protein 4; sTfR, soluble transferrin receptor; Tg, thyroglobulin. (DOCX)
S2 Table. The established values for the G and H controls developed for the 7-plex array. The expected value for each biomarker is shown in addition to the acceptable values for upper and lower limits. AGP, α-1-acid glycoprotein; CRP, C-reactive protein; HRP2, histidine rich protein 2; N/A, not available; RBP4, retinol binding protein 4; sTfR, soluble transferrin receptor; Tg, thyroglobulin. (DOCX)
S3 Table. Assay precision and linearity for 7-plex array. AGP, α-1-acid glycoprotein; CRP, C-reactive protein; HRP2, histidine rich protein 2; N/A, not available; RBP4, retinol binding protein 4; sTfR, soluble transferrin receptor; Tg, thyroglobulin; LK, Liquichek LK; SLK, spiked Liquichek. After each analyte in parentheses are the LLOQ and ULOQ, respectively. (DOCX)
S4 Table. The upper and lower limits of quantification for each biomarker using the 7-plex assay. AGP, α-1-acid glycoprotein; CRP, C-reactive protein; HRP2, histidine rich protein 2; RBP4, retinol binding protein 4; sTfR, soluble transferrin receptor; Tg, thyroglobulin.