Discrimination of Influenza Infection (A/2009 H1N1) from Prior Exposure by Antibody Protein Microarray Analysis

Reliable discrimination of recent influenza A infection from previous exposure using hemagglutination inhibition (HI) or virus neutralization tests is currently not feasible. This is due to low sensitivity of the tests and the interference of antibody responses generated by previous infections. Here we investigate the diagnostic characteristics of a newly developed antibody (HA1) protein microarray using data from cross-sectional serological studies carried out before and after the pandemic of 2009. The data are analysed by mixture models, providing a probabilistic classification of sera (susceptible, prior-exposed, recently infected). Estimated sensitivity and specificity for identifying A/2009 infections are low using HI (66% and 51%), and high when using A/2009 microarray data alone or together with A/1918 microarray data (96% and 95%). As a heuristic, a high A/2009 to A/1918 antibody ratio (>1.05) is indicative of recent infection, while a low ratio is indicative of a pre-existing response, even if the A/2009 titer is high. We conclude that highly sensitive and specific classification of individual sera is possible using the protein microarray, thereby enabling precise estimation of age-specific infection attack rates in the population even if sample sizes are small.


Introduction
Yearly epidemics of influenza A are the cause of a variable burden of disease that can be substantial in years with high influenza activity [1][2][3][4]. To date, the methods of choice for classification of individuals as infected, immune, or susceptible using serum are the virus neutralization, complement fixation, and hemagglutination inhibition (HI) tests. These tests have a long history, have been validated against positive and negative samples, and have proved their value in countless studies.
Traditionally, the gold standard for detecting influenza infections is by the use of paired serum samples, the first taken in the acute phase of infection and the other several weeks later. A significant (usually fourfold) increase in antibody titers is subsequently taken as evidence for recent infection. In practice, however, it is both costly and logistically challenging to obtain such samples. Consequently, residual or other one-point serological samples are often used instead, and classification is based on a high antibody titer in the one-point sample. Such classifications, however, may lack in sensitivity, especially when it comes to distinguishing between persons that have been infected recently and persons that have been infected with similar viruses in the past.
Moreover, in comparative studies when multiple antigens need to be tested the traditional tests are laborious, and need a significant amount of serum. Recent studies have made increasing use of novel diagnostic assays based on protein microarrays [5][6][7][8]. Advantages of the protein array are the smaller volumes of blood, the possibility of simultaneous testing of samples against multiple antigens, and potentially the test characteristics.
In the Netherlands, two serological studies had been conducted before and after the H1N1 pandemic of 2009 [9]. In these studies, samples had been analysed with HI to obtain estimates of the agespecific attack rates, by comparison of post-versus pre-pandemic seropositivity. Here, we analyse a subset of these samples with the newly developed protein microarray. Our aims are to explore the diagnostic characteristics of the microarray, and in particular to investigate whether the microarray would enable reliable classification of persons as being recently infected (with A/2009 H1N1), or having a response resulting from infection(s) in previous years.
The data are analysed using mixture models. In contrast to traditional analyses which use a fixed cut-off value to classify each sample into one class (susceptible, immune, recently infected), mixture models estimate the probability that a sample belongs to one of these classes. Hence, mixture models provide a natural way to include uncertainty in the classification procedure, and also enable investigation of optimal cut-off values [9,10].
Two age-stratified population based surveys had been conducted in the Netherlands before and after the pandemic of 2009 [9]. Here, we analyse a structured random subset containing 167 and 190 sera from the earlier study (Table S1). The two samples are stratified by age (0-4, 5-9, 10-19, 20-44, 45-64, and 65+ years), as recommended by the Consortium for the Standardization of Influenza Seroepidemiology (consise.tghn.org). Further, children under the age of five are excluded due to the small number of participants [9], and persons receiving pandemic vaccinations and elderly (65+ years) are excluded because of the interference of vaccination with the test results [8]. We also excluded sera from the pre-pandemic survey that had been collected after 12 th of October 2009, which marks the onset of sustained transmission in the Netherlands.
The aim of the earlier study was to obtain estimates of agespecific infection attack rates, and sera had been analysed with a hemagglutination inhibition test (HI). Most of the samples in the earlier study tested negative using HI. To prevent a random sample being drawn that contains mostly test negative sera, we stratify the sampling procedure by HI titer. One group contains sera that tested negative, one group contains sera with a low to intermediate standardised HI titer (positive but ,40; henceforth called intermediate titer), and one group contains all sera with a intermediate to high standardised HI titer ($40; henceforth called high titer). This procedure stratifies the population by age, (standardised) HI titer, and survey (pre-versus post-pandemic). Two strata contain no data, as all persons aged 5-9 years tested negative in the pre-pandemic sample. For the remaining 28 groups we have drawn a random subset for analysis (Table S1). The original surveys contained two random samples of the population, and therefore so do the subsets. Our stratification scheme enables weighing of the sera in the subset to represent a random sample from the Dutch population.
The study was approved by the Medical Ethical Testing Committee of Utrecht University (Utrecht, the Netherlands), according to the Declaration of Helsinki (protocol 66-282/E). Written informed consent was given by participants (or next of kin/caregiver in the case of children) for suitably anonymised clinical records to be used in this study.

Hemagglutinin (HA1) microarray
The subset of sera from the original study was analysed with a microarray as described earlier [5][6][7][8]. Briefly, recombinant proteins were produced in human embryonic kidney cells (HEK293) and purified by HIS-tag purification (purity more than 95%), as specified by the manufacturer (Immune Technologies, New York, USA). Oncyte avid nitrocellulose film-slides containing 64 pads per slide were used (Grace bio-labs, Bend, USA), and spot signals were quantified by the use of a Scanarray scanner (Perkin Elmer, Waltham, USA) using an adaptive circle quantification method. Finally, conjugates consisted of goat anti-human IgG (Fcfragment specific) conjugated with Dylight649-fluorescent dye (Jackson Immuno Research, West Grove, PA, USA). Table 1 shows the antigens included in the study. Notice that next to the antibody response against the A/2009 (H1N1) pandemic virus, we tested the samples against a range of other antigens, among which A/1918 (H1N1). The hemagglutinin of H1N1 virus of 1918 is genetically and antigenically related to the 2009 virus [11][12][13][14]. Readers of each test (HI and microarray) were blind to results of the other tests, and had no access to ancillary information (age, sex).

Mixture model
We use a mixture model to provide a probabilistic classification of individual samples and estimate age-specific infection attack rates. The mixture model contains three component distributions that model the responses across age groups. The first distribution describes samples of low antibody titer, pertaining to susceptible persons. The second distribution describes samples of intermediate antibody titers and aims to identify persons that have pre-existing antibodies, hereafter named the immune component distribution. The third distribution describes samples of high titer, and aims to identify persons infected during the pandemic.
The susceptible and immune component distributions are fitted to pre-and post-pandemic data, while the infected component distribution is fitted to the post-pandemic data only. We assume that there are no age dependencies in the component distributions, and fit Gaussian distributions to the log 2 antibody titers. We collect the means (m sus , m imm , and m inf ) and standard deviations (s sus , s imm , s inf ) of the distributions in parameter vectors (h sus , h imm , h inf ), and denote by f(x; h) the densities of the distributions. The weights of the distributions are determined by two mixing parameters per age group, viz. q a , the probability that a person in age group with label a belongs to the immune component, and p a , the probability that a person with age label a is in the infected component. Hence, 1-q a and 1-p a -q a are the probabilities that a person belongs to the susceptible component in the pre-and postpandemic surveys. Notice that we make the implicit assumption that the fraction of persons in the immune component remained constant in the short time span (#6 months) between the two surveys. In the following, the age-specific weights are collected in vectors p and q. At the individual level, the probability that a person in the post-pandemic survey with age label a is infected is given by the product of the mixing parameter p a and the local density of the infected component distribution, normalised by the sum of these quantities over all component distributions (susceptible, immune, infected).
The statistical analyses are based maximization of the loglikelihood. In the following we denote by n pre and n post the number of samples in the pre-and post-pandemic survey, by d pre (i) the log 2 antibody titer of sample i in the pre-pandemic study, by The total log-likelihood is given by the sum of the pre-and postpandemic log-likelihoods. In practice, the above formulations need to be adapted slightly to account for left-censoring of samples below the detection limit [9]. Notice furthermore that HI measurements are interval-censored, as the data are based on analysis of serial dilutions, and this has been taken into account in the analysis of HI data [9].
To investigate whether classification of individual samples can be improved by the inclusion of a second antigen, we extend the univariate mixture model described above to a bivariate mixture model. The analysis of the extended model runs along the same lines as outlined above, the main difference being that the component distributions are now specified not by a single mean and standard deviation, but by two means (e.g., m (2009) imm and m (1918) imm ), two standard deviations ( (2009) imm and (1918) imm ), and a covariance (r imm ). Hence, the equations remain the same, but in this case the parameter vectors contain five instead of two elements.

Estimation
The mixture models are fitted using Markov Chain Monte Carlo methods. Specifically, we use a random walk metropolis algorithm with normal proposal distributions and the current value as mean [15]. For each analysis, we run the process for 100,000 cycles, and obtain a thinned sample of 24,000 after a burn-in of 4,000? Convergence and mixing are assessed visually. A maximum likelihood estimate of the parameters is obtained, and limits of 95% parameter confidence intervals are determined by taking 2?5% and 97?5% quantiles. All statistical procedures have been programmed in R version 3.0.0.

Results
There is a positive overall correlation between HI and the microarray response to A/2009 (Kendall's tau = 0?45, p-value , 0?001). The correlation is stronger in the post-pandemic study (tau = 0?57, p-value ,0?001) than in the pre-pandemic study (tau = 0?28, p-value ,0?001), and is strongest in young children (5-9 years) in the post-pandemic study (tau = 0?77, p-value ,0?001). A further comparison shows that 100 out of 357 samples (28%) test negative in HI but have a positive response in the microarray (Figure 1). The opposite is true for just 23 persons (6%). The number of people that test negative in the HI but positive in the microarray increases with age (p-value ,0?001, tested with a logistic regression) and does not appear to be affected by seasonal vaccinations (p-value = 0?49).
In young children (5-9 years) there is a perfect distinction between persons that were likely infected, and those that remained susceptible. In fact, in the pre-pandemic study there are no young  (Table 2), yielding a clear bimodal distribution of antibody titers in the post-pandemic study (Figure 2). A bimodal distribution is also apparent in older children (10-19 years) and younger adults (20-44 years) in the post-pandemic study, albeit less pronounced. In older adults (45-64 years), the bimodality of the distribution of antibody titers in the post-pandemic study has disappeared. Subtracting post-and pre-pandemic prevalences yield rough estimates for the age-specific infection attack rates, suggesting that infection attack rates are high in young children (64%) and low in older adults (,4%)( Table 2). Formal analyses using mixture models yield comparable estimates (Table 3). Above the age of 20, the attack rates decrease less sharply in the bivariate model, as the bivariate model is better able to identify infected persons (see below). Figure 3 shows the bivariate microarray data (dots), the fitted bivariate immune and susceptible component distributions (contours), and the regions of high estimated infection probability (shaded areas). There is a positive correlation between the test results for A/2009 and A/1918, as would be expected. Further, the infected component distribution is located at modestly higher A/2009 titers than the immune component distribution, and the A/2009 antibody titer alone appears to be insufficient to separate infected persons from those with pre-existing responses (the immune component). In fact, the main difference between the infected and immune component distributions is that the former is located below the latter in the A/2009-A/1918 plane. In other words, a person with a certain A/2009 antibody titer likely has some pre-existing immunity if it also has a high A/1918 antibody titer; if it has a low A/1918 titer, it is more likely that the person has been infected by A/2009 virus.  (Table 3). The blue dots indicate that multiple samples are negative to both A/1918 and A/2009 (age-specific numbers in the prepandemic survey: 15, 20, 16, 19; post-pandemic survey: 9, 19, 19, 17). Grey areas indicate the regions with high probabilty that a post-pandemic sample is infected with A/2009. The susceptible component is placed largely placed beneath the detection limit and is not displayed. doi:10.1371/journal.pone.0113021.g003 We further evaluated the diagnostic characteristics of the microarray by analysing classification of post-pandemic sera. In general, classification is most precise in the bivariate microarray ( Figure 4). For instance, in young adults (20-44 years) many sera of intermediate to high antibody titers in the A/2009 microarray (160-640 titer) cannot be classified as infected (estimated infection probabilities range from 30-50% with confidence intervals ranging from ,10% to .70%). Inclusion of A/1918 in the analysis strongly improves classification; samples with low A/1918 antibody titer have estimated infection probabilities of .95% with small confidence ranges, and samples with high A/1918 scores have estimated infection probabilities under 20% (confidence limits range from 20%-40%).
True infection statuses are unknown in the post-pandemic survey, but we can safely assume that pre-pandemic samples do not belong to persons who have been infected with A/2009. We exploit this fact to investigate how many pre-pandemic samples would be misclassified as infected. Each sample in the prepandemic survey has a certain estimated infection probability, and we report the expected number of misclassifications i.e. the infection probabilities cumulated over all positive pre-pandemic samples. The bivariate microarray yields the lowest percentage of misclassifications (8?6 out of 64; 13%), followed by the univariate microarray (18?1 out of 64; 28%), and the HI analysis (16?1 out of 38; 42%).
Overall comparison of classifications is investigated in a receiver operating characteristic (ROC) diagram, taking different cut-off values for positive classification (HI and univariate microarray), or taking different values of the A/1918 to A/2009 ratio for positive classification (bivariate microarray) ( Figure 5). For HI, maximum sensitivity plus specificity are at a cutoff of 44, with sensitivity and specificity of 66% and 51%. The univariate microarray scores higher with sensitivity and specificity of 91% and 84%, at a microarray titer cutoff of 97. The bivariate mixture scores even higher with sensitivity and specificity of 96% and 95%, at a microarray titer ratio of 0?95 (A/1981 to A/2009).

Discussion
Using mixture model analyses of two population-based serological studies [9], we have shown that classification of sera for infection with influenza (A/2009 H1N1) is possible using a recently developed protein (HA1) microarray. Sensitivity and specificity are high in the univariate as well as the bivariate model. In the microarray, misclassification of pre-pandemic samples as infected occurs infrequently, and estimates of infection attack rates are comparable to published figures, with comparable precision even though our sample size is much smaller than in earlier studies [9,16].
Our analyses have uncovered that classification of sera belonging to persons infected with A/2009 (H1N1) works best  Figure 5). When using a more distantly related A/2007 antigen in combination with A/2009 in the bivariate mixture, classification of samples is not improved, the reason being that there is little cross-reactivity between A/2009 and A/2007 antigens (results not shown). Hence, our analyses suggest that the use of additional data works best when using a secondary antigen that is closely related to the focal virus, so that a distinction can be made between the specific responses that are the result of infection, and the correlated but less specific responses that result from earlier infections with other viruses. Whether such combinations of viruses that are antigenically related but not almost identical are available for other subtypes, e.g., H3N2 remains to be investigated.
The microarray measures antibody binding and the observed antibody responses are not necessarily protective. It is known, however, that positive responses in the microarray correlate with protection against infection [5,6]. Furthermore, the microarray analyses are broadly consistent with the analyses based on HI, with the fraction of persons with pre-existing responses increase strongly with age.
In our analyses the estimated susceptible component is placed largely below the detection limit in the HI and microarray analyses, while the immune component still has substantial density below the detection limit (Figures 2-3). This suggests that it may not always be easy to distinguish susceptible persons from those having been exposed before. One question for future studies is whether classification of persons as being susceptible, immune, or infected can be improved by extending the analyses to more than two antigens, or by using larger datasets.
Throughout, we have assumed that the susceptible, immune, and infected component distributions are independent of age. This is done for simplicity and since allowing for age-dependence in the component distributions would lead to identifiability problems, especially in older adults. As it is, the fit of the infected component distribution is strongly informed by children. However, visual inspection of the locations of the pre-and post-pandemic samples across all age groups in the A/2009-A/1918 plane shows that most lie within the regions of high support of the model, i.e. there are very few outliers. This indicates that the model and the fitted mixture model describes the data well, not only in children but also in older age groups.

Supporting Information
Table S1 Number of samples in the earlier survey (see main text), and the subset that has been tested with the microarray. Not eligble for selection were pre-pandemic samples collected after October 11 (42 samples), and post-pandemic samples from persons. (DOCX)