Serum IgE Reactivity Profiling in an Asthma Affected Cohort

Background: Epidemiological evidence indicates that atopic asthma correlates with high serum IgE levels, though the contribution of allergen-specific IgE to the pathogenesis and severity of the disease is still unclear. Methods: We developed a microarray immunoassay containing 103 allergens to study the IgE reactivity profiles of 485 asthmatic and 342 non-asthmatic individuals belonging to families whose members have a documented history of asthma and atopy. We employed k-means clustering to investigate whether a particular IgE reactivity profile correlated with asthma and other atopic conditions such as rhinitis, conjunctivitis and eczema. Results: Both case-control and parent-to-siblings analyses demonstrated that, while the presence of specific IgE against individual allergens correlated poorly with pathological conditions, particular reactivity profiles were significantly associated with asthma (p < 10E-09). An artificial neural network (ANN)-based algorithm, calibrated with the profile reactivity data, correctly classified 78% of the individuals examined as asthmatic or non-asthmatic. Multivariate statistical analysis demonstrated that the familial relationships within the study population did not affect the observed correlations. Conclusions: These findings indicate that asthma is a higher-order phenomenon related to patterns of IgE reactivity rather than to single antibody reactions. This notion sheds new light on the pathogenesis of the disease and can be readily employed to distinguish asthmatic and non-asthmatic individuals on the basis of their serum reactivity profile.


Introduction
Asthma is one of the most common diseases affecting both adults and children and accounts for up to 300 million cases worldwide [1]. Worryingly, its frequency has increased annually during the last five decades [2,3]. Genetic factors (cytokines and immune response genes) [4,5] as well as developmental and environmental factors (viral infections [6], allergens [7] and occupational exposures [8]) have been associated with asthma susceptibility, age of onset and severity. Although the pathogenesis of the disease has not yet been fully elucidated, a major risk factor is the development of immune responses to foreign antigens that are characterized by the production of antigen-specific IgE [9]. This notion was first inferred from observations showing that the prevalence of asthma was closely related to the serum IgE level standardized for age and sex [10]. Overwhelming evidence has confirmed the role of IgE in atopic asthma, while several studies have also revealed a link between IgE and non-atopic asthma [11]. More controversial is the role of antigen-specific IgE in determining the onset and severity of the disease. Several studies have revealed strong relationships among exposure to house dust mite (HDM), the presence of serum IgE directed against the mite allergens, and asthma [12]. However, a large number of individuals worldwide, particularly those living in some regions of the USA and Scandinavia, have low lifetime exposure to mite antigens but do not show any decrease in the prevalence and severity of asthma [13]. Therefore, other antigens, either alone or in combination, ought to have the ability to elicit an IgE response and play a role in the pathogenesis of the disease. Indeed, the links among antigen exposure, IgE production, and occurrence and/or severity of asthma seem to involve an unexpected number of factors, and a nonlinear relationship between exposure and response appears to exist [14].
To date, studies of the association between specific IgE and asthma have focused on analyzing either one or a few antigens at a time, for example those describing the role of HDM [15,16,17,18]. The disproportion between the repertoire of known allergens and the number of antigens that have been analyzed may well explain the difficulties encountered in establishing the role of specific IgE in the pathogenesis of asthma.
We generated a microarray containing a vast repertoire of allergens (103) that forms the substrate of an antibody-capture assay to investigate the IgE reactivity profiles of 872 individuals belonging to families with a documented history and diagnosis of asthma and atopic diseases. We then searched for associations between IgE reactivity profiles and atopic diseases, including asthma, rhinitis, conjunctivitis and eczema, in a case-control and parent-to-siblings study. Multivariate analysis was carried out to assess the effect of family relationships on the statistical analysis. The IgE reactivity profiles were used to develop and validate an artificial neural network classifier capable of distinguishing asthmatic and non-asthmatic individuals with high accuracy.

Population case study
The sample consisted of a total of 872 sera, including 442 parents and their progeny (430 individuals) (Table 1). Within the study group, 428 children and 57 parents (55.62% of the total) were diagnosed with asthma, and 342 parents (39.22% of the total) were classified as non-asthmatic, though some of them suffered from atopy-related disorders such as rhinitis, conjunctivitis and eczema. The remaining 5.16% were classified as having an undefined asthma diagnosis. Atopic asthmatic sibling pairs (sibs) and trios were collected over a period of 4 years, mainly from pediatric and pneumological centers. All patients were of Sardinian origin for at least 3 generations and their age at visit was above 6 years, to avoid subjects with transient symptoms. At the recruitment sessions, each subject was interviewed, disease status was ascertained by physical examination, permission was requested to access personal health records, and blood samples were collected. Each participant signed an informed consent form approved by the local ethics committee (Azienda Sanitaria Locale number 8 protocol 24/Comitato Etico/02, authorization number 4737). Asthma was diagnosed by a pulmonary physician, in accordance with the American Thoracic Society criteria [19]. Pulmonary function was evaluated by spirometry. A physician administered a questionnaire collecting clinical history and classifying asthma severity in four levels according to the World Health Organization guidelines (Global Initiative for Asthma). The use of asthma drugs and any other medication was recorded. Atopy was detected by positive skin testing to common inhalant allergens by standard methods. Patients with a history of early asthma onset were interviewed by a physician about the persistence of asthma symptoms after the completion of puberty (18 years).

Development of IgE microarray immunoassay
The serum IgE reactivity was analyzed using a fluorescence immunoassay that incorporates as a substrate a microarray of 103 allergens (Table S1), including 95 extracts and 8 recombinant proteins representative of 11 distinct allergen classes chosen amongst those most frequently associated with atopic diseases in Southern-Central Europe. Comparisons between ELISA and the microarray assay for 6 common allergens revealed that the overall diagnostic performance of the microarray immunoassay, as defined by clinical sensitivity and specificity, was very good [20]. To generate the array, the allergens were printed in duplicate onto aldehyde-activated glass microscope slides at randomised positions in the array. The immunoassay procedure consisted of four phases: printing, processing, scanning, and quantification and analysis. Two chips of 103 allergens each were printed onto each slide using high-speed robotics (Microgrid Compact; Biorobotics). Allergens (Allergopharma) were spotted onto the arrays in the following spotting buffers: PBS pH 7.4, glycine pH 2.4, borate pH 9.4, glycerol 10%, DTT 5 mM, SDS (0.2%; 0.05%), Tween 20 (0.01%), at a spotting concentration ranging from 0.008 to 3 mg/ml. Bound IgE were revealed by incubating the slides first with the serum sample (100 µl, 60 minutes at 37 °C), then with a secondary mouse monoclonal antibody directed against human IgE (100 µl, 45 minutes at 37 °C), then with an anti-mouse IgG HRP-conjugated antibody (100 µl, 45 minutes at 37 °C), and finally with tyramide-Alexa 555 (100 µl, 15 minutes at 37 °C). The ScanArray™ software provided by Perkin Elmer Life Sciences Inc. was used to scan the slides and to acquire the fluorescence. After subtracting the background signal, the concentrations (IU/ml) of allergen-bound IgE were determined by interpolating the signal with an internal calibration curve printed onto each microarray.
To assign IU/ml values to the calibration curve, we used an external reference curve generated by microarray slides printed with replicates of goat anti-human IgE and incubated with increasing concentrations of human IgE (WHO Reference standard 0.35, 1.0, 3.5, 10.0, 50.0 IU/ml). The signal collected from the allergens was interpolated with the calibration curve to obtain the IU/ml value, and translated into a Class Score by plotting the data on a standard reactivity scale. Class Score values: CLASS 0 (less than 0.35 IU/ml); CLASS 1 (0.35-0.7 IU/ml); CLASS 2 (0.71-3.5 IU/ml); CLASS 3 (3.51-17.5 IU/ml); CLASS 4 (17.51-50 IU/ml); CLASS 5 (50.01-100 IU/ml).
The microarray data have been submitted to the Gene Expression Omnibus (GEO, NCBI), accession number GSE20020; platform GPL9968, ''Allergochip for asthma diagnosis''. A detailed description of the array procedure is provided in the supporting information (Text S1).
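The class-score scale above can be expressed as a small lookup. The following is a minimal sketch, not the authors' software: the function name `class_score` is hypothetical, values falling between two published bounds (for example 0.705 IU/ml) are assigned to the higher class, and readings above 100 IU/ml are capped at class 5, all of which are assumptions.

```python
def class_score(iu_per_ml: float) -> int:
    """Map a specific-IgE concentration (IU/ml) to the 0-5 class score.

    Assumptions (not stated in the text): values above 100 IU/ml are
    capped at class 5, and values between two listed bounds go to the
    higher class.
    """
    if iu_per_ml < 0:
        raise ValueError("concentration cannot be negative")
    if iu_per_ml < 0.35:
        return 0
    # Inclusive upper bounds of classes 1 to 4, taken from the scale above.
    for score, upper in enumerate((0.7, 3.5, 17.5, 50.0), start=1):
        if iu_per_ml <= upper:
            return score
    return 5


if __name__ == "__main__":
    for value in (0.1, 0.35, 2.0, 10.0, 40.0, 75.0):
        print(value, "IU/ml -> class", class_score(value))
```

Applying this function to each of the 103 allergen readings of a serum yields the class-score vector that is used as that serum's reactivity profile.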

Statistical analysis
Reactivity data from each serum were quantified in IU/ml, transformed into class scores and encoded as 103-dimensional vectors (1 dimension for each allergen) using Cluster 3.0 [21] to generate individual profiles. We used k-means clustering to group sera into distinct clusters on the basis of similarities in their reactivity profiles, while MapleTree [22] was used for visualizing the clustering results. Clustering is a statistical technique for collecting objects into a fixed number of groups (clusters) so that each group contains only similar objects. In this work we used k-means clustering, a clustering technique where the target number of clusters (k) is user-defined. In clustering, similarity is determined by the attributes associated with each object. In this case, each object is a serum, and its attributes are the reactivity data derived from 103 allergens; k-means clustering was therefore aimed at grouping 872 sera into k clusters, so that each cluster would contain only sera with similar reactivity data (derived from their 103 allergens). To assess whether two objects (sera) are similar, which is needed to collect them in the same cluster, their attributes (103 allergen reactivity values) must be compared: this is achieved by defining a similarity metric, i.e. a mathematical formula that combines the attribute values of two objects into a meaningful measure of their similarity. In this work, the Euclidean distance was adopted as the similarity metric, i.e. the square root of the sum of the squared differences of each attribute. The smaller the Euclidean distance, the more similar the two objects. The k-means clustering algorithm works by iteratively assigning each object (serum) to a tentative cluster, and by refining the assignment until the similarity of the objects within each cluster is maximised. In this case, convergence to an optimal solution was achieved within 10,000 iterations of the clustering algorithm.
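The procedure described above can be sketched in a few lines of code. This is an illustrative implementation of standard k-means (Lloyd's algorithm) with the Euclidean distance, not the Cluster 3.0 code used in the study; the function names, the random initialisation from existing profiles and the `seed` parameter are assumptions.

```python
import math
import random


def euclidean(a, b):
    """Square root of the sum of squared attribute differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(profiles, k, max_iter=10_000, seed=0):
    """Group profiles (lists of class scores) into k clusters.

    Returns (labels, centroids). Initial centroids are k randomly chosen
    profiles, which is one common choice and an assumption here.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each serum joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:  # no serum changed cluster: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    toy = [[0, 0, 0], [0, 1, 0], [5, 5, 5], [5, 4, 5]]  # toy class scores
    print(kmeans(toy, k=2)[0])
```

In the study each profile would have 103 entries (one class score per allergen) rather than the three used in this toy example. Like any k-means run, the sketch converges to a local optimum that can depend on the initial centroids.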
The disadvantage of k-means clustering is that the number of clusters (k) is user-defined. Therefore, to determine what value of k would produce optimal results, we first ran multiple clustering analyses with different k values (from 1 to 14 and 20) and then analysed the clustering results using indicators that measure how similar the objects are within each cluster, and how different the clusters are from each other. We used five performance indicators (Silhouette index, Dunn index, Davies-Bouldin index, C-index and Isolation index), as provided by the Machaon software [23], to validate the statistical significance of each partitioning attempt; we determined that k = 3 was the most significant result overall.
We then proceeded to determine whether the clustering results also contained significant information concerning associations with age, sex, and the presence of a pathological condition (asthma, rhinitis, etc.), its persistence and/or severity, age at onset, etc. Statistical tests such as Pearson's χ² test and the Kruskal-Wallis nonparametric test were run within the SPSS software and Excel to investigate whether the frequency of a pathological condition differed significantly among the clusters and whether each cluster differed significantly from the study population taken as a whole. In particular, the χ² test was used for analysing binary attributes (e.g. asthmatic vs. non-asthmatic), while the Kruskal-Wallis test was performed on discrete numeric variables (such as the age of asthma onset).
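As an illustration of one of the five indicators, the Silhouette index can be computed as follows. This is a sketch of the standard definition, not the Machaon implementation; the function name and the conventions for singleton clusters (score 0) are assumptions. The index is undefined for a single cluster, so it can only compare partitions with k of at least 2.

```python
import math


def silhouette_index(profiles, labels):
    """Mean silhouette width of a partition (standard definition).

    For each object: a = mean distance to the other members of its own
    cluster, b = smallest mean distance to the members of another
    cluster, s = (b - a) / max(a, b). Values near 1 indicate compact,
    well-separated clusters.
    """
    clusters = {}
    for p, lab in zip(profiles, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")
    scores = []
    for p, lab in zip(profiles, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # The distance from p to itself is 0, so divide by len(own) - 1.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    toy = [[0, 0], [0, 1], [10, 10], [10, 11]]
    print(silhouette_index(toy, [0, 0, 1, 1]))  # compact partition
    print(silhouette_index(toy, [0, 1, 0, 1]))  # poor partition
```

Running such an index over the partitions obtained for each candidate k, and preferring the k with the highest score, is one way to carry out the comparison described above; the study combined five indicators rather than relying on one.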
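To make the χ² comparison concrete, the sketch below computes Pearson's statistic for a contingency table of clusters (rows) against a binary condition (columns). It is not the SPSS procedure used in the study; the function names are hypothetical, the counts in the example are invented for illustration only, and the closed-form p-value is implemented only for 1 or 2 degrees of freedom (the cases that arise for a 2x2 table and for three clusters against a binary trait).

```python
import math


def pearson_chi2(table):
    """Return (statistic, degrees of freedom) for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi2_p_value(stat, dof):
    """Upper-tail p-value; closed forms exist for dof 1 and 2 only."""
    if dof == 1:
        return math.erfc(math.sqrt(stat / 2))
    if dof == 2:
        return math.exp(-stat / 2)
    raise NotImplementedError("this sketch covers only dof = 1 or 2")


if __name__ == "__main__":
    # Invented counts: rows = clusters, columns = (asthmatic, non-asthmatic).
    counts = [[40, 60], [85, 15], [50, 50]]
    stat, dof = pearson_chi2(counts)
    print(stat, dof, chi2_p_value(stat, dof))
```

For tables with more degrees of freedom, a statistics library would normally be used instead of these closed forms.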

Artificial neural network asthma classifier
A radial basis function (RBF) artificial neural network (ANN) [24] was adopted to implement the asthma classifier. The classifier was developed with the dedicated ANN module of a professional software application (Radial Basis Function algorithm, SPSS 17.0). The neural network analysis used as input a sample data set that includes 51 allergens and the serum reactivity profiles of 827 individuals. Within the sample, 485 individuals are asthma positive and 342 are asthma negative. To evaluate the actual improvement achieved by reducing the number of allergens considered by the classifier (by means of the Mann-Whitney test), a separate RBF network was trained on the complete set of allergens, and its results were compared with those of the RBF network operating on the filtered set. The sample was first randomized, and the training sample comprised about 60% of the entire population, while the remaining individuals were left for validation purposes (testing 10% and holdout 30%). The training subset was selected using a randomization criterion that ensures the representativeness of the sample with respect to the entire population, so that the RBF network replicates a behaviour that is representative of the whole population. The whole supervised training process was repeated 10 times, each time on a new, previously untrained network, and each time with a new randomized subset of the original population. This makes it possible to observe any variation in performance that may be linked to variability in the representativeness of the training sample. The RBF network has three layers (input, RBF and output layer). There are many types of radial basis functions; we used the Normalized RBF (NRBF). The overall efficiency of the network is given as the mean of the 10 different trials.
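The evaluation protocol (a random 60/10/30 split, repeated 10 times with a fresh model, and averaged) can be sketched independently of the network itself. The code below is a schematic under stated assumptions: the function names are hypothetical, the split is a plain shuffle rather than the representativeness-preserving randomization of the study, the 10% testing subset is set aside but not used, and `fit_majority` is a trivial placeholder standing in for the SPSS RBF network, which is not reproduced here.

```python
import random
from statistics import mean


def split_60_10_30(n, rng):
    """Shuffle indices 0..n-1 and cut them into train/test/holdout."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(0.6 * n)
    n_test = round(0.1 * n)
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]


def mean_holdout_accuracy(profiles, labels, fit, n_runs=10, seed=0):
    """Average holdout accuracy over n_runs independent splits.

    `fit(train_profiles, train_labels)` must return a fresh predict
    function, so that every run starts from an untrained model.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_runs):
        train, _test, holdout = split_60_10_30(len(profiles), rng)
        predict = fit([profiles[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(profiles[i]) == labels[i] for i in holdout)
        accuracies.append(hits / len(holdout))
    return mean(accuracies)


def fit_majority(train_profiles, train_labels):
    """Placeholder model: always predicts the most frequent training label."""
    majority = max(set(train_labels), key=train_labels.count)
    return lambda profile: majority


if __name__ == "__main__":
    # Toy data with the same case/control counts as the study sample.
    labels = [1] * 485 + [0] * 342
    profiles = [[lab] for lab in labels]
    print(mean_holdout_accuracy(profiles, labels, fit_majority))
```

A majority-class baseline of this kind is also a useful reference point: with 485 cases among 827 individuals, always answering "asthmatic" is already right for roughly 59% of the sample, so a classifier's accuracy is best read against that figure.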

Generalized Estimating Equations (GEE) analysis
The results were processed with Generalized Estimating Equations (GEEs) [25] to test for associations between asthma and IgE reactivity while taking into account the familial relationships between asthmatic and non-asthmatic individuals. GEEs are a semi-parametric regression technique useful for fitting the parameters of a model where unknown correlation between observations may be present. The GEE probit analysis is similar to probit regression models (appropriate for a dichotomous dependent variable and a set of explanatory variables), but it also takes into account relationships amongst members of the same cluster, in this case parent-to-sibling relationships. In this study, a ''working'' correlation matrix for the clusters, which models the dependence of each observation on other observations in the same cluster, was specified. The familial correlation was modelled assuming an exchangeable structure for the working correlation matrix, i.e. all measurements in the same cluster are equally correlated. GEE models are robust to misspecification of the correlation structure. Additionally, we selected robust standard errors (sandwich estimators, as opposed to conventional standard errors), which allow the estimates to remain valid even if the correlation structure is misspecified.
The association between asthma and specific IgE against single allergens was first studied by GEE univariate analyses, to determine possible significant explanatory variables to be included in the model runs. Multicollinearity among potential explanatory variables was investigated using regression diagnostics to ensure that no model-burdening correlations existed between variables. In these analyses, specific IgE against single allergens were regarded as factors with 4 levels, since class score 5, which had a low frequency, was merged into class score 4. Finally, in order to identify variables predisposing independently to asthma, we fitted a multivariate logistic regression model for those variables found in the univariate analysis to have P < 0.05, using a forward modelling strategy; the corresponding odds ratio, confidence interval and statistical significance were calculated for each class score of specific IgE in the model. Age, gender and the interaction between these variables were included as covariates. The GEE was run in SPSS v16.0.

Results
We investigated the IgE serum reactivity of 872 individuals belonging to 283 Sardinian families in which all the progeny (1 to 3 siblings) were affected by atopic asthma (Table 1). The individuals enrolled in this study included the two parents and their offspring, mostly children below the age of 27 (75%). The rationale for using this particular cohort originates from the notion that asthma has both genetic and environmental components that are not yet well defined. A study sample comprising families (parents-children) from a genetically isolated population like the Sardinians offers a unique opportunity to minimize possible sources of genetic heterogeneity and to reduce differences in allergen exposure. The immunoassay was calibrated (using an internal standard curve) to measure the amount of specific IgE binding to each of the arrayed antigens (Table S1), ranging from 0.35 IU/ml to 100 IU/ml. The reactivity values in IU/ml were converted into class scores using a validated 0-5 scale [20]. This approach generated 872 distinct IgE reactivity profiles and in excess of 90,000 antibody-antigen determinations. A color-coded digital profile (from black to increasing intensity of red) matching the IgE class score (0 to 5) against the arrayed allergens was generated for each serum (Figure 1).
We employed k-means clustering [26], a partitioning method commonly used to identify group structure within microarray data, to investigate the structure of the reactivity profiles. A substantial body of literature addresses clustering as a valid statistical method to study asthma [15]. The clustering algorithm was run with different values of k (from 3 to 14 and 20) to split the profiles into different groups. Statistical analysis computed with the Machaon software [22] showed that k = 3 is the partition value that, by arranging the profiles into three clusters, maximizes intra-cluster similarities and inter-cluster differences (Table S2). To validate the capability of the clustering analysis to separate asthmatic and non-asthmatic reactivity profiles, we assessed the frequency of asthmatic and non-asthmatic individuals in the three clusters in a sample consisting of 114 individuals chosen amongst the parents and equally divided between asthmatic (cases) and non-asthmatic (controls) individuals. Furthermore, cases and controls were selected so that matched pairs could be formed in terms of age and sex (Table S3). This analysis showed that the distribution of cases and controls differed significantly in two out of three clusters. While cluster 2 did not show differences in the number of cases and controls, clusters 0 and 1 were enriched with non-asthmatic and asthmatic individuals respectively (p < 0.01, Pearson's χ² test), thus indicating that cases and controls had distinct reactivity profiles against the arrayed allergens (Table S4). We then investigated how members of the entire cohort of 283 parent-siblings families were distributed in clusters 0, 1 and 2, stratified according to the presence of atopic diseases such as asthma, conjunctivitis, eczema and rhinitis, and other traits (age, sex, disease persistence and severity).
This analysis indicated that the three clusters were significantly different in terms of frequency of asthma (p = 2.95E-12), conjunctivitis (p = 6.81E-10), eczema (p = 1.10E-03), rhinitis (p = 2.35E-07) and sex (Table 2). In agreement with the case-control analysis, cluster 1 showed a markedly higher proportion of asthmatic individuals (83%) compared to the other two clusters (Figure 1A), and also with respect to the entire study sample (χ² = 33.480, p = 7.19E-09) (Figure 1B). Similarly significant associations with cluster 1 could also be observed for conjunctivitis and rhinitis. The partitioning of the profiles also highlighted an unequal distribution of the family nuclei (Table S5). While cluster 1 contained a significantly high proportion of affected children but very few parents, clusters 0 and 2 were enriched for members of the same families and showed a similar percentage of asthmatic and non-asthmatic individuals. We reasoned that family members segregating in these two clusters showed a similar IgE recognition profile irrespective of asthma, possibly because of the common exposure to particular sets of the arrayed allergens. The array was designed without a detailed knowledge of the exposure to allergens of the study population and without any a priori assumption on the role of particular allergens in eliciting an IgE response associated with asthma. It is therefore not surprising that some allergens are rarely recognized while others show similar percentages of reactivity in asthmatic and non-asthmatic individuals. The reactivity against these allergens contributes to the formation of the profile and could represent a source of ''background noise'' that masks relevant associations with the asthma status.
To address this problem, we generated new profiles using only the allergens that individually showed some difference in IgE reactivity between asthmatic and non-asthmatic individuals, using the Mann-Whitney U test at a threshold of p < 0.05. This analysis generated a list of 51 relevant allergens (Table S6) that was employed to generate new profiles and perform clustering association analysis at k = 3. The new clusters (clusters 3, 4 and 5) revealed a striking increase in the statistical significance of the distribution of asthma amongst the clusters when examining the case-control groups (p = 3.26E-3) (data not shown) and the entire parent-siblings cohort (p = 3.37E-52), as well as in the other examined atopic conditions (Table S7). Cluster 4 showed high similarity to cluster 1 in terms of both profile structure and composition, containing a significantly high proportion of asthmatic sera (χ² = 35.145, p = 3.06E-09) (Figure 2). The other two clusters differed substantially from those generated with the complete set of allergens. Cluster 5 was significantly enriched with the reactivity profiles of most of the asthmatic individuals not included in cluster 4 (χ² = 22.958, p = 1.65E-06), whereas cluster 3 contained most of the non-asthmatic individuals (χ² = 31.172, p = 2.36E-08). A very strong association with both conjunctivitis and rhinitis could also be observed in cluster 4 compared to the other clusters. Notably, the clusters generated with the filtered set of allergens did not show a significant cosegregation of family members (Table S8). Clusters 4 and 5 (containing nearly all the asthmatic individuals) shared some common features in their IgE reactivity profile (Figure S1). Twenty out of the 51 allergens (mainly inhalant) were recognized by the sera of the two clusters, but those of cluster 4 also reacted against nine allergens mainly derived from food and grass (allergens 19-23 and 27-30 of Table S6).
Notably, cluster 4 showed a higher proportion of individuals with a diagnosis of severe asthma (severity classes 3 and 4) compared to all other clusters (Table S7) and to the study population sample (Figure 2B).
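The allergen-filtering step can be illustrated with a self-contained Mann-Whitney U test. This is a sketch, not the software used in the study: the function names are hypothetical, the p-value uses the large-sample normal approximation with tie correction and no continuity correction (a reasonable choice for class scores, which are heavily tied, but an assumption), and the toy data are invented.

```python
import math


def mann_whitney_p(x, y):
    """Two-sided p-value of the Mann-Whitney U test.

    Normal approximation with tie correction, no continuity correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        t = j - i                  # size of this group of tied values
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:  # all values identical: no evidence of a difference
        return 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


def select_allergens(asthmatic, controls, alpha=0.05):
    """Indices of allergens whose class scores differ between groups."""
    keep = []
    for a in range(len(asthmatic[0])):
        p = mann_whitney_p([s[a] for s in asthmatic], [s[a] for s in controls])
        if p < alpha:
            keep.append(a)
    return keep


if __name__ == "__main__":
    # Invented profiles: two allergens per serum, class scores 0-5.
    cases = [[3, 0], [4, 1], [5, 0], [3, 1], [4, 0], [5, 1]]
    ctrls = [[0, 0], [1, 1], [0, 0], [1, 1], [0, 0], [1, 1]]
    print(select_allergens(cases, ctrls))
```

Because each allergen is tested separately at p < 0.05 with no adjustment for multiple comparisons, a filter of this kind is best seen as a noise-reduction step rather than as a list of confirmed associations.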
The unusually strong association linking some IgE reactivity profiles to asthma prompted us to generate an artificial neural network (ANN) classifier designed to discriminate between asthmatic and non-asthmatic individuals on the basis of the serum reactivity profiles. Each profile used in the supervised training contained information concerning the reaction values for the 51 filtered allergens and the asthma status of the individual. The ANN correctly classified 82% of the asthmatic patients as ''asthmatic'' and about 72% of the non-asthmatic individuals as ''non-asthmatic'' (Figure 3). The overall performance of the ANN was consistent with the results obtained by cluster analysis: the average percentage of asthmatic patients correctly recognized by the RBF classifier as asthmatic is nearly identical to the combined percentage of the asthmatic patients present in clusters 4 and 5. To assess the performance of the ANN-based approach with respect to different classification solutions, we utilized binary logistic regression (BLR). We observed no discrepancy in classification outcome between the two models (data not shown). Finally, we employed a GEE model [27] to investigate whether the presence of family nuclei in the study groups had introduced a bias in the composition of the profiles or had led to an underestimation of the standard errors of the analysis. When taking into account the parent-to-sibling relationships, GEE allows the measurement of population-averaged effects as opposed to cluster-specific effects. We used the GEE univariate analyses to determine the association between asthma and IgE reactivity against individual allergens (Table S9). This approach generated a list of 43 relevant allergens with p values < 0.05 that coincided substantially with those generated with the Mann-Whitney analysis (forty out of the 43 allergens were identical; 93.02%).
To unravel associations between distinct reactivity profiles and asthma, a forward multivariate GEE analysis was applied. In this analysis seven allergens (Table 3) were independent predictors of asthma status, while controlling for age, sex and the interaction between age and sex as confounding variables. The final GEE model, factoring in the effect of parent-to-sibling relationships, distinguished asthmatic and non-asthmatic individuals with high accuracy, based on their IgE serum reactivity against a reduced number of relevant allergens. The model correctly classified 88.8% of the asthmatic patients as ''asthmatic'' and 90.9% of the non-asthmatic individuals as ''non-asthmatic''. Six out of the seven allergens are in common with the 51 relevant allergens generated with the Mann-Whitney procedure. To assess the performance of the GEE-based approach, the reactivity profiles against the 7 relevant allergens were utilized to perform clustering association analysis at k = 3 (Figure 4). The analysis of the new clusters (6, 7 and 8) showed a striking increase in the statistical significance of the distribution of asthma amongst the new clusters (χ² = 192.549, p = 1.54E-42) (Table S10). The clusters generated with the filtered set of 7 allergens did not show a significant cosegregation of family members with respect to those generated with the complete set of allergens (Table S11).

Discussion
A complete understanding of the combination of allergens differentially recognized by asthmatic and non-asthmatic individuals would be immensely beneficial for elucidating how specific IgE contribute to the pathogenesis of the disease. Progress in this area has been slow, however, because traditional immunoassays such as RAST, CAP ELISA, while useful in assessing specific immune responses, allow for the analysis of just one or a few allergens at a time. To overcome these limitations we utilized a microarray immunoassay technology to analyze the IgE reactivity against a vast number of natural and recombinant allergens in the sera of asthmatic and non-asthmatic individuals. This methodology has already been shown to be useful in the serodiagnosis of allergy and infectious diseases, and may soon replace ELISAs in clinical laboratory settings [25,26,27,28]. It has been widely recognized that the analysis of the reactivity profiles not only provides a large amount of quantitative information (the sum of the individual reactivities), but also generates a higher order of knowledge in terms of unique combinations of antigen-antibody reactions that associate with different experimental and pathological conditions [26].
Using k-means clustering, the serum reactivity profiles of all individuals analyzed were arranged in three clusters that differed significantly from each other in terms of the combination of allergens recognized and in the proportion of individuals affected by different atopic diseases, both amongst the case-control groups and the parent-siblings pairs. In particular, asthmatic individuals contributed 83% of the reactivity profiles of cluster 1. This percentage was significantly different (p < 10E-9) from that of the asthmatic individuals in the study population. While cluster 0 and cluster 2 did not show significant differences in the proportion of asthmatic and non-asthmatic individuals, the analysis of their composition demonstrated that, in contrast to cluster 1, they were enriched in members of the same family nuclei. We hypothesized that the composition of clusters 0 and 2 reflected the exposure of family members to a common set of allergens included in the microarray that, though having a powerful sensitizing ability, were not relevant for asthma; it could, in addition, be due to cross-reactivity [29]. Accordingly, we identified amongst the 103 arrayed allergens a subset of 51 allergens that differed most in IgE reactivity between asthmatic and non-asthmatic individuals. These allergens were used to generate new reactivity profiles that were clustered using k-means.
The new clusters showed some interesting and novel features. Cluster 4 was very similar to cluster 1 in terms of IgE recognition pattern and the high proportion of asthmatic individuals. The two remaining clusters also showed highly significant statistical differences in their composition. Cluster 3 was enriched in non-asthmatic individuals (69%) while cluster 5 contained a high percentage of asthmatic patients (82%). The distribution of rhinitis and conjunctivitis (but not eczema) in the clusters closely mirrored that of asthma, in agreement with other observations that have established a link amongst these atopic diseases [30,31]. As anticipated, we did not observe in clusters 3, 4 and 5 any enrichment in members of the same family nuclei, thus indicating that the corresponding profiles are associated with the asthma clinical status rather than with allergen exposure. The two clusters that contained the highest proportion of asthmatic individuals (clusters 4 and 5) reacted against the same subset of 21 allergens, though cluster 4 showed an additional specific reactivity against nine allergens mainly of food and grass origin. Notably, cluster 4 but not cluster 5 showed an association with asthma severity, thus revealing an unsuspected link between disease severity on the one hand and the complexity and specificity of the IgE response on the other.
This result is consistent with the notion that quantification of IgE antibodies may serve as a marker of severity of asthma [32] and that not only IgE antibodies levels but also the number of allergens reacting positively when tested are needed for the correct classification of disease [33]. GEE statistical analysis showed that the presence of family relationships amongst the individuals enrolled in this study did not account neither for an unequal partitioning nor for an underestimation of standard errors. The GEE multivariate analysis could distinguish asthmatic and non-asthmatic individuals with high accuracy, based on their of the IgE serum reactivity. In this analysis seven allergens were independent predictors of asthma status. Association studies performed on clusters using 7 relevant allergens revealed that the GEE-based approach and the k-means analysis based on the Mann-Whitney selection of allergens gave the same results. Timothy grass is a highly prevalent grass worldwide and the prevalence of allergen-specific IgE resulted 8.1-34.6% in the European Community Respiratory Health Survey [34]. Recent studies showed the efficacy of timothy grass allergy immunotherapy tablet treatment in improvement in asthma symptoms both in North American adults and children [35,36].
The last two relevant allergens were alpha amylase and kiwi. Alpha amylase is generally associated with baker's asthma, the most common occupational respiratory disease, which is caused by occupational exposure to flour antigens. Interestingly, a potential association between respiratory allergy to cereal flour and allergy to kiwi fruit has recently been reported. Cross-reactive carbohydrate determinants and thiol-proteases homologous to Act d 1 (cysteine protease) are responsible for wheat-kiwi cross-reactivity in some patients [37].
While this study demonstrates a link between specific IgE and asthma, it should be emphasised that a comprehensive understanding of the relationship between asthma and allergen specific IgE would require an exhaustive analysis of reactivity profiles in populations exposed to different sets of allergens. The allergens relevant to such analyses would probably vary depending on both the subject's age and geographical area. In this context a major advantage of microarray immunoassays is that the composition of different sets of allergens can be expanded and improved continuously, facilitating the identification of the most appropriate reactivity data profile for asthma diagnosis and hence favouring preventive medicine and curative therapies both for asthma and allergic diseases.
Our data demonstrate that associations between asthma and IgE antibody responses to single allergens dramatically underestimate the underlying similarities and differences in individual reactivity to the allergen repertoire that may be relevant for understanding the causes, the severity and the progression of the disease. This also explains why associations between antibody responses and disease are hard to identify. In contrast, by analyzing the IgE serum reactivity profiles against a large set of allergens, we could demonstrate that asthmatic and non-asthmatic individuals differ dramatically in terms of number and class of recognised allergens. This information was utilised to train an ANN capable of distinguishing asthmatic and non-asthmatic individuals with high accuracy based on their IgE serum reactivity. This work provides a new framework for understanding the role of allergen specific IgE in the pathogenesis of asthma, thus helping to explain the occurrence of acute episodes in the apparent absence of exposure to a single allergen, and will prove invaluable in implementing preventive and therapeutic measures.
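The profile-based reasoning summarised above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the data are synthetic, the six allergens and the asthma labels are invented purely for illustration, and the deterministic farthest-first initialisation is a simplification chosen so that the example is reproducible. The function names (`kmeans`, `asthma_fraction_by_cluster`) are hypothetical. The sketch clusters class-score profiles (0-5) with k-means and then reports, for each cluster, the fraction of asthmatic individuals.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two reactivity profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(profiles, k):
    """Deterministic initialisation: start from the first profile, then
    repeatedly add the profile farthest from the centroids chosen so far."""
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        nxt = max(profiles, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(profiles, k, max_iter=100):
    """Plain Lloyd iterations: assign each profile to its nearest centroid,
    recompute centroids as cluster means, stop when assignments are stable."""
    centroids = farthest_first(profiles, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in profiles
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def asthma_fraction_by_cluster(labels, asthma):
    """Map each cluster to (size, fraction of asthmatic individuals)."""
    result = {}
    for c in sorted(set(labels)):
        flags = [a for a, lab in zip(asthma, labels) if lab == c]
        result[c] = (len(flags), sum(flags) / len(flags))
    return result


# Synthetic, illustrative data only (not taken from the study):
# 10 "broad reactivity" profiles (class scores 4-5 on every allergen) and
# 10 "narrow reactivity" profiles (class scores 0-1).
rng = random.Random(0)
n_allergens = 6
broad = [[rng.randint(4, 5) for _ in range(n_allergens)] for _ in range(10)]
narrow = [[rng.randint(0, 1) for _ in range(n_allergens)] for _ in range(10)]
profiles = broad + narrow
# Invented asthma labels (1 = asthmatic): 8 of 10 broad, 3 of 10 narrow.
asthma = [1] * 8 + [0] * 2 + [1] * 3 + [0] * 7

labels, centroids = kmeans(profiles, k=2)
composition = asthma_fraction_by_cluster(labels, asthma)
for cluster, (size, fraction) in composition.items():
    print(f"cluster {cluster}: n={size}, asthmatic fraction={fraction:.2f}")
```

In a real analysis the cluster composition would then be tested for significance, and family structure would need to be accounted for, as was done here with the GEE analysis.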

Supporting Information
Text S1 Development of IgE microarray immunoassay. (DOC)
Figure S1 Visual representation of the reactivity patterns of clusters 3, 4 and 5. The numbers on the x axis correspond to the asthma relevant allergens (listed in Table S6); the y axis shows the corresponding class score of serum reactivity (0-5) in terms of the mean value (red circles), the interquartile range (IQR; blue vertical lines) and the medians (horizontal lines of the IQR). (DOC)