Discriminating severe seasonal allergic rhinitis. Results from a large nation-wide database

Allergic rhinitis (AR) is a chronic disease affecting a large proportion of the population. To optimize treatment and disease management, it is crucial to detect patients suffering from severe forms. Several tools have been used to classify patients according to severity: standardized questionnaires, visual analogue scales (VAS) and cluster analysis. The aim of this study was to evaluate the best method to stratify patients suffering from seasonal AR and to propose cut-offs to identify severe forms of the disease. In a multicenter French study (PollinAir), patients suffering from seasonal AR were assessed by a physician, who completed a 17-item questionnaire, and the patients themselves completed a self-assessment VAS. Five methods were evaluated to stratify patients according to AR severity: k-means clustering, agglomerative hierarchical clustering, Allergic Rhinitis Physician Score (ARPhyS), total symptoms score (TSS-17), and VAS. Fisher linear and quadratic discriminant analysis and non-parametric kernel density estimation methods were used to evaluate misclassification of the patients, and cross-validation was used to assess the validity of each scale. A total of 28,109 patients were categorized as “mild”, “moderate”, or “severe” by each of the 5 methods. The best discrimination was offered by the ARPhyS scale. With the ARPhyS scale, cut-offs at a score of 8–9 for mild to moderate and of 11–12 for moderate to severe symptoms were found. Score reliability was acceptable for the ARPhyS scale (Cronbach’s α coefficient: 0.626) and excellent for the TSS-17 (0.864). The ARPhyS scale seems to be the best method to target patients with severe seasonal AR. In the present study, we highlighted optimal discrimination cut-offs. This tool could be implemented in daily practice to identify severe patients who need a specialized intervention.


Introduction
Allergic rhinitis (AR) affects up to 50% of some populations, especially in "westernized" countries; in France, its prevalence has tripled over the last 25 years and is now around 31% [1,2]. Current guidelines categorize patients by differentiating between those suffering from "mild" symptoms and those with "moderate to severe" forms of rhinitis [3][4][5]. There is no unanimous way to define patients as simply "severe", even though it seems important to single out this specific group because of the considerable burden associated with these patients, in terms of increased morbidity and therefore direct healthcare and indirect socio-economic costs [6,7]. ARIA guidelines consider the duration and types of symptoms reported to the physician, and the new allergy diary app for smartphones asks patients how they feel, through a visual analogue scale (VAS) [4,8]. Therefore, while the current classification is based on the physician's appraisal of the disease as presented by the patient, it also highlights the importance of how the patient feels, regardless of the items proposed by the guideline. The real goal, both in the "classical" ARIA classification and in the ARIA app for patients' self-evaluation, is to assess the control of the disease, regardless of its severity. In fact, AR control implies that patients do not present bothersome symptoms when exposed to allergens, while severe forms characterize patients who are not able to control their symptoms even if an appropriate high-dose treatment is prescribed and their compliance is good.
Several scores have previously been validated to assess AR control, such as the CARAT, the RCAT, the ARCT, and the VAS (both as a pencil-and-paper tool and on smartphones) [2,9]. At the same time, some scores, including symptom scores and VAS, have been tested or even validated to assess severity and categorize patients [10][11][12][13][14]. A few authors, on the other hand, proposed assessing severe AR by grouping patients into clusters and stratifying them according to the severity of their symptoms [15][16][17]. Therefore, in the literature, severe forms may be identified through physicians' questionnaires, self-assessment methods, or the results of published cluster analyses. Besides the question of which tool best stratifies patients, it is debated whether the physician's or the patient's assessment is the better guide to classifying severity.
The aim of the present paper was therefore to assess the best method to stratify patients suffering from seasonal AR and to then propose cut-offs able to determine which patients suffer from severe forms of rhinitis.

Study design
In a multicenter French study, 36,397 adult patients with a previous medical diagnosis of seasonal AR and consulting a physician were included. All patients were consulting a general practitioner, an ENT specialist, an allergist, a dermatologist, or a pulmonologist. A total of 8,143 doctors distributed over the whole French territory participated in the study. The PollinAir study was approved in France in 2005. Approval by an ethics committee was declared not applicable at that time. Instead, the study was approved by the National Committee for Information Management on medical research (Comité Consultatif sur le Traitement de l'Information en Matière de Recherche dans le domaine de la santé) and by the National Commission on informatics and health (CNIL, Commission Nationale Informatique et Liberté). Information was provided to included subjects or to their caregivers through a written document. Informed written consent to participate in the survey was obtained from all patients by the physicians. In 2016, the CNIL confirmed that all data acquired prior to 2016 without the need for prior authorization by an ethics committee could still be used. The survey and its methodology have been described in detail elsewhere [10,18].

Collected data
Each doctor interviewed the patients after confirming the previous medical diagnosis of seasonal AR through a clinical visit, and answered 17 questions for each included patient. Each question focused on one item, and the physician rated every symptom from 0 to 4 on a 5-point Likert scale (0: absent, 1: mild, 2: moderate, 3: severe, and 4: very severe). Evaluated symptoms were: nasal congestion, nasal obstruction, rhinorrhea, nasal itching, sneezing, headache, tiredness, loss of appetite, irritability, lacrimation, eye itching, painful throat, cough, itching throat, earache, alteration of daily activity, and sleep alteration. Other collected data included age, gender, location (rural / urban), disease onset (years before), duration of episode (days), reported history of asthma, conjunctivitis, atopic dermatitis, food allergy, or hives, results of skin-prick tests (SPTs) to respiratory allergens (positive / negative), positivity of serum specific IgE to respiratory allergens (positive / negative), previous or concomitant allergen immunotherapy (yes / no), and region (center, east, north-west, Paris agglomeration, south-east, south-west and west). On the day of the visit, patients completed a Visual Analogue Scale (VAS) on paper, indicating, on a 10-cm line, how severe they believed their rhinitis was ("how bothersome are your allergic rhinitis symptoms?").

Data and statistical analysis
Five approaches to classify patients according to AR severity were assessed:
1. K-means clustering (KMC) [19] was used as unsupervised classification on standardized variables to categorize AR patients. A solution with three clusters was then selected for further analyses.
2. Agglomerative hierarchical clustering (AHC) [20] was used as unsupervised classification on standardized variables to categorize AR patients. A solution with three clusters was then selected for further analyses.
3. Allergic Rhinitis Physician Score (ARPhyS), previously described as "Global Symptomatic Score (GSS-20)" [10,18], was calculated from five physician-assessed symptoms. These symptoms were assessed by each doctor during the interview with the patient. To each nasal symptom (nasal obstruction, rhinorrhea, sneezing and nasal pruritus) and to the ocular symptom (ocular pruritus), doctors attributed a severity score ranging from 0 to 4, as described above. This score could therefore range from a minimum of 0 to a maximum of 20 points. The score was categorized into three terciles.
4. Total Symptoms Score (TSS-17) [21,22], which is the global score resulting from adding up the evaluation of the 17 items rated by physicians for each included patient. This score could therefore range from a minimum of 0 to a maximum of 68 points. The score was categorized into three terciles.
5. Visual Analogue Scale (VAS) [11,12], which is a global self-assessment wellness score reported by each patient, and ranging from 0 (= no discomfort) to 100 (= maximal discomfort). The score was categorized into three terciles.
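As an illustration of the sum scores described above, the following Python sketch adds up 0 to 4 ratings over a set of items and labels a sample of scores by tercile. It is not the study's SAS code: the item names, the function names and the tercile rule (Python's default `statistics.quantiles` interpolation) are assumptions made for this example, since the exact tercile computation is not specified here.

```python
from statistics import quantiles

# Hypothetical item labels for the five ARPhyS symptoms (not database field names).
ARPHYS_ITEMS = [
    "nasal_obstruction",
    "rhinorrhea",
    "sneezing",
    "nasal_pruritus",
    "ocular_pruritus",
]


def sum_score(ratings, items):
    """Add up the 0-4 ratings of the given items (5 items: 0-20, 17 items: 0-68)."""
    total = 0
    for item in items:
        value = ratings[item]
        if not 0 <= value <= 4:
            raise ValueError(f"{item}: rating {value} is outside 0-4")
        total += value
    return total


def tercile_labels(scores):
    """Label every score by the tercile of the sample it falls into.

    Assumption: cut points come from statistics.quantiles(n=3) with its
    default interpolation; other tercile conventions give other cut points.
    """
    low_cut, high_cut = quantiles(scores, n=3)
    labels = []
    for score in scores:
        if score <= low_cut:
            labels.append("mild")
        elif score <= high_cut:
            labels.append("moderate")
        else:
            labels.append("severe")
    return labels


# Usage with one invented patient.
patient = {
    "nasal_obstruction": 3,
    "rhinorrhea": 2,
    "sneezing": 4,
    "nasal_pruritus": 1,
    "ocular_pruritus": 2,
}
arphys = sum_score(patient, ARPHYS_ITEMS)
```

The same `sum_score` function would give the TSS-17 if it were passed the full list of 17 items.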
Classification into three groups was chosen to follow current guidelines on AR severity and their adaptations [3,23,24]. For validation, the K-means and agglomerative hierarchical clustering models were run 10 times with the leave-one-out method to ensure stability and repeatability of the models.
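To make the clustering step concrete, here is a minimal pure-Python sketch of K-means on standardized variables with three clusters. It is an illustration only: the analyses in this study were run in SAS, and the farthest-first seeding used below is an assumption chosen so that the example is deterministic, not a description of the original procedure. The function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score every column so that all variables share one scale."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # A constant column has zero standard deviation; divide by 1 instead.
    sds = [pstdev(c) or 1.0 for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=3, max_iter=100):
    """Lloyd's algorithm; returns (cluster index per point, centroids)."""
    # Assumed seeding: start from the first point, then repeatedly add
    # the point that lies farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the solution is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = [mean(col) for col in zip(*members)]
    return assignment, centroids


# Invented two-item ratings for nine patients in three obvious groups.
toy_rows = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5], [10, 10], [10, 11], [11, 10]]
toy_assignment, toy_centroids = kmeans(standardize(toy_rows), k=3)
```

The cluster indices returned by such a procedure are arbitrary, so clusters still have to be ordered (for example by mean symptom score) before they can be read as "mild", "moderate" and "severe".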
Classical statistical methods were used for the analysis [19][20][21][25][26][27]. Discriminant analysis was conducted with Fisher linear and quadratic discriminant analysis, along with non-parametric kernel density estimation methods [28], to evaluate misclassification of the patients among the three categories on each of the five scales. Cross-validation was used to assess the validity of each scale [29]. Reliability of the ARPhyS scale and of the TSS-17 was evaluated through Cronbach's alpha coefficient. Finally, Cohen's κ coefficient was computed to study the degree of agreement between the five scales.
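For readers less familiar with the two coefficients named above, the sketch below computes Cronbach's alpha from a patients-by-items table and an unweighted Cohen's κ between two categorical classifications. It assumes the textbook definitions (sample variances for alpha, no weighting for κ); whether the original SAS analysis used a weighted κ is not stated here, and the function names and toy data are hypothetical.

```python
from collections import Counter
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table with one row per patient, one column per item.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa between two classifications of the same patients."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance, from the marginal frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    # Undefined (division by zero) when both raters use a single category.
    return (observed - expected) / (1 - expected)


# Invented data: two items that always agree, and two severity classifications.
alpha_identical_items = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
kappa_same = cohen_kappa(
    ["mild", "moderate", "severe", "mild"],
    ["mild", "moderate", "severe", "mild"],
)
kappa_chance = cohen_kappa(
    ["mild", "mild", "severe", "severe"],
    ["mild", "severe", "mild", "severe"],
)
```

In this toy input, perfectly redundant items give an alpha of 1, identical classifications give a κ of 1, and agreement at exactly the chance level gives a κ of 0.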
All analyses were performed using SAS version 9.4 (SAS Institute Inc, Cary, NC, USA). All p-values <0.05 were considered statistically significant.

Patients' stratification
Out of the 36,397 subjects initially included in the study, 8,288 were excluded from further analysis because of missing data. The remaining 28,109 patients were then categorized into three classes of "mild", "moderate", and "severe" AR, as shown in Table 1. Stratification by cluster analysis is shown in Fig 1. Based on the ARPhyS scale, there were 10,617 patients in the mild, 9,446 in the moderate and 8,046 in the severe category (Table 1).

Discrimination and cross-validation
Across the five approaches, the best discrimination was offered by the ARPhyS scale, followed by the KMC, and then by the TSS-17 and the AHC, while the VAS produced the worst results (Table 2). In the validation runs, the K-means and agglomerative hierarchical clustering models showed 95.6% and 94.8% repeatability, respectively. The ARPhyS scale showed the best results in terms of error rates and cross-validation error rates, as highlighted in Table 2.
Based on the ARPhyS stratification into mild, moderate and severe symptoms, the characteristics of the included patients are shown in Table 3. The duration of the rhinitis episode did not have a statistically significant impact on the ARPhyS scale, despite the large sample size. The proportion of rural population was significantly higher in the severe category than in the mild category, by approximately 3%; no statistically significant difference was noted in gender distribution across the categories (Table 3). The number of patients presenting with a history of conjunctivitis, asthma, atopic dermatitis, food allergy, or hives, with positive SPT or specific IgE, or with previous or concomitant allergen immunotherapy increased significantly with severity.

Reliability, agreement and stratification
Score reliability, assessed through Cronbach's α coefficient, was acceptable for the ARPhyS scale (0.626, computed on the original raw scores) and excellent for the TSS-17 (0.864). Maximum variability was observed in the direction of the first canonical component, which explained 99.8% of the total variability for the ARPhyS scale. As for agreement between scores, the VAS showed the lowest agreement with all the other scores, as shown in Table 4.
In order to choose cut-offs able to properly stratify patients based on the ARPhyS scale, we identified the values that best matched the terciles described above: we therefore propose cut-offs at a score of 8-9 for mild to moderate symptoms and of 11-12 for moderate to severe symptoms. To summarize, patients were classified as presenting with "mild" symptoms if they scored 0 to 8 on the ARPhyS scale, with "moderate" symptoms if they scored 9 to 11, and with "severe" symptoms if they scored 12 to 20 (Table 5).
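The proposed cut-offs translate directly into a three-way rule. The short Python sketch below encodes it; the function name and the input validation are illustrative choices, while the score ranges (0 to 8, 9 to 11, 12 to 20) are those stated above.

```python
def classify_arphys(score):
    """Map an ARPhyS total (integer from 0 to 20) to a severity category."""
    if not isinstance(score, int) or not 0 <= score <= 20:
        raise ValueError("ARPhyS score must be an integer from 0 to 20")
    if score <= 8:
        return "mild"
    if score <= 11:
        return "moderate"
    return "severe"


# Usage: a patient rated 3, 2, 4, 1 and 2 on the five items totals 12.
category = classify_arphys(3 + 2 + 4 + 1 + 2)
```

Because the rule only needs the five item ratings and one addition, it can be applied during a consultation without any software.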

Missing data analysis
When analyzing data from the 8,288 patients excluded from the cohort because of missing data, no statistically significant difference from the included patients was found for sex, age, history of conjunctivitis, asthma, atopic dermatitis, food allergy, or hives, or for positive SPT. Significant differences were, however, found for other items: the proportion of rural residents was significantly higher among included patients than among those excluded because of missing data (43.4% vs. 39.9%; p < 0.001); the same was true for positive specific IgE (11.9% vs. 11.0%; p = 0.036); finally, included patients had undergone previous or concomitant allergen immunotherapy slightly but significantly more often than the excluded group (7.9% vs. 7.3%; p = 0.048).

Discussion
When dealing with patients suffering from allergic rhinitis, physicians should be able to easily detect those presenting severe forms. In fact, severe AR patients may often suffer from comorbidities, may need more drugs to control their symptoms, and may present an impaired quality of life, besides being at risk for an increased loss of productivity and absenteeism [2]. The cost [...] [7]. Therefore, it seems important to be able to promptly recognize severe forms in order to provide patients with appropriate and effective treatment.
Table 5. The ARPhyS score, with cut-off levels to identify patients' severity. ARPhyS: "Please rate the severity of each of the following symptoms as presented at this moment by your patient."
In recent years, cluster analysis has become more and more common to identify subgroups of patients: it consists in applying unsupervised statistical methods to a population with a wide distribution of related symptoms, and then identifying possible homogeneous phenotypes, with minimum overlap between them [15]. In a work by Burte et al., the authors highlight three different clusters of rhinitis (allergic and non-allergic) in a population of 983 adults, but they do not differentiate them based on severity [17]. In a work by Bousquet PJ et al., on the contrary, the authors identified clusters of severe AR in a population of 990 patients, and then compared them with the ARIA classification [15]. They found that, in real life, physicians prescribe therapy without regard to the severity of nasal symptoms [15], and therefore current guidelines and the proposed clusters do not help general practitioners in stratifying the severity of patients presenting with AR. We identified three clusters, through two different methods, in a population of 28,109 patients. Clusters showed no overlap with each other. After evaluation through Fisher linear and quadratic discriminant analysis, and non-parametric kernel density estimation methods, we found that cluster analysis does not provide the best results in terms of discrimination, error rate and cross-validation, when compared to the other assessed methods (Table 2).
Visual Analogue Scales, on the other hand, have been used for several diseases in recent years. They have been tested and validated for AR, even on smartphone screens [9]. This approach, advised by the novel ARIA guidelines, is useful to assess symptom control and quality of life impairment, but also severity in patients suffering from AR [8,12]. In fact, a recent paper by Del Cuvillo et al. showed in a population of 3,572 patients that a VAS greater than 7 cm is a reliable score to identify severe patients (Negative Predictive Value, NPV, of a VAS above 7: 98.6%; Positive Predictive Value, PPV, at 7: 20.4%) [12]. A previous paper by Bousquet PJ et al., on a cohort of 3,052 patients, proposed a 5-cm cut-off for mild forms, while patients with a VAS of over 6 cm were to be considered moderate to severe (with a NPV of 56.5% and a PPV of 94.3%) [11]. The two papers found two different cut-offs, but in the study by Bousquet PJ et al., the authors only differentiated "mild" patients from "moderate/severe" ones, while the study by Del Cuvillo et al. categorized patients into three severity groups (mild, moderate, and severe). On the other hand, in a paper by Rouve et al., categorizing AR severity through VAS seemed to lead to an exaggerated inclusion of patients in the severe group [10]. In the present paper, we found that VAS produced the worst results in terms of discrimination and cross-validation (Table 2), and the least agreement with the other 4 tested methods (Table 4). Considering our findings from a very large cohort of patients, we may speculate that VAS is a useful tool for diagnosis and assessment of disease control, for both patients and physicians, but not the best tool for physicians to classify patients according to severity.
Another possible explanation of the discrepancy between our results and those from del Cuvillo and Bousquet PJ is that we only included patients suffering from seasonal AR, while the previous authors did not use such selection criterion.
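As a reminder of how the predictive values quoted above for VAS cut-offs are defined, the following Python sketch derives PPV and NPV from the four cells of a 2x2 table (test result against reference severity). The counts in the usage lines are invented purely for illustration and are not taken from any of the cited studies; the function name is hypothetical.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from true/false positive and negative counts.

    PPV = TP / (TP + FP): share of patients above the cut-off who are truly severe.
    NPV = TN / (TN + FN): share of patients below the cut-off who are truly not severe.
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one positive and one negative test result")
    return tp / (tp + fp), tn / (tn + fn)


# Invented counts: a cut-off that rarely misses severe patients (high NPV)
# but flags many non-severe patients as well (low PPV).
ppv, npv = predictive_values(tp=20, fp=80, tn=890, fn=10)
```

A high NPV combined with a low PPV, as in this toy table, describes a cut-off that is better at ruling out severe disease than at confirming it.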
In contrast with the ARIA guidelines, we found that the duration of the disease does not have a significant impact on the severity of symptoms, a result that confirms what had previously been stated by Bousquet PJ et al. [15]. Based on our findings, rather than differentiating the disease between intermittent and persistent, attention should be focused on nasal and ocular symptoms. In a study by Valero et al., the authors evaluated a TSS-4 to stratify AR patients, based on the clinical items proposed by the ARIA guidelines [14], and in a following study the authors validated the TSS-4 as a tool to assess severe forms [13]. Through these papers, Valero et al. underlined that total symptom scores seem to be practical methods for physicians to target severe patients, following current ARIA guidelines. Based on such considerations, in our study, even though the ARPhyS scale showed a lower Cronbach's alpha coefficient than the TSS-17, we chose the former because, besides being superior as a discriminating method, it is also quicker and easier to use in everyday clinical practice. Also, the ARPhyS scale (Table 5) allowed us to identify terciles that maximally correlated in the first canonical dimension and therefore to propose simple cut-offs to categorize AR into "mild", "moderate", and "severe" [11]. On the other hand, we wanted to highlight the importance of identifying severe patients, since mild ones usually do not even consult their general practitioner for rhinitis symptoms, while severe ones may need a specialized approach in order to control their symptoms and comorbidities.
A possible limitation of the study is that patients visiting physicians were only evaluated during the spring and summer seasons, which might limit the generalizability of our results to patients visiting during autumn and winter. However, the large sample size makes the results robust, and we were able to propose a practical tool for physicians, which is fast and derived from a real-life study.
The ARIA classification of AR severity is useful, especially to differentiate mild patients from the others. Several tools have been developed in recent years for physicians to assess severity and control. Such tools need to be easy to use and efficient, and, so far, only a few of them have been validated. Through the present study, we evaluated two different physician questionnaires to assess AR severity: one composed of 17 questions and the other, the ARPhyS scale, of 5 questions. When comparing the ARPhyS scale to the other tested methods, we found that it is the best in terms of discrimination and cross-validation. It is also an easy tool for physicians, and we found cut-off values able to differentiate mild patients from moderate ones, and moderate from severe ones. Such a tool could therefore be implemented in daily practice to identify severe patients who need a specialized intervention or, in any case, more intensive treatment.