
Construction of an intelligent screening model for allergic rhinitis based on routine blood tests

  • Change Fan ,

    Roles Project administration, Writing – original draft, Writing – review & editing

    384370060@qq.com

    Affiliation Hohhot First Hospital Allergy Center, Hohhot, China

  • Yanan Wang,

    Roles Data curation, Methodology, Writing – original draft, Writing – review & editing

    Affiliation Inner Mongolia Zhihui Big Data Research Institute, Hohhot, China

  • Xin Tong,

    Roles Data curation, Writing – original draft

    Affiliation Hohhot First Hospital Allergy Center, Hohhot, China

  • Shiyu Wu,

    Roles Data curation

    Affiliation Inner Mongolia Zhihui Big Data Research Institute, Hohhot, China

  • Caiyan An,

    Roles Data curation

    Affiliation Hohhot First Hospital Allergy Center, Hohhot, China

  • Huijiao Cai,

    Roles Data curation

    Affiliation Hohhot First Hospital Allergy Center, Hohhot, China

  • Junjing Zhang,

    Roles Project administration

    Affiliation Hohhot First Hospital Allergy Center, Hohhot, China

  • Biao Song,

    Roles Formal analysis, Project administration

    Affiliation Inner Mongolia Zhihui Big Data Research Institute, Hohhot, China

  • Ruihuan Zhang

    Roles Data curation, Methodology

    Affiliation Inner Mongolia Zhihui Big Data Research Institute, Hohhot, China

Abstract

The incidence of allergic rhinitis (AR) has been increasing annually, severely impacting patients’ quality of life and increasing socioeconomic burdens. The limitations of current diagnostic methods make efficient, low-cost early screening tools urgently needed. Based on routine blood test data, this study employed an ensemble hard voting strategy combining a filtering strategy, an embedding strategy, and a wrapper strategy to select 16 strongly correlated features, each chosen by at least two of the three methods, as model inputs. Subsequently, the top three of five machine learning algorithms (K-nearest neighbor, logistic regression, random forest, decision tree, and support vector machine), ranked by the area under the curve (AUC), were selected as base classifiers, and an intelligent early screening model for AR was constructed using an ensemble soft voting strategy. This model demonstrated superior performance, achieving an AUC of 0.862 and significantly outperforming any single algorithm; the external validation accuracy was 73.91%. These results demonstrate that combining an ensemble voting strategy with machine learning methods can effectively construct an early screening model for AR from routine blood test parameters without adding any burden to patients, providing a new approach to improving diagnosis and treatment in primary care settings.

Introduction

Allergic rhinitis (AR) is a global disease caused by complex interactions between genetic and environmental factors and affects 400 million people worldwide. Environmental exposure, climate change and lifestyle are all risk factors for AR. For example, pollen in the air increases the prevalence of AR [1,2]. In recent years, the incidence of this disease has risen sharply [3]. In Denmark, the prevalence of AR among the adult population has gradually increased from 19% to 32% over the past three decades [4]. Similarly, the standardized prevalence of AR among adults in China has increased by 6.5% over the past 6 years [5]. The self-reported prevalence of pollen-induced AR has reached 32.4% in the grassland areas of northern China [6]. The high incidence of AR not only seriously affects the quality of life of patients but also results in a heavy economic burden. Currently, screening for AR relies mainly on a comprehensive evaluation of clinical manifestations, physical signs and allergen detection. However, these traditional methods have certain limitations in large-scale screening. For example, questionnaires rely on the subjective responses of the examinees, which may lead to inaccurate information; skin prick tests (SPTs) [7] are strongly affected by drugs and skin conditions and carry a risk of severe allergic reactions; and serum allergen-specific IgE (sIgE) tests [8,9] have low sensitivity and high costs, increasing the economic burden on patients. Therefore, there is an urgent need for a new method for AR screening that is objective, accurate, and highly applicable.

Routine blood tests are inexpensive, widely available examinations suitable for large-scale population screening. Studies have shown that patients with allergic rhinitis exhibit characteristic changes in routine blood test indicators, especially an increase in eosinophils, which reflects both the allergic state of the body and the severity of AR. Routine blood test data therefore offer an objective examination method suitable for large-scale AR screening and can effectively overcome the subjectivity of questionnaire-based approaches.

In recent years, with the rapid development of science and technology, artificial intelligence (AI) has demonstrated a profound influence in various fields [10,11], especially in data processing and diagnostic assistance, where its potential has surpassed that of traditional methods. Moreover, China’s medical informationization is developing rapidly: the volumes of various types of clinical test data are growing quickly, indicating the necessity of medical reform based on big data technology. However, existing single machine learning methods suffer from low stability and limited expressive ability when processing complex data, resulting in low accuracy of prediction and decision-making. The integrated voting method [12,13] combines the prediction results of multiple single base models to obtain more stable predictions, thereby improving the accuracy and robustness of the model and reducing model bias and variance. Therefore, based on routine blood test data, this paper adopts an integrated hard voting method, combining filtering, embedding and wrapper methods, to screen test indicators strongly correlated with AR. An integrated soft voting method is then adopted, combining the top three algorithms among K-nearest neighbors (KNN), logistic regression (LR), random forest (RF), decision tree (DT) and support vector machine (SVM), to construct an accurate, convenient and universal AR screening model, expand the theoretical method system for objective evaluation of AR, compensate for the defects of existing AR screening methods, reduce the disease and economic burden of AR on the population, and provide a scientific basis for formulating more effective prevention measures and control strategies.

Materials and methods

Data source

This study included patients who visited the outpatient department of the First Hospital of Hohhot from March 21, 2023 to September 3, 2023. Specifically, the data from 26 routine tests (Table 1) and the AR diagnosis results of patients aged 18–70 years were included. The study was approved by the Ethics Committee of Hohhot First Hospital (Approval No: [IRB2022018]). All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study. Prior to data collection, the purpose of the study, potential risks and benefits, and the confidentiality of their information were explained in detail to all participants. To ensure privacy and confidentiality, all personally identifiable information (PII) of participants was removed during data preprocessing. This included, but was not limited to, names, identification numbers, and contact information. Each patient was assigned a unique, non-identifiable code for data analysis. The dataset used for model development and validation was fully anonymized. The subjects were divided into an AR group and a non-AR group. The inclusion criteria for the AR group were as follows: (1) the main symptoms of AR were nasal congestion, nasal itching, runny nose, or sneezing, including two or more symptoms; (2) the allergen skin prick test was positive at 3–5 mm, and the allergens included Artemisia grandis, Artemisia annua, Humulus, Chenopodium album, corn, sunflower, poplar, willow, elm, cypress, birch, house dust mite, dust mite, Alternaria alternata, cat hair, dog hair, ragweed, and Artemisia sphagnum. The inclusion criteria for the non-AR group were as follows: (1) a history of AR was denied; and (2) the allergen skin prick test was negative.
In accordance with the above inclusion criteria, subjects whose clinical data were incomplete enough to affect the judgment were excluded, and samples with missing values or outliers were eliminated. Finally, routine examination data and AR diagnosis results were collected for 1295 subjects, including 676 AR patients and 619 non-AR patients.

Table 1. Statistical results of blood routine test data based on SPSS.

https://doi.org/10.1371/journal.pone.0337561.t001

External validation set

This study used a single-center prospectively recruited external validation cohort. The data came from outpatients in the Allergy Department of Hohhot First Hospital from March 21 to September 3, 2023. After screening against the inclusion criteria, a total of 184 patients aged 19–70 years (49 males and 135 females) were included; according to clinical gold standards (such as allergen sIgE testing and typical symptoms and signs), they were divided into 73 allergic rhinitis (AR) patients and 111 non-AR controls. The comprehensive model was employed to predict the probability of AR for each patient. A probability score ≥ 0.5 was regarded as positive, while a score < 0.5 was classified as negative. The model’s predictions were subsequently compared against the actual clinical diagnoses to evaluate the screening performance of the integrated voting method.

Statistical analysis

Before model construction, to optimize the effect of feature screening, this study used SPSS software for statistical analysis based on 28-dimensional features (26 routine blood test indicators plus age and gender). The statistical methods were as follows: (1) Qualitative variables were tested using the chi-square test and expressed as categories (percentages); (2) Quantitative variables were first tested for normality. If a variable was normal (Shapiro-Wilk test p > 0.05) or approximately normal (standard deviation < mean/3), the independent-samples t-test (mean ± standard deviation) was used, with the Levene test deciding the variant (Student’s t-test when p > 0.05, otherwise Welch’s t-test); (3) Variables that were neither normally nor approximately normally distributed were tested using the Mann-Whitney U test and expressed as median (lower quartile, upper quartile). Specific feature abbreviations, definitions, and statistical results are shown in Table 1.
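As a sketch, the per-variable test-selection logic above can be written with SciPy; the decision thresholds follow the text, and the example data are synthetic, not the study's measurements:

```python
import numpy as np
from scipy import stats

def compare_groups(x_ar, x_non_ar, alpha=0.05):
    """Select and run the two-group test for one quantitative variable,
    following the rules above (normality, then Levene-based t-test choice)."""
    x_ar, x_non_ar = np.asarray(x_ar, float), np.asarray(x_non_ar, float)
    pooled = np.concatenate([x_ar, x_non_ar])
    normal = stats.shapiro(pooled).pvalue > alpha
    approx_normal = pooled.std(ddof=1) < pooled.mean() / 3
    if normal or approx_normal:
        # Levene's test chooses between Student's and Welch's t-test
        equal_var = stats.levene(x_ar, x_non_ar).pvalue > alpha
        name = "Student t" if equal_var else "Welch t"
        _, p = stats.ttest_ind(x_ar, x_non_ar, equal_var=equal_var)
    else:
        name = "Mann-Whitney U"
        _, p = stats.mannwhitneyu(x_ar, x_non_ar)
    return name, p

# Synthetic examples: an approximately normal variable and a skewed one
rng = np.random.default_rng(0)
t_name, t_p = compare_groups(rng.normal(5, 1, 100), rng.normal(6, 1, 100))
u_name, u_p = compare_groups(rng.exponential(1, 100), rng.exponential(2, 100))
```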

Routine blood feature screening based on integrated hard voting

In terms of feature selection, filtering, embedding and wrapping methods each have their own advantages: filtering uses statistical indicators (such as chi-square test and mutual information) to perform preliminary feature screening independently of the model, avoiding model assumption bias and being suitable for high-dimensional data preprocessing [14]; embedding embeds feature selection into the model training process (such as LASSO or feature importance of tree models), automatically eliminates redundant features through regularization or integration mechanisms, and is particularly suitable for processing nonlinear relationship features [15,16]; wrapping is guided by model performance (such as recursive feature elimination, RFE), and captures feature interaction effects through iterative search, and performs outstandingly in scenarios such as medical diagnosis that require fine optimization of feature combinations [17]. Based on the above advantages, this study selected these three methods for feature screening experiments.

This study proposes a robust feature selection method based on an integrated voting strategy, which constructs an optimal feature subset by integrating the advantages of filtering, embedding, and wrapping. The specific implementation is divided into two stages: the first stage adopts multi-strategy parallel screening, in which the filtering method selects the top 15 important features based on mutual information [18], the embedding method selects non-zero coefficient features through LASSO regression [19,20], and the wrapper method uses recursive feature elimination (RFE) combined with a random forest model to retain the top 15 important features [21–23]. The second stage integrates the three sets of candidate features through a hard voting mechanism and retains the features with a screening frequency ≥ 2 as the final feature subset. This method combines the model independence of the filtering method, the regularization constraint of the embedding method, and the performance-oriented character of the wrapper method. Through the majority voting mechanism, it effectively improves the stability and generalization ability of feature selection, and is particularly suitable for ensemble learning scenarios in which the base models have similar prediction performance but differ in their errors [24].
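A minimal scikit-learn sketch of this two-stage selection, assuming the pipeline matches the description above (the parameter values echo those reported in the Results; the data are synthetic stand-ins):

```python
# Two-stage hard-voting feature selection: top-15 mutual information,
# LASSO-style non-zero coefficients, RF-based RFE, then keep features
# chosen by at least 2 of the 3 methods.
from collections import Counter
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif, RFE
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=28, n_informative=8,
                           random_state=0)
feat = np.arange(X.shape[1])

# Stage 1a: filter -- top 15 features by mutual information
mi = mutual_info_classif(X, y, random_state=0)
filt = set(feat[np.argsort(mi)[::-1][:15]])

# Stage 1b: embedded -- L1-penalized logistic regression (LASSO-style)
lasso = LogisticRegression(penalty="l1", C=0.1, solver="liblinear",
                           class_weight="balanced").fit(X, y)
emb = set(feat[np.abs(lasso.coef_[0]) > 0])

# Stage 1c: wrapper -- RFE with a random forest, keeping 15 features
rfe = RFE(RandomForestClassifier(n_estimators=10, class_weight="balanced",
                                 random_state=0),
          n_features_to_select=15, step=1).fit(X, y)
wrap = set(feat[rfe.support_])

# Stage 2: hard vote -- keep features selected by at least two methods
votes = Counter(list(filt) + list(emb) + list(wrap))
selected = sorted(f for f, v in votes.items() if v >= 2)
```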

This study uniformly uses support vector machines (SVMs) as classification models. The ability of its kernel function to process high-dimensional feature spaces effectively avoids the dimensionality curse problem and is particularly suitable for verifying feature subsets after filtering and embedding methods. Based on the principle of structural risk minimization, SVM gives the model natural robustness to feature redundancy and noise, and can objectively evaluate the changes in the generalization performance of the wrapping method during iterative feature elimination. In order to comprehensively evaluate the effect of feature selection, indicators such as the area under the ROC curve (AUC), accuracy (ACC), specificity (SPE), and sensitivity (SEN) [25] are used to systematically compare the performance of filtering, embedding, wrapping, and ensemble voting methods in feature selection.

Construction of a screening model based on integrated soft voting

Based on the routine blood test data after feature selection, five machine learning algorithms, KNN [26–28], LR [29–31], RF [32–34], DT [35,36], and SVM [37–39], were used to construct an AR early screening model. These five algorithms have the following characteristics for binary classification: KNN (K-nearest neighbor) calculates the distance between samples and uses majority voting among neighbors for classification; it suits complex data distributions and is sensitive to local patterns. LR (logistic regression) uses the sigmoid function to achieve probability mapping and has excellent interpretability. RF (random forest) uses bootstrap sampling to construct multiple decision trees and ensemble voting, effectively handling feature interactions. DT (decision tree) recursively partitions the feature space based on information gain, generating interpretable tree-like decision rules. SVM (support vector machine) constructs an optimal hyperplane by maximizing the classification margin; its kernel functions can handle nonlinear problems, and it is robust to small sample sizes. These algorithms address classification problems from different perspectives. The AUC value is used as the evaluation indicator to compare the five algorithms, and the three with the highest AUC values are selected to build the integrated model.
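This AUC-based model comparison can be sketched with scikit-learn as follows; synthetic data stand in for the 16 selected blood test features, and default hyperparameters are used for brevity:

```python
# Compare the five base classifiers by AUC on held-out data and keep
# the three best, as described above.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=500, n_features=16, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)

models = {
    "KNN": KNeighborsClassifier(),
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(probability=True, random_state=0),
}
aucs = {name: roc_auc_score(y_te, m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
        for name, m in models.items()}
top3 = sorted(aucs, key=aucs.get, reverse=True)[:3]   # base classifiers
```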

This study used an 8:2 split ratio to divide the experimental data, assigning 1036 samples to the training set and 259 samples to the test set. During the split, stratified sampling was used to keep the positive/negative ratios in the training and test sets consistent with the original dataset, ensuring a balanced data distribution. Finally, all feature data were normalized (Z-score) to eliminate the impact of dimensional differences between detection metrics on the model. The Z-score standardization formula is as follows (1):

z = (x − μ) / σ (1)

where x represents the original data value, μ the mean of the dataset, and σ its standard deviation.
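The split and normalization steps can be sketched as follows, with synthetic stand-in data using the paper's sample counts (1295 subjects, 676 AR and 619 non-AR); fitting the scaler on the training set only is an assumption of standard practice, not stated in the text:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 1295 subjects (676 AR, 619 non-AR), 16 features
X = np.random.default_rng(0).normal(size=(1295, 16))
y = np.r_[np.ones(676, dtype=int), np.zeros(619, dtype=int)]

# 8:2 stratified split keeps the AR/non-AR ratio in both subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Z-score normalization per formula (1); fit on the training set only
scaler = StandardScaler()
X_tr = scaler.fit_transform(X_tr)
X_te = scaler.transform(X_te)
```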

This study uses four evaluation indicators to comprehensively evaluate model performance: area under the receiver operating characteristic curve (AUC), accuracy, recall (also known as sensitivity), and specificity. The calculation formulas for each indicator are as follows (2–4):

Accuracy = (TP + TN) / (TP + TN + FP + FN) (2)
Recall = TP / (TP + FN) (3)
Specificity = TN / (TN + FP) (4)

In the evaluation of medical diagnostic models, each indicator has clear clinical significance: AUC is a comprehensive indicator of the overall discriminative efficiency of the model, and its value range of 0.5–1.0 reflects the model’s ability to distinguish between disease and non-disease states; the closer the value is to 1, the better the discriminative efficiency. Accuracy represents the overall proportion of correct predictions over all samples. Recall refers to the proportion of correctly identified samples among actual positive samples, reflecting disease detection ability. Specificity indicates the proportion of correctly excluded samples among actual negative samples, reflecting the accuracy of disease exclusion. True positives (TP, actually positive and correctly predicted), true negatives (TN, actually negative and correctly predicted), false positives (FP, actually negative but predicted positive) and false negatives (FN, actually positive but predicted negative) are the basic counts that constitute the confusion matrix for model performance evaluation.
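These definitions reduce to three one-line functions of the confusion-matrix counts; the example call uses the counts reported for the external validation set (TP = 44, TN = 92, FP = 19, FN = 29):

```python
# Metric formulas (2)-(4) as functions of confusion-matrix counts.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):          # sensitivity: fraction of positives detected
    return tp / (tp + fn)

def specificity(tn, fp):     # fraction of negatives correctly excluded
    return tn / (tn + fp)

# Example with the external validation counts reported in the Results:
acc = accuracy(44, 92, 19, 29)    # 136/184 ≈ 0.739
sen = recall(44, 29)              # 44/73  ≈ 0.603
spe = specificity(92, 19)         # 92/111 ≈ 0.829
```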

The integrated model uses the soft voting method [40], taking the AUC value of each method as the weight and calculating the weighted average of the prediction probabilities of the three methods as the final prediction probability of AR, that is:

p = Σ(AUC_i × p_i) / Σ AUC_i (5)

where p is the final predicted probability and AUC_i and p_i are the AUC and predicted probability, respectively, of model i. ROC curves and AUC values were used to evaluate the performance of the five single models (KNN, LR, RF, DT and SVM) and the integrated model in the early screening of AR, to verify the superiority of the integrated voting method. The overall experimental process is shown in Fig 1 below.

Fig 1. Flowchart of the establishment of an AR early screening model via the integrated voting method.

https://doi.org/10.1371/journal.pone.0337561.g001
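The AUC-weighted soft vote of formula (5) amounts to a few lines; in this sketch the per-patient base-model probabilities are hypothetical, while the weights are the AUCs reported later for LR, RF, and SVM:

```python
def soft_vote(probs, aucs):
    """AUC-weighted average of per-model prediction probabilities."""
    return sum(a * p for a, p in zip(aucs, probs)) / sum(aucs)

# Hypothetical per-patient probabilities from LR, RF, SVM,
# weighted by the models' reported AUCs:
p = soft_vote([0.70, 0.80, 0.90], aucs=[0.827, 0.836, 0.849])
label = "AR" if p >= 0.5 else "non-AR"   # 0.5 screening threshold
```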

The core theoretical foundation of the SHAP (SHapley Additive exPlanations) method stems from game theory. Its core goal is to fairly distribute the model’s prediction results to each input feature, thereby quantifying the contribution of each feature to the prediction. In a SHAP visualization chart, the Y-axis is arranged in descending order of feature importance, with the top feature representing the feature with the greatest impact on the model’s overall output. Specifically, the X-axis of the summary bar chart represents the average SHAP absolute value of the feature; a larger value indicates a greater average change in the model’s prediction results. The X-axis of the beeswarm plot represents the SHAP value, where a center position indicates that the feature has no impact on the prediction result, a rightward shift indicates a positive impact, and a leftward shift indicates a negative impact. The colors in the chart correspond to the actual value of the feature in the sample, with red representing a higher value and blue representing a lower value, providing a visual representation of the correlation between feature values and the direction of influence.
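The additive attribution idea has a closed form for a linear model: with independent features, the exact SHAP value of feature i is w_i(x_i − E[x_i]), and the attributions sum to the gap between the prediction and the average prediction. A toy numerical illustration (hypothetical weights and synthetic data, not the study's models):

```python
import numpy as np

# Linear-model Shapley values: feature i's exact SHAP value is
# w_i * (x_i - mean_i), and the values sum to f(x) - E[f(X)].
rng = np.random.default_rng(0)
w, b = np.array([0.8, -0.5, 0.3]), 0.1      # hypothetical linear model
X = rng.normal(size=(200, 3))               # synthetic background data

x = X[0]                                    # one sample to explain
shap_values = w * (x - X.mean(axis=0))      # per-feature attribution
baseline = (X @ w + b).mean()               # average model output
gap = (x @ w + b) - baseline                # deviation to be explained
# shap_values.sum() equals gap exactly for a linear model
```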

Results

Feature selection results

After multiple adjustments and optimizations, the final feature selection methods and parameter settings were as follows: The filtering method (mutual information) directly calculates the correlation between feature data and labels without additional parameters, and selects the top 15 features as the screening result; the embedding method (LASSO) is built on the logistic regression model, with the parameter configuration penalty = 'l1', C = 0.1, solver = 'liblinear', and class_weight = 'balanced', and the screening rule is to select features whose coefficient has an absolute value greater than 0; the RFE method uses the random forest model as the base model, with parameters optimized to n_estimators = 10 and class_weight = 'balanced', and RFE's n_features_to_select = 15 (retaining 15 important features) and step = 1 (eliminating 1 feature per iteration).

Table 2 lists the feature variables selected by the three feature selection methods and their contribution rankings (sorted by absolute value). The intersection of the features of the three methods was analyzed using a Venn diagram (Fig 2), and a hard voting method was used to determine 16 core features as input features for subsequent models: age (Age), sex (Sex), EO#, EO%, HCT, HGB, MCHC, MCV, MONO%, NEUT%, PDW, P-LCR, PLT, RBC, RDW-CV, and RDW-SD.

Table 2. Feature screening results based on filtering, wrapping, and embedding methods.

https://doi.org/10.1371/journal.pone.0337561.t002

Fig 2. Venn diagram of three sets of characteristic variables.

https://doi.org/10.1371/journal.pone.0337561.g002

To verify the impact of different feature selection methods on classification model performance, this study analyzed five feature sets: the initial 28-dimensional raw features, the feature subsets selected using mutual information, RF-RFE, and LASSO, and the set of 16 core features determined through hard voting. After constructing a classification model based on the support vector machine (SVM) algorithm, model performance was evaluated using the following metrics: AUC, accuracy, recall, and specificity. The classification performance results for the different feature sets on the SVM model are shown in Table 3. The analysis shows that the feature selection method based on ensemble hard voting combines the advantages of the three individual methods, resulting in the best classification model across all evaluation metrics (see Fig 3). Specifically, the core features selected using ensemble hard voting achieved the highest AUC (0.845), ACC (77.61%), recall (75.56%), and specificity (79.84%) in the SVM model. In summary, ensemble voting is a reasonable and effective feature selection method.

Table 3. Comparison of classification performance of SVM Models on different feature sets.

https://doi.org/10.1371/journal.pone.0337561.t003

Fig 3. Comparison of classification performance of SVM models on different feature sets.

https://doi.org/10.1371/journal.pone.0337561.g003

Algorithm selection results

The model training process uses grid search to tune parameters, with accuracy as the evaluation indicator for selecting the best parameter combination. Table 4 shows the model parameter settings; the threshold parameters are all set to 0.5 by default.
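A minimal sketch of the accuracy-driven grid search for one base model follows; the parameter grid here is a hypothetical example, not the grid behind Table 4, and the data are synthetic:

```python
# Accuracy-scored grid search for the SVM base classifier.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=16, random_state=0)
grid = GridSearchCV(
    SVC(probability=True),
    param_grid={"C": [0.1, 1, 10], "kernel": ["rbf", "linear"]},
    scoring="accuracy",   # best parameters judged by accuracy, as in the text
    cv=5,
).fit(X, y)
best_params, best_score = grid.best_params_, grid.best_score_
```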

We used the validation set to evaluate the performance of each model. The results revealed that the SVM, RF, and LR models achieved the three highest AUCs, at 0.849, 0.836 and 0.827, respectively. To further improve the screening effect, the integrated soft voting method was used to calculate the final prediction probability from the prediction probabilities of the SVM, RF, and LR models. The calculation formula is shown in Formula (6).

p = (0.827 × p_LR + 0.836 × p_RF + 0.849 × p_SVM) / (0.827 + 0.836 + 0.849) (6)

where p is the prediction probability of ensemble soft voting and p_LR, p_RF, and p_SVM are the prediction probabilities of the LR, RF, and SVM models, respectively. An example of an ensemble soft voting calculation is shown in Table 5.

The performance of each model was evaluated using the validation set. The evaluation metrics for the five machine learning methods under optimal parameters are shown in Table 6. The SVM, RF, and LR models achieved the three highest AUCs (0.849, 0.836, and 0.827, respectively), with specificities of 0.815, 0.782, and 0.742, precisions of 0.788, 0.768, and 0.768, and recalls of 0.763, 0.756, and 0.793. To further improve performance, the prediction probabilities of SVM, RF, and LR were combined using the ensemble soft voting method. Table 6 shows that ensemble soft voting achieved the highest AUC (0.862) and specificity (0.855), exceeding those of the SVM by 0.013 and 0.040, respectively; only its precision (0.772) and recall (0.696) were slightly lower than the SVM's, by 0.016 and 0.067. Overall, the ensemble soft voting method performed best. Fig 4 shows the ROC curves for the different AR early screening methods and confirms this: the ensemble soft voting method achieved an AUC of 0.862, exceeding that of every single model. Therefore, the ensemble soft voting method was ultimately chosen to combine the LR, RF, and SVM models into the AR screening model.

Fig 4. Comparison of ROC curves for predicting AR via five machine learning algorithms and ensemble voting.

https://doi.org/10.1371/journal.pone.0337561.g004

Fig 5 shows the SHAP visualization results of the three best models, namely support vector machine (SVM), random forest (RF), and logistic regression (LR). The Support Vector Machine (SVM) model assigned the highest importance to RDW-SD, PDW, sex, P-LCR, and EO% (Fig 5A, 5B). RDW-SD (Red Blood Cell Distribution Width – Standard Deviation): This was identified as the most important feature in the SVM model. RDW-SD quantifies the heterogeneity in red blood cell size (anisocytosis). While not a traditional marker for AR, elevated RDW is increasingly recognized as a surrogate marker of systemic inflammation. The chronic inflammatory state in AR could potentially influence erythropoiesis, leading to greater variation in red cell size. In our analysis, higher RDW-SD values (red) correlated with increased SHAP values, suggesting a positive contribution to an AR prediction. PDW (Platelet Distribution Width) & P-LCR (Platelet Large Cell Ratio): These parameters measure the variability in platelet size. Larger, more reactive platelets are often associated with inflammatory and immune-mediated conditions. Platelets are known to participate in allergic inflammation by releasing mediators and interacting with other immune cells. The importance of PDW and P-LCR in our model suggests a potential link between platelet activation and AR. The SHAP dependence plots show that elevated levels of these markers (red) contribute to a higher probability of AR. The inclusion of sex as a key feature is supported by well-documented epidemiological evidence. The prevalence and severity of allergic diseases, including AR, often exhibit significant differences between males and females, likely due to a complex interplay of hormonal, genetic, and environmental factors. Our model has captured this effect, with the specific SHAP value for each patient indicating the direction and magnitude of sex’s contribution to their individual prediction. 
The Random Forest (RF) model identified RDW-SD and EO# among its most influential features, followed by EO% (Fig 5C, 5D). The reaffirmation of eosinophil-related parameters (EO#, EO%) across all models underscores their critical and non-redundant role in AR screening. The concomitant importance of RDW-SD across all three models suggests a previously underappreciated link between erythrocyte indices and allergic inflammation that warrants further investigation. In the Logistic Regression (LR) model, the top contributing features were RDW-SD, RDW-CV, and PDW (Fig 5E, 5F). Elevated red cell distribution width (RDW) values, indicating heterogeneity in red blood cell size (anisocytosis), have been increasingly associated with systemic inflammatory states. In the context of AR, a chronic inflammatory condition, elevated RDW may serve as a surrogate marker for underlying inflammatory processes. Similarly, mean platelet volume (MPV) is a marker of platelet activation, which can be modulated by allergic and inflammatory responses.

Fig 5. SHAP interpretability analysis of the three best models: support vector machine (A, B), random forest (C, D), and logistic regression (E, F).

https://doi.org/10.1371/journal.pone.0337561.g005

In summary, the feature contribution analysis confirms that our ensemble model successfully leverages features with strong biological and clinical foundations. The prominence of eosinophil counts (EO#, EO%) directly corroborates the pathophysiological mechanism of AR, thereby enhancing the clinical interpretability and credibility of our intelligent screening tool. The recurring importance of RDW-related features points to a potential novel hematological dimension in AR that merits future research.

External validation

The performance evaluation results of the AR early screening model based on the ensemble voting method on the external validation set are shown in Table 7. Compared with the actual clinical diagnoses, the model showed a good screening effect, with an accuracy of 0.739, an AUC of 0.722, and a high specificity (0.829), making it reliable for excluding non-cases; the recall was 0.603, a moderate sensitivity that can meet the basic clinical needs of early screening. The confusion matrix of the external validation set is shown in Fig 6, which covers 184 samples. The model predicted 92 true negatives (TN), 19 false positives (FP), 29 false negatives (FN), and 44 true positives (TP). The total number of correctly predicted samples was 136, corresponding to an accuracy of 0.739.

Table 7. Evaluation metrics of the external validation set.

https://doi.org/10.1371/journal.pone.0337561.t007

Fig 6. External validation set performance evaluation results: (A) confusion matrix and (B) classification performance index analysis.

https://doi.org/10.1371/journal.pone.0337561.g006

Discussion

In recent years, the prevalence of AR has increased annually, and 37% of allergic individuals gradually develop allergic asthma within 5 years, or develop it rapidly under extreme weather conditions (such as thunderstorms). Some patients may even experience anaphylactic shock after consuming plant-derived foods related to allergenic pollen [41]. The AR treatment guidelines propose the principle of “combining prevention and treatment, four-in-one” treatment; this includes secondary prevention, encompassing early detection, early diagnosis and early treatment, to avoid or shorten the latent period during which allergic individuals progress to AR [42]. Currently, existing AR screening methods rely mainly on patients’ symptom descriptions and allergen detection. However, patients’ subjective descriptions have a certain degree of ambiguity; allergen detection is limited in small- and medium-sized hospitals; and the cost of instruments, equipment, reagents and consumables is high, increasing the medical burden on patients. These challenges have led to slow progress in AR prevention in clinical practice. Therefore, it is crucial to establish a simple, effective, rapid and widely applicable AR early screening model, especially for AR auxiliary diagnosis in primary medical institutions, to standardize and guide clinicians in diagnosing and treating the disease more quickly.

Routine blood examination is a standard clinical test that can reveal the physiological and pathological status of the human body, and several studies have noted that routine blood and lipid biochemical indicators can assist in diagnosing AR; the relationship between AR and blood cell components has gradually become a research hotspot. One study [43] reported that the blood eosinophil count and ratio may be good predictors of AR, especially AR accompanied by chronic rhinosinusitis. Another [44] found that the mean platelet volume and platelet distribution width in children with AR are lower than in healthy children, whereas the platelet count/mean platelet volume ratio is higher. Using the ensemble voting method, this study selected 16 of the original 28 features: RDW-SD, EO%, age, RDW-CV, MCHC, PDW, NEUT%, EO#, RBC, sex, HCT, MONO%, P-LCR, MCV, HGB, and PLT. A classification model based on the SVM algorithm was built to evaluate the impact of feature selection; performance ranked, from best to worst, as follows: ensemble voting feature selection > single-method feature selection > original features. The proposed feature selection method thus not only improves the accuracy and efficiency of AR screening but also opens a new path for applying routine blood indicators to the diagnosis and treatment of allergic diseases. This finding reinforces the central role of routine blood examination in AR diagnosis, provides theoretical and empirical support for clinical practice, and encourages medical workers to explore the potential value of routine blood indices in early AR recognition, condition monitoring, and personalized treatment planning.
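
The frequency-based feature selection described above can be sketched as follows: run several selectors and keep any feature chosen by at least two of them. This minimal example uses scikit-learn with one filter method (mutual information), one embedded method (Lasso), and one wrapper method (RFE); the synthetic data, selector choices, and thresholds are illustrative assumptions, not the authors' exact setup:

```python
# Ensemble-vote feature selection sketch: keep features picked by >= 2
# of three selectors (filter, embedded, wrapper). Illustrative only.
from collections import Counter
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, mutual_info_classif
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the 28-dimensional blood-test features
X, y = make_classification(n_samples=300, n_features=28, n_informative=8,
                           random_state=0)
X = StandardScaler().fit_transform(X)

votes = Counter()
k = 16  # number of features each selector keeps (assumption)

# Filter: top-k features by mutual information with the label
filt = SelectKBest(mutual_info_classif, k=k).fit(X, y)
votes.update(np.flatnonzero(filt.get_support()))

# Embedded: features with nonzero Lasso coefficients
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
votes.update(np.flatnonzero(np.abs(lasso.coef_) > 1e-6))

# Wrapper: recursive feature elimination with logistic regression
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=k).fit(X, y)
votes.update(np.flatnonzero(rfe.get_support()))

# Keep any feature selected by at least two of the three methods
selected = sorted(int(i) for i, c in votes.items() if c >= 2)
print(f"{len(selected)} features selected: {selected}")
```

In the study, the analogous vote over filter, embedded, and wrapper strategies with a frequency threshold of two yielded the 16 blood-test features listed above.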

Machine learning can effectively process high-dimensional data and mine its inherent complex patterns, giving it broad application prospects in medicine. In recent years, researchers have begun to explore machine learning for AR diagnosis. Christo et al. [45] used genetic algorithms and extreme learning machines to achieve intelligent diagnosis of AR with an accuracy of 97.7%. Kavya et al. [46] used the random forest algorithm to construct a clinical decision support system for allergic diseases; the system's accuracy was 86.39%, with a sensitivity of 75% for the combined rhinitis-urticaria category. However, existing methods have limitations in model construction, such as relying on a single algorithm or using only symptoms and consultation information as inputs, which introduces subjective bias and instability into the predictions. This study applied the ensemble voting method to the KNN, LR, RF, DT, and SVM models and selected the three algorithms with the highest AUCs for ensemble soft voting, aiming to obtain a more stable AR prediction probability. The results show that the AUC of the ensemble soft voting method is 3.5%, 2.6%, and 1.3% higher than those of the LR, RF, and SVM models, respectively. In addition, on external validation data the model's accuracy reached 73.91%. These findings show that the ensemble voting method can significantly improve the accuracy and stability of the model, providing a more accurate and reliable AR-assisted diagnostic tool for clinical use. In conclusion, through the secondary development of nonspecific routine laboratory data, this study builds an early AR screening model suited to primary medical institutions.
Specifically, it opens a new opportunity to perform AR screening in the absence of advanced screening equipment and technical experts, without adding medical burden, effectively filling the primary-care gap in allergic disease diagnosis and treatment. The successful implementation of this study provides a useful template for the early screening of similar diseases, which is expected to stimulate more big-data-based medical innovations and further promote the upgrading and transformation of China's medical system.
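
The soft-voting step described above can be sketched with scikit-learn's `VotingClassifier`: rank the five candidate classifiers by cross-validated AUC, keep the top three, and average their predicted probabilities (`voting="soft"`). The dataset and hyperparameters here are placeholders, not the study's configuration:

```python
# Sketch of top-3-by-AUC ensemble soft voting over KNN, LR, RF, DT, SVM.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder data standing in for the 16 selected blood-test features
X, y = make_classification(n_samples=400, n_features=16, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

candidates = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "lr": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "rf": RandomForestClassifier(random_state=0),
    "dt": DecisionTreeClassifier(random_state=0),
    # probability=True is required so SVC can contribute to soft voting
    "svm": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
}

# Rank by mean cross-validated AUC and keep the top three
aucs = {name: cross_val_score(m, X_tr, y_tr, cv=5, scoring="roc_auc").mean()
        for name, m in candidates.items()}
top3 = sorted(aucs, key=aucs.get, reverse=True)[:3]

# Soft voting averages the base classifiers' predicted probabilities
ensemble = VotingClassifier([(n, candidates[n]) for n in top3], voting="soft")
ensemble.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, ensemble.predict_proba(X_te)[:, 1])
print(f"top-3 by AUC: {top3}, ensemble test AUC = {auc:.3f}")
```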

Our model achieved an accuracy of 73.91% upon external validation, which is lower than the 97.7% accuracy reported by Christo et al. [45] using a genetic algorithm and extreme learning machine. This discrepancy can be primarily attributed to fundamental methodological differences. The study by Christo et al. likely utilized a feature set that included specific allergen test results or detailed symptom profiles, which are highly predictive but less universally available. In contrast, our model relies solely on routine blood test data, a more accessible and cost-effective but also more general and nonspecific data source. This strategic choice inherently trades off some predictive power for significantly greater applicability in primary care and resource-limited settings where advanced allergen testing is not feasible. Therefore, our model is not intended to replace specialized diagnostic tests but rather to serve as a widely deployable, first-line screening tool. Similarly, compared to the model by Kavya et al. [46] (accuracy: 86.39%), which incorporated clinical symptom data, our blood-based model offers the advantage of objectivity, minimizing the subjectivity associated with patient self-reported symptoms.

A notable limitation of this study is that both the model development and external validation cohorts were exclusively recruited from a single center in Hohhot. This geographical specificity may introduce bias related to the local population's genetic predispositions, predominant allergens (e.g., specific pollen types prevalent in the Inner Mongolia grassland region [6]), and regional environmental factors. Consequently, the generalizability of our screening model to populations in other geographical areas with different demographic characteristics and allergen exposures may be limited. While the internal and single-center external validation demonstrated promising performance, future validation on multi-center, geographically diverse cohorts is essential to confirm the robustness and widespread applicability of our model across various populations. In addition, the model was built using only routine blood test data and did not integrate information on potential AR risk factors, such as exposure history and symptoms, which limits its accuracy and interpretability. To better adapt to real clinical scenarios and improve decision-making accuracy, future work will aim to expand the data sources and increase the dimensionality of the model's input.

This study utilized routine blood test data from a single time point, which does not account for the dynamic physiological changes inherent to allergic conditions. For instance, eosinophil counts (%)—a key feature in our model—are known to fluctuate with allergen exposure intensity, such as seasonal pollen variations [6,43]. A model trained on static measurements may therefore be less reliable for patients whose blood parameters are in a transient state. This limitation highlights a valuable direction for future research. Subsequent studies could implement longitudinal tracking of blood parameters in AR patients across different seasons. Developing a model that incorporates temporal patterns of change or that recommends diagnosis based on tests performed during symptomatic periods could significantly enhance predictive accuracy and clinical utility for seasonal AR sufferers.

Supporting information

S1 Table. Statistical results of routine blood test data of the external validation set based on SPSS.

All features of the external validation set (minimal dataset) were statistically analyzed using SPSS. Normally distributed variables (including near-normal distributions) are presented as mean ± standard deviation, and non-normally distributed variables as quartiles. Because the ensemble voting method ultimately identified 16 input features, the statistical analysis of the external validation set covers only these 16 features.

https://doi.org/10.1371/journal.pone.0337561.s001

(PDF)

Acknowledgments

This study was supported by the Inner Mongolia Intelligent Big Data Research Institute team, whose valuable advice and support greatly contributed to the experimental guidance and manuscript revisions. We also extend our gratitude to colleagues from the Department of Allergy at Hohhot First Hospital for their assistance in data collection and analysis. Finally, we express our heartfelt thanks to our family and friends for their unwavering encouragement and understanding throughout this research endeavor.

References

  1. Höflich C, Balakirski G, Hajdu Z, Baron JM, Fietkau K, Merk HF, et al. Management of patients with seasonal allergic rhinitis: Diagnostic consideration of sensitization to non-frequent pollen allergens. Clin Transl Allergy. 2021;11(8):e12058. pmid:34631010
  2. Sakashita M, Tsutsumiuchi T, Kubo S, Tokunaga T, Takabayashi T, Imoto Y, et al. Comparison of sensitization and prevalence of Japanese cedar pollen and mite-induced perennial allergic rhinitis between 2006 and 2016 in hospital workers in Japan. Allergol Int. 2021;70(1):89–95. pmid:32800742
  3. Zhang Y, Lan F, Zhang L. Advances and highlights in allergic rhinitis. Allergy. 2021;76(11):3383–9.
  4. Leth-Møller KB, Skaaby T, Linneberg A. Allergic rhinitis and allergic sensitisation are still increasing among Danish adults. Allergy. 2020;75(3):660–8. pmid:31512253
  5. Wang XD, Zheng M, Lou HF, Wang CS, Zhang Y, Bo MY, et al. An increased prevalence of self-reported allergic rhinitis in major Chinese cities from 2005 to 2011. Allergy. 2016;71(8):1170–80. pmid:26948849
  6. Ma T, Wang X, Zhuang Y, Shi H, Ning H, Lan T, et al. Prevalence and risk factors for allergic rhinitis in adults and children living in different grassland regions of Inner Mongolia. Allergy. 2020;75(1):234–9. pmid:31169905
  7. Bousquet J, Heinzerling L, Bachert C, Papadopoulos NG, Bousquet PJ, Burney PG, et al. Practical guide to skin prick tests in allergy to aeroallergens. Allergy. 2012;67(1):18–24. pmid:22050279
  8. Shamji MH, Valenta R, Jardetzky T, Verhasselt V, Durham SR, Würtzen PA, et al. The role of allergen-specific IgE, IgG and IgA in allergic disease. Allergy. 2021;76(12):3627–41. pmid:33999439
  9. Traiyan S, Manuyakorn W, Kanchongkittiphon W, Sasisakulporn C, Jotikasthira W, Kiewngam P, et al. Skin prick test versus Phadiatop as a tool for diagnosis of allergic rhinitis in children. Am J Rhinol Allergy. 2021;35(1):98–106. pmid:32597210
  10. Zhang C, Lu Y. Study on artificial intelligence: The state of the art and future prospects. Journal of Industrial Information Integration. 2021;23:100224.
  11. Alowais SA, Alghamdi SS, Alsuhebany N, Alqahtani T, Alshaya AI, Almohareb SN, et al. Revolutionizing healthcare: the role of artificial intelligence in clinical practice. BMC Med Educ. 2023;23(1):689. pmid:37740191
  12. Ali S, Hussain A, Aich S, Park MS, Chung MP, Jeong SH, et al. A soft voting ensemble-based model for the early prediction of idiopathic pulmonary fibrosis (IPF) disease severity in lungs disease patients. Life (Basel). 2021;11(10):1092. pmid:34685461
  13. Osamor VC, Okezie AF. Enhancing the weighted voting ensemble algorithm for tuberculosis predictive diagnosis. Sci Rep. 2021;11(1):14806. pmid:34285324
  14. Zhang M-L, Peña JM, Robles V. Feature selection for multi-label naive Bayes classification. Information Sciences. 2009;179(19):3218–29.
  15. Liu H, Zhou M, Liu Q. An embedded feature selection method for imbalanced data classification. IEEE/CAA J Autom Sinica. 2019;6(3):703–15.
  16. Mistry K, Zhang L, Neoh SC, Lim CP, Fielding B. A micro-GA embedded PSO feature selection approach to intelligent facial emotion recognition. IEEE Trans Cybern. 2017;47(6):1496–509. pmid:28113688
  17. John GH, Kohavi R, Pfleger K. Irrelevant features and the subset selection problem. Machine Learning Proceedings 1994. Elsevier. 1994. 121–9. https://doi.org/10.1016/b978-1-55860-335-6.50023-4
  18. Huang J, Cai Y, Xu X. A filter approach to feature selection based on mutual information. In: 2006 5th IEEE International Conference on Cognitive Informatics, 2006. 84–9. https://doi.org/10.1109/coginf.2006.365681
  19. Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology. 1996;58(1):267–88.
  20. Muthukrishnan R, Rohini R. LASSO: A feature selection technique in predictive modeling for machine learning. In: 2016 IEEE International Conference on Advances in Computer Applications (ICACA), 2016. 18–20. https://doi.org/10.1109/icaca.2016.7887916
  21. Liu XX. Study on the remote sensing feature selection method for forest biomass estimation based on RF-RF. Shandong Agricultural University. 2017.
  22. Wei XM, Xu B, Guan JH. Prediction of protein energy hot spots based on recursive feature elimination. Journal of Shandong University (Engineering Edition). 2014;44(02):12–20.
  23. Senan EM, Al-Adhaileh MH, Alsaade FW, Aldhyani THH, Alqarni AA, Alsharif N, et al. Diagnosis of chronic kidney disease using effective classification algorithms and recursive feature elimination techniques. J Healthc Eng. 2021;2021:1004767. pmid:34211680
  24. Khairy R, Hussein A, ALRikabi H. The detection of counterfeit banknotes using ensemble learning techniques of AdaBoost and voting. IJIES. 2021;14(1):326–39.
  25. Muntean M, Militaru F-D. Metrics for evaluating classification algorithms. Smart Innovation, Systems and Technologies. Springer Nature Singapore. 2023. 307–17. https://doi.org/10.1007/978-981-19-6755-9_24
  26. Cover T, Hart P. Nearest neighbor pattern classification. IEEE Trans Inform Theory. 1967;13(1):21–7.
  27. Sinha P, Sinha P. Comparative study of chronic kidney disease prediction using KNN and SVM. IJERT. 2015;V4(12).
  28. Uddin S, Haque I, Lu H, Moni MA, Gide E. Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Sci Rep. 2022;12(1):6256. pmid:35428863
  29. Hosmer DW, Lemeshow S, Sturdivant RX. Applied logistic regression. John Wiley & Sons. 2013.
  30. Nusinovici S, Tham YC, Chak Yan MY, Wei Ting DS, Li J, Sabanayagam C, et al. Logistic regression was as good as machine learning for predicting major chronic diseases. J Clin Epidemiol. 2020;122:56–69. pmid:32169597
  31. Schober P, Vetter TR. Logistic regression in medical research. Anesth Analg. 2021;132(2):365–6. pmid:33449558
  32. Breiman L. Random forests. Machine Learning. 2001;45(1):5–32.
  33. Asadi S, Roshan S, Kattan MW. Random forest swarm optimization-based for heart diseases diagnosis. J Biomed Inform. 2021;115:103690. pmid:33540075
  34. Sarica A, Cerasa A, Quattrone A. Random forest algorithm for the classification of neuroimaging data in Alzheimer's disease: A systematic review. Front Aging Neurosci. 2017;9:329. pmid:29056906
  35. Rokach L, Maimon O. Decision trees. In: Data Mining and Knowledge Discovery Handbook. 2005. 165–92. https://doi.org/10.1002/wics.1278
  36. Ghiasi MM, Zendehboudi S. Application of decision tree-based ensemble learning in the classification of breast cancer. Comput Biol Med. 2021;128:104089. pmid:33338982
  37. Joachims T. Making large-scale SVM learning practical. 1998. http://hdl.handle.net/10419/77178
  38. Shankar K, Lakshmanaprabu SK, Gupta D, Maseleno A, de Albuquerque VHC. RETRACTED ARTICLE: Optimal feature-based multi-kernel SVM approach for thyroid disease classification. J Supercomput. 2020;76(2):1128–43.
  39. Balasubramaniam V. Artificial intelligence algorithm with SVM classification using dermascopic images for melanoma diagnosis. JAICN. 2021;3(1):34–42.
  40. Littlestone N, Warmuth MK. The weighted majority algorithm. Information and Computation. 1994;108(2):212–61.
  41. Yin J, Yue FM, Wang LL. Clinical study of the progression of allergic rhinitis to allergic asthma in hay fever patients in summer and autumn. Chinese Journal of Medicine. 2006;23:1628–32.
  42. Wang HT, Wang XY. Attention should be paid to environmental control and the strengthening of health education in the prevention and control of allergic rhinitis. The Chinese Journal of Preventive Medicine. 2023;57(03):318–26.
  43. Luo Q, Zhou S, Yuan B, Feng Z, Tan G, Liu H. Blood eosinophil count in the diagnosis of allergic-like rhinitis with chronic rhinosinusitis. Clin Otolaryngol. 2023;48(2):339–46. pmid:36222453
  44. Topal E, Celiksoy M, Demir F, Bag H, Sancak R. Is the change in platelet parameters in children with allergic rhinitis an indicator of allergic inflammation?. Med-Science. 2018;1.
  45. Christo VRE, Nehemiah HK, Nahato KB, Brighty J, Kannan A. Computer assisted medical decision-making system using genetic algorithm and extreme learning machine for diagnosing allergic rhinitis. IJBIC. 2020;16(3):148.
  46. Kavya R, Christopher J, Panda S, Lazarus YB. Machine learning and XAI approaches for allergy diagnosis. Biomedical Signal Processing and Control. 2021;69:102681.