Diagnostic performance of discriminant formulas and machine learning models for detecting β-thalassemia trait in Bangladesh

  • Rumana Mahtarin,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh

  • Kasrina Azad,

    Roles Investigation, Methodology, Project administration, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Rakib Bin Mahbub Talukder,

    Contributed equally to this work with: Rakib Bin Mahbub Talukder, Rynak Rahmat, Suzana Chowdhury Nitu

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Electronics and Communication Engineering (ECE), Khulna University of Engineering and Technology (KUET), Khulna, Bangladesh

  • Rynak Rahmat,

    Contributed equally to this work with: Rakib Bin Mahbub Talukder, Rynak Rahmat, Suzana Chowdhury Nitu

    Roles Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Suzana Chowdhury Nitu,

    Contributed equally to this work with: Rakib Bin Mahbub Talukder, Rynak Rahmat, Suzana Chowdhury Nitu

    Roles Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Arif Mahmud Howlader,

    Roles Investigation, Methodology, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Mohabbat Hossain,

    Roles Investigation, Methodology, Writing – review & editing

    Affiliations Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh, Department of Genetic Engineering and Biotechnology, University of Chittagong, Chittagong, Bangladesh

  • Mst. Sharmin Aktar Mukta,

    Roles Formal analysis, Investigation, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Mohammad Tanbir Habib,

    Roles Investigation, Software, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Abu Bakar Siddik,

    Roles Investigation, Software, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Nishat Sultana,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Zannat Kawser,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Umme Kulsum,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Faisal Zainal Abedin,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Department of Microbiology, Popular Medical College, Dhaka, Bangladesh

  • Nusrat Sultana,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliations Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh, Department of Virology, Dhaka Medical College, Dhaka, Bangladesh

  • Md. Ahashan Habib,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Directorate General of Health Services, Dhaka, Bangladesh

  • A. K. M. Ekramul Hossain,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Department of Project Development, Bangladesh Thalassemia Samity and Hospital, Dhaka, Bangladesh

  • Farjana Akther Noor,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Molecular Division, OMC Healthcare (Pvt.) Ltd, Rupnagar, Dhaka, Bangladesh

  • Ahmad Zubair Mahdi,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Department of Community Medicine and Public Health, Tairunnessa Memorial Medical College and Hospital, Gazipur, Bangladesh

  • Muhammad Asaduzzaman,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Upazila Health Complex, Matlab (North), Chandpur, Bangladesh

  • Emran Kabir Chowdhury,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh

  • Md Rofiqur Rahman,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh

  • Firdausi Qadri,

    Roles Conceptualization, Investigation, Project administration, Resources, Visualization, Writing – review & editing

    Affiliations Institute for Developing Science and Health Initiatives (ideSHi), Dhaka, Bangladesh, Mucosal Immunology and Vaccinology, Infectious Diseases Division, International Centre for Diarrhoeal Disease Research, Mohakhali, Dhaka, Bangladesh

  • Mst. Noorjahan Begum,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Virology Laboratory, Infectious Diseases Division, International Centre for Diarrhoeal Disease Research, Mohakhali, Dhaka, Bangladesh

  • A. H. M. Nurun Nabi

    Roles Conceptualization, Investigation, Supervision, Visualization, Writing – review & editing

    nabi@du.ac.bd

    Affiliation Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh


Abstract

Background

β-thalassemia poses a considerable public health burden in Bangladesh, where a high carrier frequency underlies widespread disease risk. Distinguishing β-thalassemia trait (βTT) from iron deficiency anemia (IDA) is necessary to guide genetic counseling and enable effective prevention strategies. Despite the availability of various discriminant formulas and machine learning algorithms (MLAs), their comparative diagnostic performance within the Bangladeshi population has not been comprehensively investigated. This study aimed to assess different discriminant formulas and ML models and to propose novel combinations of formulas for population-specific screening of βTT.

Methods

In this cross-sectional study, we compared 47 discriminant formulas and 12 machine learning models to distinguish β-thalassemia trait from iron-deficiency anemia in 467 individuals (143 βTT, 324 anemia) drawn from a 2,514-participant cohort. DF-6 and DF-27 were two new formulas constructed by integrating high-performing formulas. Multi-criteria decision-making (MCDM) techniques, TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) and SECA (Simultaneous Evaluation of Criteria and Alternatives), provided the final ranking for performance. Cluster analysis was performed to identify groups with similar diagnostic performance.

Results

Population-specific optimal cut-off values were determined for the discriminant formulas. The newly proposed formulas, DF-6 and DF-27, ranked among the top ten performers alongside RBC, Janel (11T), Ravanbakhsh-F1, Srivastava, Alparslan, Hisham, Index 26, and Kerman I. DF-6 achieved the best overall performance across the diagnostic metrics (AUC: 0.98, 95% CI: 0.97–0.99, p < 0.0001). Assessment of ML models revealed that XGBoost (XGB) (AUC: 0.98, 95% CI: 0.97–0.99, p < 0.0001) and Support Vector Machine (SVM) (AUC: 0.97, 95% CI: 0.95–0.99, p < 0.0001) provided the highest diagnostic accuracy. The reliability of ensemble ML models was confirmed by MCDM and cluster analyses.

Conclusions

Combining the novel discriminant formula DF-6 with the XGB and SVM models can substantially strengthen nationwide screening programs and help reduce the burden of thalassemia in Bangladesh.

1. Introduction

Thalassemia refers to a group of inherited blood disorders characterized by defective synthesis of globin chains. Among its types, β-thalassemia results from mutations that reduce or abolish the production of β-globin chains [1]. People with β-thalassemia major (β0/β0, β0/β+, and sometimes β+/β+) generally receive medical attention within the first two years of life and require regular red blood cell transfusions to survive [2]. The disorder is particularly concerning in countries within the global thalassemia belt [3]. Bangladesh lies within this belt, where β-thalassemia and hemoglobin E (HbE) are widespread [4–7]. The frequency of thalassemia carriers ranges from 10.9% to 13.3% [8]. The challenge is compounded by iron deficiency anemia (IDA), which often mimics β-thalassemia trait (βTT) in hematological profiles [9–11]. In such settings, early and accurate detection of carriers is essential to enable appropriate genetic counseling and break the chain of transmission. Gold-standard diagnostic methods such as hemoglobin electrophoresis, molecular analysis, and iron profiling provide definitive results [5,7,9]. However, their cost and logistical demands make them unsuitable for large-scale screening programs. To overcome these barriers, various diagnostic formulas derived from complete blood count (CBC) indices have been proposed worldwide [12–14]. These formulas offer a rapid and inexpensive preliminary screening approach. However, their performance varies across populations due to differences in demographics, genetic backgrounds, and study methodologies [9]. Therefore, optimization of population-specific cut-off values for these formulas is necessary. In parallel, machine learning algorithms (MLAs) have emerged as powerful tools for the analysis of biomedical data. Classical classifiers (Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, Multilayer Perceptron, Linear Discriminant Analysis, K-Nearest Neighbors, and Gaussian Naive Bayes), as well as ensemble methods (Gradient Boosting, AdaBoost, XGBoost, and CatBoost), have improved diagnostic accuracy in thalassemia by identifying complex and nonlinear patterns within biomedical datasets [12,15–21]. Moreover, multi-criteria decision-making (MCDM) techniques show a high capacity to improve risk assessment of thalassemia, whereas traditional formula rankings lack multi-criteria evaluation and may therefore fail to identify the best overall performer [12,13,22].

However, few comparative analyses have combined conventional indices with multiple machine learning algorithms in a Bangladeshi cohort. In addition, structured multi-criteria assessment systems have rarely been used to rank diagnostic methods systematically across several performance indicators at the same time. This methodological gap hinders the evidence-based choice of the best screening methods for large-scale population screening [23,24]. Therefore, this study aimed to: (1) optimize cut-off values of existing discriminant formulas for detecting βTT in the Bangladeshi population; (2) derive and evaluate two new composite indices (DF-6, DF-27); (3) compare their performance with multiple MLAs based on routine CBC indices; (4) use multi-criteria decision-making methods such as TOPSIS and SECA to offer a systematic ranking of diagnostic tools as screening methods; and (5) group diagnostic tools with similar performance profiles using agglomerative hierarchical clustering.

Consequently, the study supports the implementation of evidence-based and resource-efficient screening approaches in Bangladesh to reduce the burden of thalassemia and improve patient care outcomes.

2. Methods

2.1. Enrollment of study participants and ethical clearance

In this cross-sectional study, blood samples were collected from 2,514 individuals through thalassemia carrier screening programs conducted at various sites in Bangladesh, including universities, medical colleges, and specialized hospitals, such as Bangladesh Thalassemia Samity Hospital, Kurmitola General Hospital, from 20 April 2022 to 31 March 2023. The samples were analyzed at the laboratory of Institute for Developing Science and Health Initiatives (ideSHi). This study was ethically approved under a research protocol (# PNR-22003) by the Institutional Review Board of the Institute for Developing Science and Health Initiatives (ideSHi). Participants provided informed assent or consent, and written consent was obtained from the participants, their parents, or legal guardians.

2.2. Analysis of complete blood count of the study participants

Two milliliters of venous blood were collected from each participant in an EDTA-containing vacutainer, and the collected blood samples were transported to the laboratory, maintained at 4–8°C, and preserved at 4°C until analysis. A complete blood count (CBC) was performed for each sample by an XS-800i Hematology Analyzer (Sysmex, Japan) following the manufacturer’s instructions. Major hematological parameters, such as hemoglobin (HGB), hematocrit (HCT), red blood cell (RBC) count, and red cell indices, including the mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), and red cell distribution width (RDW), were considered for analysis.

2.3. Hemoglobin electrophoresis

Hemoglobin electrophoresis was performed on an automated capillary electrophoresis system (Sebia, France) using a capillary hemoglobin (E) kit to measure HbA, HbA2, HbF, and other abnormal Hb variants following the manufacturer’s instructions.

2.4. Screening of the population for diagnostic performance analysis

Among the 2,514 samples, hemoglobin electrophoresis determined 1,999 to be normal, 143 had beta thalassemia trait (βTT), 5 had delta-beta thalassemia trait (δ-βTT), 324 had hemoglobin E trait (HbET), 7 had HbE disease, 11 had hereditary persistence of fetal hemoglobin (HPFH), 8 had hemoglobin D (HbD), 4 had hemoglobin S (sickle cell trait), 1 had hemoglobin H (HbH), 1 had beta thalassemia major (βTM), 5 had HbE-beta thalassemia (HbE-βT), 2 were compound heterozygotes for HbE-β and HbD (HbE-βT-HbD), and 4 had rare hemoglobin variants. Of the 1,999 electrophoresis-normal samples, 324 were classified as anemic. The anemic group had hemoglobin (HGB) levels <12 g/dL for women and <13 g/dL for men, with MCV <80 fL and MCH <27 pg for both sexes. The βTT group had an MCV value <80 fL and an MCH value <27 pg, and HbA2 > 3.5% was taken to indicate βTT. The selection and classification were conducted in accordance with the WHO diagnostic guidelines [25]. Diagnostic performance analyses of formulas and ML models focused on these 467 individuals (143 βTT, 324 anemia). Fig 1 gives an overview of the sampling strategy and the final selection of the analytic dataset.

Fig 1. Study design for diagnostic performance analysis.

https://doi.org/10.1371/journal.pone.0350387.g001

2.5. Optimization of population specific cut-off values for discriminant formulas

For each discriminant formula, the optimal population-specific cut-off value was determined using receiver operating characteristic (ROC) curve analysis in the analytic dataset. The optimal cut-off value was selected by maximizing Youden’s Index (J) which is given in (1).

$$J = \text{Sensitivity} + \text{Specificity} - 1 \tag{1}$$

where sensitivity represents the true positive rate and specificity represents the true negative rate. The cut-off value corresponding to the maximum J was considered optimal [9].
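The cut-off search by Youden's Index (J = sensitivity + specificity − 1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `youden_cutoff`, the `positive_if_greater` flag, and the toy values are assumptions introduced here, and every observed value is simply tried as a candidate cut-off.

```python
def youden_cutoff(values, labels, positive_if_greater=True):
    """Scan every observed value as a cut-off and return (cutoff, J) with maximal J.

    values: index values for each individual (e.g. RBC count)
    labels: 1 = βTT, 0 = anemia
    positive_if_greater: True if values >= cut-off indicate βTT,
                         False if values <= cut-off indicate βTT
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(values)):
        if positive_if_greater:
            pred = [v >= c for v in values]
        else:
            pred = [v <= c for v in values]
        tp = sum(1 for p, y in zip(pred, labels) if p and y == 1)
        tn = sum(1 for p, y in zip(pred, labels) if not p and y == 0)
        j = tp / n_pos + tn / n_neg - 1  # sensitivity + specificity - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Toy values invented for illustration only (not study data)
rbc = [4.1, 4.3, 4.6, 4.8, 5.2, 5.5, 5.9, 6.1]
is_btt = [0, 0, 0, 0, 1, 1, 1, 1]
cutoff, j = youden_cutoff(rbc, is_btt)
print(cutoff, j)
```

For indices where lower values point to βTT, the same search would be run with `positive_if_greater=False`.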

2.6. Derivation and validation of new discriminant formulas

Along with the 45 existing CBC-derived discriminant formulas, two new formulas, DF-6 and DF-27, were introduced to improve diagnostic performance by combining high-performing established formulas (Table 1). For each constituent formula, a binary classification score was assigned as shown in (2).

Table 1. Discriminant formulas applied for the evaluation of the diagnosis of β-thalassemia trait (βTT) in the Bangladeshi population.

https://doi.org/10.1371/journal.pone.0350387.t001

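$$I_k = \begin{cases} 1, & \text{if the } k\text{-th formula classifies the individual as } \beta\text{TT at its optimal cut-off} \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

where $I_k$ denotes the binary output of the k-th discriminant formula.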

DF-6 was derived from a combination of RBC, Alparslan, Hisham, Kerman II, Ravanbakhsh-F1, and Srivastava indices. The individual scores were then summed to obtain a final composite score. DF-6 scoring was defined in (3).

$$\text{DF-6} = \sum_{k=1}^{6} I_k \tag{3}$$

DF-6 ranges from 0 to 6. A DF-6 score ≥4 classified an individual as βTT.
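The DF-6 rule can be sketched as a simple vote count. The sketch below assumes that the binary output of each constituent formula has already been computed at its own optimized cut-off (those cut-offs are in Table 1 and are not reproduced here). The names `DF6_COMPONENTS`, `composite_score`, and `classify_df6`, and the example votes, are hypothetical.

```python
# Constituent formulas of DF-6 as listed in the text
DF6_COMPONENTS = (
    "RBC", "Alparslan", "Hisham", "Kerman II", "Ravanbakhsh-F1", "Srivastava",
)


def composite_score(votes, components=DF6_COMPONENTS):
    """Sum the binary outputs I_k (1 = formula flags βTT, 0 = it does not)."""
    missing = [name for name in components if name not in votes]
    if missing:
        raise ValueError(f"missing votes for: {missing}")
    return sum(1 if votes[name] else 0 for name in components)


def classify_df6(votes, threshold=4):
    """Apply the reported rule: a DF-6 score >= 4 classifies the individual as βTT."""
    score = composite_score(votes)
    # In the binary analytic dataset the alternative class is anemia
    return score, ("βTT" if score >= threshold else "anemia")


# Hypothetical votes for one individual (invented for illustration)
example_votes = {
    "RBC": 1, "Alparslan": 1, "Hisham": 0,
    "Kerman II": 1, "Ravanbakhsh-F1": 1, "Srivastava": 0,
}
score, label = classify_df6(example_votes)
print(score, label)
```

The same pattern extends to DF-27 by passing the 27 constituent names as `components` and applying its threshold of 16.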

DF-27 was derived from RBC, Alparslan, Bordbar, Das Gupta, Ehsani, England and Fraser (E&F), Green and King (G&K), Hameed, Hisham, Jayabose, Keikhaei, Kerman I, Kerman II, Mentzer, Merdin-1, Merdin-2, Nishad (Thal), Ravanbakhsh-F1, Ravanbakhsh-F4, Roth, SCSBTT, Sehgal, Shine and Lal (S&L), Sirdah, Srivastava, TI (MDHL), and Wongprachum. DF-27 scoring was defined as sum of these indices as shown in (4).

$$\text{DF-27} = \sum_{k=1}^{27} I_k \tag{4}$$

where $I_k$ represents the binary classification outcome of each constituent formula.

DF-27 score ranges from 0 to 27. A DF-27 score ≥16 classified an individual as βTT. The optimal cut-off values were determined using ROC curve analysis by maximizing Youden’s Index. The performance of DF-6 and DF-27 was validated using 324 anemic cases and 143 βTT cases.

2.7. Assessment of diagnostic performance

The performance of the major hematological indices from the CBC, 47 discriminant formulas, and 12 machine learning models was assessed using the following 12 measures: sensitivity, specificity, false negative rate, false positive rate, positive predictive value, negative predictive value, Youden’s Index, accuracy, positive likelihood ratio (LR+), negative likelihood ratio (LR−), diagnostic odds ratio (DOR), and area under the curve (AUC) [9,56]. The equations were as follows:

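$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{5}$$
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{6}$$
$$\text{FNR} = \frac{FN}{FN + TP} \tag{7}$$
$$\text{FPR} = \frac{FP}{FP + TN} \tag{8}$$
$$\text{PPV} = \frac{TP}{TP + FP} \tag{9}$$
$$\text{NPV} = \frac{TN}{TN + FN} \tag{10}$$
$$\text{Youden's Index} = \text{Sensitivity} + \text{Specificity} - 1 \tag{11}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{12}$$
$$\text{LR}{+} = \frac{\text{Sensitivity}}{1 - \text{Specificity}} \tag{13}$$
$$\text{LR}{-} = \frac{1 - \text{Sensitivity}}{\text{Specificity}} \tag{14}$$
$$\text{DOR} = \frac{\text{LR}{+}}{\text{LR}{-}} = \frac{TP \times TN}{FP \times FN} \tag{15}$$

where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.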

To ensure statistical robustness and reduce optimistic bias, 95% confidence intervals (CI) were calculated for key diagnostic measures. For sensitivity, specificity, accuracy, FPR, FNR, PPV, and NPV, confidence intervals were derived using the Wilson score method. For AUC, Youden’s Index, LR+, LR−, and DOR, confidence intervals were derived using non-parametric bootstrap resampling.
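As a worked illustration of these measures, the sketch below computes them from a 2×2 confusion matrix and adds a Wilson score interval for a proportion. The counts are the RBC misclassification counts reported in Section 3.2 (13 of 143 βTT and 12 of 324 anemia cases misclassified). The function names and the use of the standard textbook formulas are assumptions of this sketch, not a reproduction of the authors' scripts.

```python
import math


def diagnostic_metrics(tp, fn, tn, fp):
    """Standard 2x2 diagnostic measures (assumes fp > 0 and fn > 0 for LR and DOR)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "fnr": 1 - sens,
        "fpr": 1 - spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "youden": sens + spec - 1,
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "dor": (tp * tn) / (fp * fn),
    }


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# RBC counts taken from the Results: 143 βTT (13 missed), 324 anemia (12 misclassified)
rbc = diagnostic_metrics(tp=130, fn=13, tn=312, fp=12)
sens_ci = wilson_interval(130, 143)
print(round(rbc["dor"], 2), [round(x, 3) for x in sens_ci])
```

With these counts the DOR evaluates to 260, which agrees with the value reported for RBC in Section 3.2.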

2.7.1. Data preprocessing.

We restricted ML analyses to the anemia vs βTT subset (n = 467) and trained binary classifiers (βTT vs anemia). In one set of models, features were CBC indices alone; in a second set, the 47 discriminant formulas were used as features. Major hematological parameters, such as HGB, HCT, RBC count, and red cell indices, including MCV, MCH, MCHC, and RDW-CV, and discriminant indices were preprocessed (Fig 2).

2.7.2. Machine learning model and feature selection.

The following supervised machine learning models were selected for the assessment of diagnostic performance: Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT), Gaussian Naive Bayes (GNB), Gradient Boosting Machine (GBM), XGBoost (XGB), LightGBM (LGBM), Multi-layer perceptron (MLP), Linear Discriminant Analysis (LDA), and AdaBoost (ADA). CBC parameters and the 47 discriminant indices were selected as the features to be analysed. Class imbalance was addressed using cost-sensitive learning (class_weight = “balanced”) where applicable.

2.7.3. Cross-validation and resampling strategy.

The machine learning workflow is summarized in Fig 2. Model evaluation was performed using stratified 5-fold cross-validation to preserve class distribution across folds. In each iteration, approximately 80% of the data were used for training and 20% for validation. To mitigate class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to the training data within each fold, which resulted in an approximately balanced class distribution (1:1 ratio). The validation data remained untouched to prevent information leakage. This fold-wise procedure was intended to keep the estimates of model performance unbiased and generalizable, thereby minimizing the risk of overfitting.
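The fold-wise resampling described above can be illustrated with a dependency-free sketch. It is a simplified stand-in, not the Imbalanced-learn implementation used in the study: `stratified_folds` and `smote` are hypothetical helpers, the interpolation is a bare-bones version of the SMOTE idea, and the two features are randomly generated. Only the class sizes (143 βTT, 324 anemia) come from the text.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Return k lists of indices, each with roughly the overall class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class out round-robin
    return folds


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


labels = [1] * 143 + [0] * 324  # class sizes from the study (1 = βTT)
rng = random.Random(1)
# Two invented features per individual, for illustration only
X = [[rng.gauss(5.5 if y else 4.5, 0.5), rng.gauss(65 if y else 75, 5)] for y in labels]

folds = stratified_folds(labels, k=5)
for val_idx in folds:
    val = set(val_idx)
    train_idx = [i for i in range(len(labels)) if i not in val]
    minority = [X[i] for i in train_idx if labels[i] == 1]
    n_major = sum(1 for i in train_idx if labels[i] == 0)
    # Oversample the training part only; the validation fold is never resampled
    synthetic = smote(minority, n_new=n_major - len(minority))
```

Keeping the oversampling inside the loop is the point of the design: synthetic samples are built only from training rows, so no information from the validation fold reaches the model.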

2.7.4. Feature scaling.

Feature scaling was conducted using StandardScaler within each fold. The scaler was fitted only on the training subset and subsequently applied to the corresponding validation fold, thereby preventing data leakage.
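The leakage-free scaling step can be sketched without any library: the mean and standard deviation are estimated on the training rows only and then reused for the validation rows. The study used scikit-learn's StandardScaler. The helpers `fit_scaler` and `transform` below are hypothetical equivalents, using the population standard deviation, and the feature values are invented.

```python
import statistics


def fit_scaler(train_rows):
    """Estimate per-column mean and population standard deviation on training data."""
    cols = list(zip(*train_rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return means, stds


def transform(rows, scaler):
    """Standardize rows with statistics that were fitted elsewhere."""
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Invented (RBC, MCV) rows for one fold, for illustration only
train = [[4.2, 78.0], [4.8, 74.0], [5.6, 63.0], [6.0, 61.0]]
valid = [[5.1, 70.0]]

scaler = fit_scaler(train)               # fitted on the training subset only
train_scaled = transform(train, scaler)
valid_scaled = transform(valid, scaler)  # validation rows reuse the training statistics
```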

2.7.5. Model training, threshold optimization, and performance aggregation.

Within each cross-validation fold, models were trained on SMOTE-balanced training data. For probabilistic classifiers, predicted probabilities were obtained for the validation fold. Optimal classification thresholds were determined using Youden’s Index derived from ROC analysis. Binary predictions were then generated using these optimal thresholds. After completion of all folds, predictions from the validation subsets were pooled to construct a global confusion matrix for each model. Performance metrics and their corresponding confidence intervals were calculated from these pooled predictions, providing stable and unbiased overall estimates. Mean AUC values were reported with bootstrap-derived 95% confidence intervals as illustrated in Fig 2.
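A percentile bootstrap for the AUC of pooled out-of-fold predictions might look like the sketch below. It assumes the pooled variant (one AUC over all validation predictions) rather than a mean of fold-wise AUCs. The helper names, the number of resamples, and the ten example scores are assumptions of this sketch.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as 0.5 (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the classes
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Invented pooled out-of-fold probabilities and true labels (1 = βTT)
pooled_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90, 0.95]
pooled_labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

point = auc(pooled_scores, pooled_labels)
ci = bootstrap_auc_ci(pooled_scores, pooled_labels)
print(round(point, 2), ci)
```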

2.8. Ranking by multi-criteria decision-making (MCDM) methods

The CBC indices, discriminant formulas, and ML models were ranked using two MCDM methods, TOPSIS and SECA. TOPSIS ranked the alternatives by their closeness to the ideal solution and their distance from the anti-ideal solution. The ideal solution was taken as the maximum of each criterion, and the anti-ideal solution as the minimum. The TOPSIS score was calculated for the ranking of the models; the higher the score, the better the result. SECA analysed multiple criteria and alternatives simultaneously by maximizing the overall performance [57–59].
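The TOPSIS ranking step can be sketched as follows. The text does not state the normalization or weighting scheme, so vector normalization and equal weights are assumptions of this sketch. The function name `topsis`, the `benefit` flags, and the toy decision matrix are also illustrative.

```python
import math


def topsis(matrix, weights=None, benefit=None):
    """Return one closeness score per alternative (row); higher is better.

    matrix:  rows = alternatives, columns = criteria
    weights: criterion weights (equal weights if omitted)
    benefit: True where larger is better, False where smaller is better
    """
    n_crit = len(matrix[0])
    weights = weights or [1.0 / n_crit] * n_crit
    benefit = benefit or [True] * n_crit
    # Vector-normalize each column, then apply the weights
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    ideal = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v) for j in range(n_crit)]
    anti = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v) for j in range(n_crit)]
    scores = []
    for r in v:
        d_pos = math.dist(r, ideal)  # distance to the ideal solution
        d_neg = math.dist(r, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg) if d_pos + d_neg else 0.0)
    return scores


# Toy decision matrix (sensitivity, specificity) for three hypothetical tools
toy = [[0.95, 0.96], [0.90, 0.92], [0.80, 0.85]]
scores = topsis(toy)
ranking = sorted(range(len(toy)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

Cost-type criteria such as FNR or FPR would be flagged with `benefit=False` so that smaller values count as closer to the ideal.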

2.9. Hierarchical clustering

The Ward linkage method with Euclidean distance was used for hierarchical clustering. This method minimizes the total within-cluster variance and creates groups of formulas that are as internally homogeneous as possible in terms of their performance profiles.
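For reference, Ward's criterion can be written out explicitly. In the standard formulation (given here as background, with notation introduced for this note), merging two clusters $A$ and $B$ with centroids $\mu_A$ and $\mu_B$ increases the total within-cluster sum of squares by the amount below, and at each step the pair with the smallest increase is merged:

```latex
\Delta(A, B) = \frac{|A|\,|B|}{|A| + |B|} \,\lVert \mu_A - \mu_B \rVert^{2}
```

Here $|A|$ and $|B|$ are the numbers of formulas (or models) in each cluster, and the norm is the Euclidean distance between their mean performance profiles.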

2.10. Statistical analysis

The statistical analyses were performed using SPSS and GraphPad Prism software (version 9.0), R, and Python programming. Non-parametric and parametric tests, including the Kruskal-Wallis test, Welch’s t-test, and Mann-Whitney U test were performed. For all tests, a two-tailed P value <0.05 was considered statistically significant. Machine learning model analyses were conducted in Python using the Scikit-learn, Imbalanced-learn, XGBoost, LightGBM, and Statsmodels libraries. Bootstrap confidence interval estimation and Wilson score interval calculations were performed using SciPy and Statsmodels modules.

3. Results

3.1. Descriptive statistics

In this study, major hematological parameters, such as HGB, HCT, RBC count, and red cell indices, including MCV, MCH, MCHC, and RDW-CV, were analysed for 2,514 individuals. The variation in hematological parameters among the 13 groups is shown in Fig 3. The Kruskal-Wallis test showed that RBC, HGB, MCV, MCH, MCHC, and RDW-CV differed significantly (p < 0.0001) among the 13 groups. RBC and MCHC were the highest in HbH and the lowest in βTM. HGB and MCV were the highest in the normal group, while the lowest values were observed in βTM and HbH, respectively. MCH was the highest in the HPFH group and the lowest in the HbH group. RDW-CV was the highest in HbE-βT-HbD and the lowest in the normal group. The comparison of the parameters between the βTT and anemic groups is illustrated in Fig 4: RBC, HGB, HCT, and RDW-CV were lower in the anemic group than in the βTT group, while MCV, MCH, and MCHC were higher in the anemic group than in the βTT group. Welch’s t-test showed that these differences were statistically significant (p < 0.0001). In hemoglobin electrophoresis, HbA and HbA2 differed significantly (p < 0.0001) between the anemic and βTT groups. HbA2 was significantly lower in the anemic group than in the βTT group (Fig 5). Fig 6 shows the age- and gender-wise distribution of individuals. Females accounted for the majority of anemia cases, with a significantly higher prevalence of anemia among females. Males showed a comparatively higher proportion of βTT than females, as displayed by the stronger purple flows toward males.

Fig 3. Distribution pattern of major hematological parameters in different groups.

https://doi.org/10.1371/journal.pone.0350387.g003

Fig 4. Distribution pattern of major hematological parameters in anemic and βTT groups.

https://doi.org/10.1371/journal.pone.0350387.g004

Fig 5. Hemoglobin electrophoresis results for anemia and βTT cases.

(A) Comparison of HbA (%) in anemia and βTT cases and (B) Comparison of HbA2 (%) in anemia and βTT cases.

https://doi.org/10.1371/journal.pone.0350387.g005

Fig 6. The alluvial plot represented the distribution of anemia and β-thalassemia trait (βTT) groups.

The color codes indicated different groups: anemia (red), βTT (purple).

https://doi.org/10.1371/journal.pone.0350387.g006

3.2. Assessment of diagnostic performance of CBC parameters, discriminant formulas, and machine learning models

The performance of CBC parameters, discriminant formulas, and machine learning models was evaluated for the detection of 143 βTT cases, compared with 324 anemic individuals. The cut-off values for most formulas were optimized for the Bangladeshi population to improve the differentiation between βTT and anemia (Table 1).

Analysis of individual CBC parameters (Table 2 and S1 Table) showed that RBC had the best performance (AUC = 0.98; 95% CI: 0.96–0.99; p < 0.0001, DOR = 260; 95% CI: 131.27–702.91), clearly outperforming MCV and MCH.

Table 2. Sensitivity, specificity, and Youden’s Index, ROC-AUC, accuracy, and DOR of CBC parameters and ML models fitted to CBC parameters with their 95% confidence interval.

https://doi.org/10.1371/journal.pone.0350387.t002

Machine learning models trained on the 7 CBC parameters demonstrated strong diagnostic performance for βTT detection (Table 2). XGB achieved the highest accuracy and specificity, with the best PPV, LR+, and diagnostic odds ratio (505.57, 95% CI: 231.39–2437.78), demonstrating its strong ability to confirm βTT cases. In our dataset of 143 βTT and 324 anemia cases, RBC misclassified 13 βTT and 12 anemia cases, while XGB misclassified 6 βTT and 14 anemia cases, indicating only modest gains of XGB over RBC for βTT classification.

Table 3 and S2 Table summarize the diagnostic metrics of discriminant formulas and ML models fitted to discriminant formulas. Two new formulas, DF-6 and DF-27, demonstrated reliable diagnostic performance.

Table 3. Sensitivity, specificity, and Youden’s Index, ROC-AUC, accuracy, and DOR of discriminant formulas and ML models fitted to discriminant formulas with their 95% confidence interval.

https://doi.org/10.1371/journal.pone.0350387.t003

Among the existing formulas, Kerman I displayed the highest sensitivity and NPV and the lowest FNR. RBC achieved the highest specificity and the lowest FPR. DF-6 exhibited the highest Youden’s Index at its optimal cut-off of ≥4. Overall, DF-6 exhibited the best diagnostic performance in terms of accuracy and DOR (329.68, 95% CI: 161.21–1025.13) at this cut-off. Notably, the highest ROC-AUC value was achieved by both DF-6 (0.98, 95% CI: 0.97–0.99, p < 0.001) and RBC, confirming their strong diagnostic performance (Fig 7A).

Fig 7. Graphical overview of the performance of the top ten discriminant formulas.

(A) ROC curves for each discriminant formula, (B) Comparison of sensitivity, specificity, Youden’s Index, accuracy, TOPSIS score, and SECA score across the top ten discriminant formulas.

https://doi.org/10.1371/journal.pone.0350387.g007

ML models trained on the 47 discriminant formulas consistently exhibited high performance for the detection of βTT cases. LGBM attained the highest ROC-AUC and demonstrated high sensitivity and specificity. XGB exhibited the highest sensitivity and NPV. SVM demonstrated superior discriminative performance with the highest accuracy and DOR (419.52, 95% CI: 206.56–2180.43), achieving a sensitivity above 95% and a specificity above 93%.

In the dataset, DF-6 misclassified 9 βTT and 14 anemia cases, while SVM misclassified 5 βTT and 20 anemia cases, indicating that SVM offered only a modest gain in sensitivity over DF-6 for βTT classification.

3.3. Ranking by multi-criteria decision making (MCDM) analysis

3.3.1. MCDM analysis on CBC parameters and discriminant formulas.

In the MCDM analysis of CBC parameters and discriminant indices, DF-6 ranked first under both the TOPSIS and SECA methods, obtaining the highest scores and highlighting its strong diagnostic performance. RBC ranked second, with high TOPSIS and SECA scores. Janel (11T), Ravanbakhsh-F1, Srivastava, Alparslan, DF-27, Hisham, Index 26, and Kerman I also ranked within the top ten, indicating their reliable performance (Fig 7B and Table 4).

Table 4. MCDM ranking of CBC parameters and discriminant formulas.

https://doi.org/10.1371/journal.pone.0350387.t004

3.3.2. MCDM analysis on ML models.

The performance of ML models trained on CBC parameters was evaluated by MCDM analysis. According to the TOPSIS and SECA scores, XGB consistently attained the highest overall performance, followed by SVM, LGBM, and GBM. For ML models trained on the 47 discriminant formulas, SVM emerged as the top-performing model for βTT detection, followed by LR, XGB, and GBM (Table 5).

Table 5. MCDM ranking of machine learning models for CBC parameters and discriminant formulas.

https://doi.org/10.1371/journal.pone.0350387.t005

3.4. Hierarchical clustering

The hierarchical clustering dendrogram was constructed using Ward’s linkage method with Euclidean distance. To determine the number of clusters, the dendrogram was examined at a Ward’s linkage distance threshold of approximately 5.0. At this threshold, CBC parameters and discriminant indices were classified into two main groups according to their diagnostic performance.

Group 1 included strong to moderate performing indices such as DF-6, RBC, Ravanbakhsh-F1, Srivastava, Janel (11T), Alparslan, Hisham, Index 26, DF-27, and Kerman I, which formed tighter subclusters at lower linkage distances (<2.5). Conversely, group 2 comprised weak-performing indices such as Cruise, Huber–Herklotz, Pornprasert, Sargolzaie, and TI (MCHD), which clustered at higher linkage distances (>5.0). Their performance profiles differed substantially from those of the more reliable indices (Fig 8), which was also reflected by lower MCDM rankings. There were two major clusters in the dendrogram of ML models for CBC parameters at a linkage distance threshold of approximately 3.0. Cluster 1 included DT, GNB, MLP, KNN, LR, and LDA, which demonstrated moderate diagnostic similarity. Cluster 2 included ADA, RF, GBM, LGBM, and SVM, which clustered at a lower linkage distance and showed high similarity in diagnostic metrics. XGB formed a separate sub-cluster between the two clusters due to its slightly different performance metrics, reflecting the highest overall performance.

Fig 8. Hierarchical clustering dendrogram for CBC parameters and discriminant formulas, including homogeneous groups with similar diagnostic performance.

https://doi.org/10.1371/journal.pone.0350387.g008

For the 47 formulas, clustering analysis of ML models likewise displayed two distinct clusters. The first cluster included LR, RF, LDA, KNN, SVM, and MLP, which showed strong similarity in their diagnostic metrics. The second cluster included GBM, LGBM, ADA, XGB, GNB, and DT. Among these, the ensemble boosting methods (GBM, LGBM, ADA, and XGB) formed a closely linked subcluster, reflecting their consistent diagnostic outcomes. In contrast, DT formed a separate branch at the highest linkage distance, indicating comparatively weaker performance (Figs 9 and 10).

Fig 9. Hierarchical clustering dendrogram of ML models for CBC parameters, including homogeneous groups with similar diagnostic performance.

https://doi.org/10.1371/journal.pone.0350387.g009

Fig 10. Hierarchical clustering dendrogram of ML models for discriminant formulas, including homogeneous groups with similar diagnostic performance.

https://doi.org/10.1371/journal.pone.0350387.g010

4. Discussion

The study illustrates a pathway toward more accurate, population-tailored diagnostic approaches for large-scale screening of βTT and anemia in Bangladesh. The novelty of this work lies in the introduction of DF-6 and DF-27 as optimized formulas and the integration of 12 ML models to capture nonlinear relationships within hematological data. The newly developed composite index DF-6 and ML models such as XGB and SVM can distinguish βTT from anemia in a Bangladeshi cohort with AUCs around 0.97–0.98, outperforming most established indices. Among individual hematological parameters, RBC was the most reliable CBC-based predictor of βTT, which is consistent with previous studies [18]. However, DF-6 and the ML models (XGB and SVM) offered the best balance between sensitivity and specificity. Genetic defects in β-globin synthesis trigger ineffective erythropoiesis in βTT, while anemia is associated with nutritional deficiencies. DF-6 integrates a combination of formulas with major CBC parameters that reflect the significant physiological differences between βTT and anemia. RBC, HGB, HCT, and RDW-CV were lower, while MCV, MCH, and MCHC were higher, in the anemic group than in the βTT group. A high RBC count reflects ineffective erythropoiesis, while reduced MCV and MCH indicate microcytosis and hypochromia in βTT. These biologically relevant parameters therefore reflect the distinct pathophysiological patterns of βTT and anemia, which explains the superior discriminatory performance of DF-6. Several previously proposed indices, such as the S&L, SCSBTT, Roth, Index 26, and Cruise indices, demonstrated strong performance in different populations [12–14,17]. However, in this study, the combination of formulas provided better diagnostic performance than individual formulas. The top-performing formulas (DF-6, RBC, Janel (11T), Ravanbakhsh-F1, Srivastava, Alparslan, DF-27, Hisham, Index 26, and Kerman I) demonstrated better and more balanced outcomes compared to some other studies [12,13,17].
Kandhro et al. previously reported that the RBC, RDW, S&L, EF, Srivastava, Sirdah, G&K, RDWI, Huber–Herklotz, Kerman I, Kandhro-1, and Kandhro-2 indices achieved 100% sensitivity and specificity [53]. However, no other study could reproduce these results [4,10–12,60–63]. In contrast, the present study observed diminished performance for Kandhro-1, Kandhro-2, and Ricerca, while indices such as Cruise, Huber–Herklotz, Pornprasert, and Sargolzaie showed limited diagnostic utility in the Bangladeshi population. These findings highlight the influence of population-specific hematological profiles on the discriminatory power of widely used indices. Genetic heterogeneity adds complexity, as different communities inherit different β-thalassemia mutations, each with variable effects on hematological parameters. Some mutations lead to easily identifiable changes in red cell indices, while others result in milder or atypical patterns, limiting the consistency of some discriminant formulas [4,10,11,60–63]. Additionally, coexisting conditions, including iron deficiency, α-thalassemia, E/β-thalassemia, hemoglobin S/β-thalassemia, and deficiencies in vitamin B12 or folate, can further alter typical hematological parameters, reducing the effectiveness of conventional screening tools. Smaller, homogeneous study cohorts may report higher efficacy of standard indices, but larger and more diverse datasets tend to reveal their limitations [12,44,53]. These are common challenges reported in the populations of many countries, including Pakistan [53,64], India [65], Iran [66], Turkey [67], and Saudi Arabia [68]. In South Asia, extreme heterogeneity has been reported throughout different parts of India [13]. Pakistan also shows significant genetic diversity due to migrations, invasions, and commercial interactions [53]. In recent years, however, ML models have enhanced the detection of βTT by capturing complex, nonlinear relationships within data.
The strong diagnostic performance of XGB and SVM is consistent with previous reports [7,20,69]. In some recent studies [19–21], boosting algorithms (ADA, GBM, and XGB) also performed well in the detection of thalassemia. Therefore, integrating ML models with population-tailored discriminant formulas may play a significant role in improving diagnostic accuracy for inherited and nutritional anemias in large populations [12,44].

In Bangladesh, individuals with DF-6 ≥ 4 could be prioritized for confirmatory hemoglobin electrophoresis in premarital or antenatal screening programs, whereas those with DF-6 < 4 could be prioritized for iron profile monitoring. Similarly, ML models such as XGB and SVM could be integrated into laboratory software systems to enable rapid and cost-effective screening in resource-limited settings. These approaches hold significant potential to alleviate the healthcare burden posed by hemoglobinopathies, particularly in low- and middle-income countries where access to comprehensive screening is limited [12,17].
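The proposed triage rule could be encoded in laboratory software along the following lines. This is a minimal sketch that assumes the DF-6 score has already been computed upstream (the formula itself is not reproduced in this section); the function name, constant name, and returned labels are hypothetical.

```python
DF6_CUTOFF = 4.0  # cut-off stated above for the Bangladeshi cohort

def triage_by_df6(df6_score, cutoff=DF6_CUTOFF):
    """Suggest the next investigation from a precomputed DF-6 score.

    Scores at or above the cut-off are routed to confirmatory hemoglobin
    electrophoresis; scores below it are routed to iron profile monitoring.
    """
    if df6_score >= cutoff:
        return "hemoglobin electrophoresis"
    return "iron profile monitoring"

# Hypothetical scores, for illustration only.
for score in (5.2, 4.0, 3.1):
    print(score, "->", triage_by_df6(score))
```

Keeping the cut-off as a parameter reflects the point made above that cut-off values are population-specific and would need re-optimization before use in another cohort.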

4.1. Limitations

Certain limitations of the study should be acknowledged. The study samples were collected from university and tertiary care settings, which may not represent rural primary-care populations. In addition, CBC and hemoglobin electrophoresis were used as the primary confirmatory tests. Incorporating iron profile and molecular confirmation would further improve the diagnostic performance of discriminant formulas and ML models by reducing the risk of misclassification. Furthermore, the cut-off values for discriminant formulas were optimized specifically for the Bangladeshi population. Therefore, the applicability of the formulas and ML models to other South Asian populations or global cohorts requires external validation. Despite these advancements, some intrinsic challenges remain, and addressing them is vital for the evolution and clinical implementation of these technologies. To accelerate broader acceptance, future research should emphasize improving the interpretability of ML models using SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME). Additionally, user-friendly interfaces should be developed to enable their integration into routine clinical practice [17].

5. Conclusion

The study demonstrated that the population-specific discriminant formula DF-6 and the advanced machine learning models XGB and SVM accurately distinguish β-thalassemia trait from other anemias within the Bangladeshi population. The results emphasized the need for population-specific optimization of cut-off values for the formulas. High-performing CBC-based discriminant formulas and powerful ML models enhanced the diagnostic accuracy of βTT detection by addressing genetic diversity and diagnostic challenges. Notably, the proposed XGB and SVM models require minimal computational resources and can run on standard laboratory computers without specialized hardware. Therefore, this approach is feasible to integrate into routine laboratory diagnostics in Bangladesh. These cost-effective, scalable approaches can significantly improve early detection and management of β-thalassemia carriers in resource-limited settings, ultimately optimizing healthcare resources in low- and middle-income countries.

Supporting information

S1 Table. True positive (TP), true negative (TN), false positive (FP), false negative (FN), false positive rate (FPR), false negative rate (FNR), positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+), and negative likelihood ratio (LR-) of CBC parameters and ML models fitted to CBC parameters, with their 95% confidence intervals.

https://doi.org/10.1371/journal.pone.0350387.s001

(DOCX)

S2 Table. True positive (TP), true negative (TN), false positive (FP), false negative (FN), false positive rate (FPR), false negative rate (FNR), positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+), and negative likelihood ratio (LR-) of discriminant formulas and ML models fitted to discriminant formulas, with their 95% confidence intervals.

https://doi.org/10.1371/journal.pone.0350387.s002

(DOCX)
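For readers who wish to recompute the measures listed in S1 and S2 Tables from the four confusion-matrix counts, the standard definitions can be sketched as follows. The function name and the example counts are hypothetical and are not taken from the study; the 95% confidence intervals are not computed here.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Compute standard diagnostic accuracy measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    fpr = fp / (fp + tn)  # false positive rate = 1 - specificity
    fnr = fn / (fn + tp)  # false negative rate = 1 - sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "FPR": fpr,
        "FNR": fnr,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        # Likelihood ratios are unbounded when the denominator is zero.
        "LR+": sensitivity / fpr if fpr else float("inf"),
        "LR-": fnr / specificity if specificity else float("inf"),
    }

# Hypothetical counts, for illustration only.
example_metrics = diagnostic_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in example_metrics.items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on the prevalence of βTT in the sample, values computed this way apply only to a population with a similar case mix.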

Acknowledgments

We are grateful to the collaborators for their contribution and support during thalassemia screening. We are thankful to Sompritee Woopoma Chowdhury (Molecular Genetics Specialist, University of Toronto), Safa Faria, and thalassemia screening team members of ideSHi for their technical support.

References

  1. Fibach E, Rachmilewitz EA. Pathophysiology and treatment of patients with beta-thalassemia - an update. F1000Res. 2017;6:2156. pmid:29333256
  2. Taher AT, Saliba ANJH. Iron overload in thalassemia: different organs at different rates. the American Society of Hematology Education Program Book. 2017. pp. 265–71.
  3. Islam MT, Sarkar SK, Sultana N, Begum MN, Bhuyan GS, Talukder S, et al. High resolution melting curve analysis targeting the HBB gene mutational hot-spot offers a reliable screening approach for all common as well as most of the rare beta-globin gene mutations in Bangladesh. BMC Genet. 2018;19(1):1. pmid:29295702
  4. Hossain MS, Raheem E, Sultana TA, Ferdous S, Nahar N, Islam S, et al. Thalassemias in South Asia: clinical lessons learnt from Bangladesh. Orphanet J Rare Dis. 2017;12(1):93. pmid:28521805
  5. Noor FA, Sultana N, Bhuyan GS, Islam MT, Hossain M, Sarker SK, et al. Nationwide carrier detection and molecular characterization of β-thalassemia and hemoglobin E variants in Bangladeshi population. Orphanet J Rare Dis. 2020;15(1):15. pmid:31941534
  6. Khan W, Banu B, Amin S, Selimuzzaman M, Rahman M, Hossain B. Prevalence of beta thalassemia trait and Hb E trait in Bangladeshi school children and health burden of thalassemia in our population. Bangladesh Med Res Counc Bull. 2005;21(1):1–7.
  7. Chowdhury MA, Sultana R, Das DJH. Thalassemia in Asia 2021 overview of thalassemia and hemoglobinopathies in Bangladesh. Bangladesh J Hematol. 2022;46(1):7–9.
  8. Hossain MS, Islam F, Akhter S, Al Mossabbir A. Thalassemia in Bangladesh: progress, challenges, and a strategic blueprint for prevention. Orphanet J Rare Dis. 2025;20(1):358. pmid:40640823
  9. Jahangiri M, Rahim F, Malehi AS. Diagnostic performance of hematological discrimination indices to discriminate between βeta thalassemia trait and iron deficiency anemia and using cluster analysis: Introducing two new indices tested in Iranian population. Sci Rep. 2019;9(1):18610. pmid:31819078
  10. Kundu S, Alam SS, Mia MA-T, Hossan T, Hider P, Khalil MI, et al. Prevalence of anemia among children and adolescents of Bangladesh: a systematic review and meta-analysis. Int J Environ Res Public Health. 2023;20(3):1786. pmid:36767153
  11. Vehapoglu A, Ozgurhan G, Demir AD, Uzuner S, Nursoy MA, Turkmen S. Hematological indices for differential diagnosis of beta thalassemia trait and iron deficiency anemia. 2014;2014(1):576738.
  12. Das R, Saleh S, Nielsen I, Kaviraj A, Sharma P, Dey K, et al. Performance analysis of machine learning algorithms and screening formulae for β-thalassemia trait screening of Indian antenatal women. Int J Med Inform. 2022;167:104866. pmid:36174416
  13. Jain AK, Sharma P, Saleh S, Dolai TK, Saha SC, Bagga R. Multi-criteria decision making to validate performance of RBC-based formulae to screen β-thalassemia trait in heterogeneous haemoglobinopathies. BMC Med Inf Decision Mak. 2024;24(1):5. pmid:38167309
  14. Shuang X, Zhenming W, Zhu M, Si S, Zuo L. New logarithm-based discrimination formula for differentiating thalassemia trait from iron deficiency anemia in pregnancy. BMC Pregnancy Childbirth. 2023;23(1):100. pmid:36755221
  15. Ferih K, Elsayed B, Elshoeibi AM, Elsabagh AA, Elhadary M, Soliman A, et al. Applications of artificial intelligence in thalassemia: a comprehensive review. Diagnostics (Basel). 2023;13(9):1551. pmid:37174943
  16. Amendolia SR, Cossu G, Ganadu ML, Golosio B, Masala GL, Mura GM. A comparative study of K-nearest neighbour, support vector machine and multi-layer perceptron for thalassemia screening. Chemometr Intell Lab Syst. 2003;69(1–2):13–20.
  17. Abdulkarim D, Abdulazeez AM. Machine learning-based prediction of thalassemia: a review. IJCS. 2024;13(3).
  18. Tepakhan W, Srisintorn W, Penglong T, Saelue P. Machine learning approach for differentiating iron deficiency anemia and thalassemia using random forest and gradient boosting algorithms. Sci Rep. 2025;15(1):16917. pmid:40374805
  19. Saleem M, Aslam W, Lali MIU, Rauf HT, Nasr EA. Predicting thalassemia using feature selection techniques: a comparative analysis. Diagnostics (Basel). 2023;13(22):3441. pmid:37998577
  20. Nasir MU, Naseem MT, Ghazal TM, Zubair M, Ali O, Abbas S, et al. A comprehensive case study of deep learning on the detection of alpha thalassemia and beta thalassemia using public and private datasets. Sci Rep. 2025;15(1):13359. pmid:40246871
  21. Devanath A, Akter S, Karmaker P, Sattar A. Thalassemia prediction using machine learning approaches. 2022 6th International Conference on Computing Methodologies and Communication (ICCMC). IEEE; 2022.
  22. Johara Chowdhury S, Mahmud T, Tasnim F, Sharmin S, Nawal S, Papri UH, et al. A hybrid MCDM and machine learning framework for thalassemia risk assessment in pregnant women. Diagnostics (Basel). 2025;15(22):2833. pmid:41300858
  23. Hoffmann JJML, Urrechaga E, Aguirre U. Discriminant indices for distinguishing thalassemia and iron deficiency in patients with microcytic anemia: a meta-analysis. Clin Chem Lab Med. 2015;53(12):1883–94. pmid:26536581
  24. Heriot GS, Cheng AC, Tong SYC, Liew D. Clinical predictors and prediction rules to estimate initial patient risk for infective endocarditis in Staphylococcus aureus bacteraemia: attention must be paid to the reference standard. Clin Microbiol Infect. 2018;24(3):314–6. pmid:29030169
  25. Beutler E, Waalen J. The definition of anemia: what is the lower limit of normal of the blood hemoglobin concentration? Blood. 2006;107(5):1747–50. pmid:16189263
  26. England JM, Fraser PM. Differentiation of iron deficiency from thalassaemia trait by routine blood-count. Lancet. 1973;1(7801):449–52. pmid:4120365
  27. Mentzer WC. Differentiation of iron deficiency from thalassæmia trait. The Lancet. 1973;301(7808):882.
  28. Srivastava PC, Bevington JM. Iron deficiency and-or thalassaemia trait. Lancet. 1973;1(7807):832. pmid:4121255
  29. Shine I, Lal S. A strategy to detect β-thalassæmia minor. The Lancet. 1977;309(8013):692–4.
  30. Bessman JD, Feinstein DI. Quantitative anisocytosis as a discriminant between iron deficiency and thalassemia minor. 1979.
  31. Ricerca BM, Storti S, d’Onofrio G, Mancini S, Vittori M, Campisi S, et al. Differentiation of iron deficiency from thalassaemia trait: a new approach. Haematologica. 1987;72(5):409–13. pmid:3121463
  32. Green R, King R. A new red cell discriminant incorporating volume dispersion for differentiating iron deficiency anemia from thalassemia minor. Blood Cells. 1989;15(3):481–91; discussion 492-5. pmid:2620095
  33. Jayabose S, Giamelli J, LevondogluTugal O, Sandoval C, Ozkaynak F, Visintainer P. #262 Differentiating iron deficiency anemia from thalassemia minor by using an RDW-based index. J Pediatric Hematol/Oncol. 1999;21(4):314.
  34. Sirdah M, Tarazi I, Al Najjar E, Al Haddad R. Evaluation of the diagnostic reliability of different RBC indices and formulas in the differentiation of the beta-thalassaemia minor from iron deficiency in Palestinian population. Int J Lab Hematol. 2008;30(4):324–30. pmid:18445163
  35. Ehsani MA, Shahgholi E, Rahiminejad MS, Seighali F, Rashidi A. A new index for discrimination between iron deficiency anemia and beta-thalassemia minor: results in 284 patients. Pak J Biol Sci. 2009;12(5):473–5. pmid:19579993
  36. Sirachainan N, Iamsirirak P, Charoenkwan P, Kadegasem P, Wongwerawattanakoon P, Sasanakul W, et al. New mathematical formula for differentiating thalassemia trait and iron deficiency anemia in thalassemia prevalent area: a study in healthy school-age children. Southeast Asian J Trop Med Public Health. 2014;45(1):174–82. pmid:24964667
  37. Gupta A, Hegde C, Mistri R. Red cell distribution width as a measure of severity of iron deficiency in iron deficiency anaemia. 1994.
  38. Telmissani OA, Khalil S, Roberts GT. Mean density of hemoglobin per liter of blood: a new hematologic parameter with an inherent discriminant function. JLH. 1999;5:149–52.
  39. Huber A, Ottiger C, Risch L, Regenass S, Hergersberg M, Herklotz R. Thalassämie-Syndrome: Klinik und Diagnose Syndromes thalassémiques: clinique et diagnostic. Schweiz Mediz Forum; 2004.
  40. Cohan N, Ramzi MJ. Evaluation of sensitivity and specificity of Kerman index I and II in screening beta thalassemia minor. J JoIBT. 2008;4(4):297–302.
  41. Keikhaei BJ. A new valid formula in differentiating iron deficiency anemia from ß-thalassemia trait. Pak J Med Sci. 2010;26:368–73.
  42. Nishad AAN, Pathmeswaran A, Wickramasinghe AR, Premawardhena A. The Thal-Index with the BTT prediction.exe to discriminate β-thalassaemia traits from other microcytic anaemias. Thalassemia Rep. 2012;2(1):e1.
  43. Wongprachum K, Sanchaisuriya K, Sanchaisuriya P, Siridamrongvattana S, Manpeun S, Schlep FP. Proxy indicators for identifying iron deficiency among anemic vegetarians in an area prevalent for thalassemia and hemoglobinopathies. Acta Haematol. 2012;127(4):250–5. pmid:22572177
  44. Sehgal K, Mansukhani P, Dadu T, Irani M, Khodaiji S. Sehgal index: a new index and its comparison with other complete blood count-based indices for screening of beta thalassemia trait in a tertiary care hospital. Indian J Pathol Microbiol. 2015;58(3):310–5. pmid:26275252
  45. Sargolzaie N, Miri-Moghaddam E. A local equation for differential diagnosis of β-thalassemia trait and iron deficiency anemia by logistic regression analysis in Southeast Iran. Hemoglobin. 2014;38(5):355–8. pmid:25155260
  46. Pornprasert S, Panya A, Punyamung M, Yanola J, Kongpan C. Red cell indices and formulas used in differentiation of β-thalassemia trait from iron deficiency in Thai school children. Hemoglobin. 2014;38(4):258–61. pmid:24985744
  47. Plengsuree S, Punyamung M, Yanola J, Nanta S, Jaiping K, Maneewong K, et al. Red cell indices and formulas used in differentiation of β-thalassemia trait from iron deficiency in Thai adults. Hemoglobin. 2015;39(4):235–9. pmid:26076394
  48. Bordbar E, Taghipour M, Zucconi BE. Reliability of different RBC indices and formulas in discriminating between β-thalassemia minor and other microcytic hypochromic cases. Mediterr J Hematol Infect Dis. 2015;7(1):e2015022. pmid:25745549
  49. Getta HA, Yassen H, Said HM. Hi & Ha, are new indices in differentiation between iron deficiency anemia and beta-thalassaemia trait. J Med Health Sci. 2015;14:67–72.
  50. Matos JF, Dusse LMS, Borges KBG, de Castro RLV, Coura-Vital W, Carvalho MDG. A new index to discriminate between iron deficiency anemia and thalassemia trait. Rev Bras Hematol Hemoter. 2016;38(3):214–9. pmid:27521859
  51. Ravanbakhsh M, Mousavi SA, Zare S. Diagnostic reliability check of red cell indices in differentiating iron deficiency anemia (IDA) from beta thalassemia minor (BTT). 2016;20(3).
  52. Zaghloul A, Al-Bukhari TAMA, Bajuaifer N, Shalaby M, Al-Pakistani HA, Halawani SH, et al. Introduction of new formulas and evaluation of the previous red blood cell indices and formulas in the differentiation between beta thalassemia trait and iron deficiency anemia in the Makkah region. Hematology. 2016;21(6):351–8. pmid:26907523
  53. Hafeez Kandhro A, Shoombuatong W, Prachayasittikul V, Nuchnoi P. New bioinformatics-based discrimination formulas for differentiation of thalassemia traits from iron deficiency anemia. Lab Med. 2017;48(3):230–7. pmid:28934514
  54. Merdin AJAMM. Suggestion of new formulae to be used in distinguishing beta thalasemia trait from iron deficiency anemia. J Hematol. 2018;34:393–5.
  55. Janel A, Roszyk L, Rapatel C, Mareynat G, Berger MG, Serre-Sapin A-F. Proposal of a score combining red blood cell indices for early differentiation of beta-thalassemia minor from iron deficiency anemia. Hematology. 2011;16(2):123–7. pmid:21418745
  56. Šimundić A-M. Measures of diagnostic accuracy: basic definitions. EJIFCC. 2009;19(4):203–11. pmid:27683318
  57. Chakraborty S, Raut RD, Rofin TM, Chakraborty S. A comprehensive review on applications of multi-criteria decision-making methods in healthcare waste management. Waste Manag Res. 2025;43(9):1335–57. pmid:40037384
  58. Pandey V, Komal, Dincer H. A review on TOPSIS method and its extensions for different applications with recent development. Soft Comput. 2023;27(23):18011–39.
  59. Keshavarz-Ghorabaee M, Amiri M, Zavadskas EK, Turskis Z, Antucheviciene J. Simultaneous Evaluation of Criteria and Alternatives (SECA) for multi-criteria decision-making. Informatica. 2018;29(2):265–80.
  60. Verma S, Gupta R, Kudesia M, Mathur A, Krishan G, Singh SJ. Coexisting iron deficiency anemia and beta thalassemia trait: effect of iron therapy on red cell parameters and hemoglobin subtypes. S JISRN. 2014;2014(1):293216.
  61. Alibakhshi R, Moradi K, Aznab M, Azimi A, Shafieenia S, Biglari M. The spectrum of β-thalassemia mutations in Hamadan Province, West Iran. Hemoglobin. 2019;43(1):18–22. pmid:31096791
  62. Ibn Ayub M, Moosa MM, Sarwardi G, Khan W, Khan H, Yeasmin S. Mutation analysis of the HBB gene in selected Bangladeshi beta-thalassemic individuals: presence of rare mutations. Genet Test Mol Biomarkers. 2010;14(3):299–302. pmid:20406103
  63. Sahoo SS, Biswal S, Dixit M. Distinctive mutation spectrum of the HBB gene in an urban eastern Indian population. Hemoglobin. 2014;38(1):33–8. pmid:24099628
  64. Ansari SH, Shamsi TS, Ashraf M, Bohray M, Farzana T, Khan MT, et al. Molecular epidemiology of β-thalassemia in Pakistan: far reaching implications. Int J Mol Epidemiol Genet. 2011;2(4):403–8. pmid:22200002
  65. Madan N, Sharma S, Sood SK, Colah R, Bhatia LHM. Frequency of β-thalassemia trait and other hemoglobinopathies in northern and western India. Indian J Hum Genet. 2010;16(1):16–25. pmid:20838487
  66. Khorasani G, Kosaryan M, Vahidshahi K, Shakeri S, Nasehi MM. Results of the national program for prevention of beta-thalassemia major in the Iranian Province of Mazandaran. Hemoglobin. 2008;32(3):263–71. pmid:18473242
  67. Keskin A, Türk T, Polat A, Koyuncu H, Saracoglu B. Premarital screening of beta-thalassemia trait in the province of Denizli, Turkey. Acta Haematol. 2000;104(1):31–3. pmid:11111119
  68. AlHamdan NA, AlMazrou YY, AlSwaidi FM, Choudhry AJJG i M. Premarital screening for thalassemia and sickle cell disease in Saudi Arabia. Saudi Med J. 2007;9(6):372–7.
  69. Fu Y-K, Liu H-M, Lee L-H, Chen Y-J, Chien S-H, Lin J-S, et al. The TVGH-NYCU thal-classifier: development of a machine-learning classifier for differentiating thalassemia and non-thalassemia patients. Diagnostics (Basel). 2021;11(9):1725. pmid:34574066