Machine learning-based prediction of the axial load capacity of UHPC strengthened reinforced concrete columns: A comparative analysis

Viet Hai Hoang; Minh Quang Tran; Van Thuc Ngo

doi:10.1371/journal.pone.0338120

Abstract

This study develops and evaluates machine learning (ML) models to predict the axial load capacity (Pu) of reinforced concrete (RC) columns strengthened with ultra-high-performance concrete (UHPC) jackets. A comprehensive experimental database containing 105 test samples with 17 key input parameters was compiled from the literature, representing the most extensive dataset of UHPC-jacketed RC columns to date. Using this database, an ML framework was established to predict the ultimate axial load capacity, employing six models: Extremely Randomized Trees (ER), K-Nearest Neighbors (KNN), Light Gradient Boosting Machine (LightGBM), XGBoost, CatBoost, and Cascade Forward Neural Networks (CFNNs). The CatBoost model achieved the best performance with R² = 0.983, MAE = 177 kN, and RMSE = 211 kN, significantly outperforming traditional design codes such as ACI 318 and EC2. In addition to high predictive accuracy, SHAP analysis was conducted to interpret the influence of each parameter, providing new insights into the mechanical behavior and governing factors of UHPC-jacketed RC columns. These findings highlight the capability of advanced ML to capture complex nonlinear effects more effectively than traditional methods. The proposed framework also offers a reliable predictive tool to support safer and more efficient strengthening design.

Citation: Hoang VH, Tran MQ, Ngo VT (2026) Machine learning-based prediction of the axial load capacity of UHPC strengthened reinforced concrete columns: A comparative analysis. PLoS One 21(1): e0338120. https://doi.org/10.1371/journal.pone.0338120

Editor: Parthiban Kathirvel, SASTRA Deemed University, INDIA

Received: October 6, 2025; Accepted: December 9, 2025; Published: January 7, 2026

Copyright: © 2026 Hoang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting information files.

Funding: This research is funded by the University of Transport and Communications (UTC) [grant number T2025-CT-004TD to V.H.H.].

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

Reinforced concrete (RC) columns are key load-bearing members in buildings and bridges. After many years of service, existing RC columns often suffer reduced load-bearing capacity due to material strength loss, design defects, aging, or increased load demands. To extend their service life, rehabilitation techniques such as steel jacketing, fiber-reinforced polymer (FRP) confinement, and concrete encasement have been widely applied. Ultra-high-performance concrete (UHPC) has emerged as a transformative material in structural engineering, distinguished by its outstanding compressive strength, tensile ductility, and long-term durability [1,2]. Beyond new construction, UHPC is increasingly used to strengthen existing infrastructure, where it has proven superior to conventional materials in repair works: it increases toughness and improves load-bearing capacity significantly compared to traditional concrete. In recent years, it has therefore become a promising retrofitting material [3–6]. When used as an external jacket, UHPC can provide substantial confinement and strength enhancement to existing RC columns [7]. However, despite these advantages, design codes such as ACI 318 and Eurocode 4 do not provide explicit provisions for UHPC-jacketed RC columns, leading to uncertainty in practical design and assessment.

Many studies have examined the axial and combined load responses of UHPC-strengthened RC columns, with notable progress [8,9]. Experimental studies have demonstrated that UHPC jacketing can significantly improve both axial load capacity and deformation performance by enhancing the overall material strength and delaying premature failure [10]. However, these studies are scattered across different laboratories, use a variety of specimen geometries and material properties, and often lack unified predictive models. Conventional analytical methods, adapted from existing concrete or composite design equations, tend to oversimplify the limiting mechanism and provide limited accuracy when applied to UHPC jackets. In particular, design frameworks often extend existing limit models for high-strength and conventional concrete. Such provisions tend to rely on simplified assumptions about stress–strain relationships and uniform confinement, and therefore fail to fully capture the nonlinear and localized behavior observed in UHPC–RC composite systems. This gap underscores the need for more generalizable predictive models that can accommodate both material heterogeneity and complex interaction mechanisms.

Conventional analytical and numerical methods are thus challenged in two respects: (1) insufficient robustness across diverse column geometries and loading conditions, and (2) the inability to fully capture nonlinear interactions between UHPC, reinforcement, and existing concrete. These limitations motivate the need for more reliable, data-driven approaches to predict axial performance and ensure safe, economical design of UHPC-strengthened columns.

As a versatile tool of the Industry 4.0 era, machine learning (ML) has developed rapidly in recent years and is now widely applied in civil engineering [11,12]. ML provides effective tools for exploring complex relationships in large, heterogeneous datasets [13], and a sufficiently powerful ML model can predict outputs accurately with less dependence on simplifying assumptions [12,14,15]. For the load-bearing capacity of UHPC-strengthened RC columns, ML models are particularly well-suited to identifying hidden patterns among parameters such as jacket thickness, reinforcement ratio, fiber volume, and concrete strength, whose interactions are difficult to quantify with conventional models. However, to date, only a few studies have explored the use of ML to predict the ultimate load-bearing capacity of UHPC-strengthened RC columns.

Katlav [16] applied ten ML algorithms to predict the moment-carrying capacity of hybrid beams consisting of UHPC–NSC, demonstrating superior accuracy compared to traditional confinement equations. Similarly, to predict the flexural capacity of reinforced UHPC beams, Taffese [17] employed an explainable machine learning model based on CatBoost regression, outperforming traditional mechanical models and six benchmark ensemble methods. Feature analysis identified beam height, longitudinal reinforcement ratio, and beam width as the most influential parameters, with interactions showing that fiber content above 2% amplifies the effect of the reinforcement ratio. These studies demonstrate the potential of explainable ML for accurate and interpretable predictions, but challenges remain due to the “black box” nature of some models and the limited availability of comprehensive datasets [18,19].

The above studies highlight the potential of explainable machine learning models in structural engineering by providing deeper insights into the decision-making mechanisms of complex algorithms. Nevertheless, the limited size of available experimental databases restricts their applicability and poses a significant challenge for reliable predictions. Despite the remarkable mechanical advantages and increasing use of UHPC in strengthening applications, RC columns strengthened with UHPC jackets have received very limited attention in the ML-based prediction domain. The existing empirical and analytical models for UHPC-jacketed columns are mostly derived from simplified assumptions and relatively small experimental datasets, which restrict their ability to account for the nonlinear interactions between geometry, material strength, and confinement effects. Furthermore, most prior ML studies have focused on conventional or FRP-strengthened columns, leaving a gap in understanding the predictive behavior and key influencing factors of UHPC-jacketed RC columns. A careful review of the literature reveals several critical research gaps that motivate the present study:

(1) Lack of ML studies for UHPC-strengthened RC columns: While ML has been widely used for predicting the capacity of conventional or FRP/steel-jacketed RC members, no systematic study has yet addressed RC columns strengthened with UHPC jackets, despite their growing use in retrofitting and rehabilitation.
(2) Limited and fragmented experimental data: Existing analytical and empirical models for UHPC-jacketed columns are based on small, scattered experimental datasets, making them inadequate to capture the combined influence of geometry, material properties, and confinement effects.
(3) Lack of comparative assessment of advanced ML algorithms: Previous ML studies rarely provide a comprehensive comparison among different state-of-the-art algorithms for this type of structural system.
(4) Insufficient interpretability of ML predictions: Most existing ML-based studies focus only on prediction accuracy, with limited effort to interpret the influence of individual input parameters on the output response, which limits the physical understanding and practical use of the models.

The objective of this study is to develop accurate predictive models for estimating the axial load-bearing capacity of reinforced concrete (RC) columns strengthened with UHPC. The workflow of the study is summarized as follows:

(1) The first dedicated experimental database for UHPC-jacketed RC columns was established. The database includes 17 input parameters and one output feature.
(2) Six machine learning models were constructed: Extremely Randomized Trees (ER), K-Nearest Neighbors (KNN), LightGBM, XGBoost, CatBoost, and Cascade Forward Neural Networks (CFNNs). The hyperparameters of these models were optimized using a grid search combined with 10-fold cross-validation to ensure robustness.
(3) The predictive performance of the ML models was assessed using the coefficient of determination (R²), mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE).
(4) SHAP analysis was conducted to interpret the optimal model by examining feature importance and sensitivity, thereby enhancing transparency and providing engineering insights into the influence of input parameters on axial strength prediction.
(5) The ML predictions were critically compared with existing code-based equations, highlighting the limitations of those equations and suggesting directions for incorporating UHPC retrofits into future standards.

2. Methodological background

2.1 Experimental database

This study aims to develop predictive models for estimating the axial load-carrying capacity of RC columns strengthened with UHPC jackets. Accurate prediction of the capacity of such retrofitted members is of great importance in structural engineering, as it underpins safer and more cost-efficient design solutions. To this end, an extensive literature survey was conducted, and a database was compiled from fourteen published experimental studies on UHPC-jacketed RC columns tested under axial compression. Each record corresponds to one tested specimen and includes seventeen variables describing geometric, material, and strengthening parameters. To ensure data reliability and consistency, a systematic preprocessing procedure was adopted. All variables were converted into consistent SI units (mm, MPa, and kN), and parameter definitions were standardized across different sources. Data with missing essential information (such as UHPC strength, reinforcement ratio, or load capacity), ambiguous test conditions, or combined loading effects were excluded. Minor secondary parameters were estimated only when justified and clearly documented in the original references. Each data entry was cross-verified with the corresponding tables and figures from the source publication to ensure accuracy, and duplicates from overlapping datasets were removed. After applying these criteria and quality control procedures, a total of 105 complete and reliable specimens were retained, representing the most comprehensive and consistent experimental dataset available for developing and validating the proposed machine learning models.

The resulting database consolidates information from fourteen published sources [7,9,20–31] and provides, for the first time, a unified foundation for investigating the behavior of UHPC-retrofitted RC columns. The compiled data were subsequently processed and analyzed using machine learning techniques to ensure reliable capacity prediction. A structured feature engineering procedure was employed to refine raw variables and transform them into suitable input parameters for model training. For predictive modeling, six advanced algorithms widely adopted in structural engineering were utilized, namely ER, KNN, LightGBM, XGBoost, CatBoost, and CFNN. The subsequent sections present details on dataset preparation, feature selection, and the adopted machine learning models.

The collected specimens exhibit variability in several critical parameters, including column dimensions, longitudinal and transverse reinforcement ratios, jacket thickness, compressive strength of both UHPC and the original concrete, yield strength of reinforcing steel, as well as the fiber volume fraction and aspect ratio within the UHPC matrix. A schematic illustration of a typical UHPC-strengthened RC column and its defining parameters is shown in Fig 1, while a detailed summary of the experimental dataset is presented in Table 1.

Table 1. Statistical information for parameters in the databases.

https://doi.org/10.1371/journal.pone.0338120.t001

Fig 1. Schematic and related structure of NC rectangular and circular columns strengthened with UHPC.

https://doi.org/10.1371/journal.pone.0338120.g001

In this study, a comprehensive dataset was established based on experimental investigations of RC columns retrofitted with UHPC jackets. Seventeen representative input features were considered as predictors of the ultimate axial load capacity (P_u) of UHPC-strengthened RC columns, comprising one categorical and sixteen numerical variables. The categorical feature was the cross-sectional type (CS), while the numerical features included: column width (b), column length (a) for rectangular sections or diameter (D) for circular sections, column height (h), sectional area of normal concrete (S_NC), compressive strength of normal concrete (f′_c), longitudinal reinforcement ratio (p_t) and transverse reinforcement ratio (p_v) in normal concrete, sectional area of UHPC (S_UHPC), longitudinal reinforcement ratio (p_{t UHPC}) and transverse reinforcement ratio (p_{v UHPC}) in the UHPC jacket, yield strength of longitudinal (f_yt) and transverse (f_yv) reinforcement, UHPC compressive strength (f′_{c UHPC}), UHPC jacket thickness (t_UHPC), and fiber dosage (volume fraction of steel fibers, % fiber) in the UHPC matrix, as shown in Figs 3 and 4.

Fig 2. One-hot encoding applied to the categorical Cross section (CS) feature.

https://doi.org/10.1371/journal.pone.0338120.g002

The dataset includes one categorical feature representing different concrete cross-sectional shapes. To enable its use across various machine learning models, such as ER, LightGBM, XGBoost, and CFNN, one-hot encoding was applied to convert this non-numeric feature into a numerical format. This transformation not only ensures compatibility with a wide range of algorithms but also enhances the interpretability of feature importance. Each category is represented by a separate binary column, where 1 indicates the presence of the category and 0 its absence. Fig 2 illustrates the CS feature before and after applying one-hot encoding.
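The encoding step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the category labels "circular" and "rectangular", the helper name `one_hot_encode`, and the three sample records are hypothetical and are not taken from the paper's database. In practice a library routine such as `pandas.get_dummies` performs the same transformation.

```python
def one_hot_encode(records, column):
    """Replace one categorical column by one binary column per category."""
    # Sort the categories so the output column order is deterministic.
    categories = sorted({row[column] for row in records})
    encoded = []
    for row in records:
        # Copy every field except the categorical one.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            # 1 marks the presence of the category, 0 its absence.
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical specimens (illustrative values only, h in mm).
specimens = [
    {"CS": "circular", "h": 900.0},
    {"CS": "rectangular", "h": 1200.0},
    {"CS": "circular", "h": 600.0},
]

encoded = one_hot_encode(specimens, "CS")
for row in encoded:
    print(row)
```

Each encoded row keeps its numerical fields and gains one 0/1 column per cross-section type, which is the layout tree ensembles and neural networks expect.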

The target variable in this study is the ultimate load-carrying capacity (P_u), defined as the maximum resistance attained by the strengthened column during experimental testing. Table 1 and Fig 3 present the statistical profiles of the numerical input and output parameters, reflecting both their variability and representativeness. The compressive strength of normal concrete (f′_c) generally falls within 22.2–49 MPa, while the UHPC compressive strength (f′_{c UHPC}) varies from 81.6 MPa to 189.97 MPa. Reinforcing steel exhibits yield strengths between 240 and 1173 MPa, the UHPC jacket thickness ranges from 0 to 50 mm, and fiber volume fractions vary from 0% to 2.3%. This heterogeneity in material and geometric characteristics establishes a solid basis for developing machine learning–based models to predict the axial performance of UHPC-strengthened RC columns.

2.2 Methodology

In this study, several ML algorithms were employed to develop predictive models for estimating the axial load-carrying capacity of RC columns strengthened with UHPC jackets. The selected algorithms represent diverse learning mechanisms, including ensemble-based tree models, instance-based learning, gradient boosting, and neural networks. This diversity enables a comprehensive evaluation of the nonlinear interactions between geometric, material, and reinforcement parameters that affect the axial load-bearing capacity of UHPC-encased RC columns. Extremely Randomized Trees (ER), K-Nearest Neighbors (KNN), Light Gradient Boosting Machine (LightGBM), Extreme Gradient Boosting (XGBoost), CatBoost, and Cascade Forward Neural Network (CFNN) are the six machine learning models that were employed in this study. For nonlinear regression tasks, each algorithm has unique benefits:

(1) ER is an ensemble technique based on bagging that reduces variance and overfitting by creating multiple randomized decision trees.
(2) KNN is an instance-based algorithm that captures local relationships between similar samples.
(3) LightGBM is a gradient boosting model designed for structured data, offering high efficiency and scalability.
(4) XGBoost is a powerful boosting algorithm with strong regularization capabilities that can effectively manage intricate nonlinear dependencies.
(5) CatBoost is an enhanced gradient boosting framework that reduces overfitting and effectively handles categorical variables.
(6) CFNN is a feedforward neural network extension that enhances learning stability and convergence by introducing direct connections from the input to deeper layers.

These models were selected to ensure a broad representation of various learning mechanisms, including gradient boosting, randomized tree ensembles, neural networks, and instance-based learning. Preliminary analyses showed that these algorithms provided a good balance between prediction accuracy, generalization, and computational efficiency for the given dataset. Other models, such as linear regression and conventional artificial neural networks, were not included because their performance was found to be less stable or less interpretable for the relatively small dataset used in this study. The following subsections provide a concise overview of the theoretical background and implementation of each algorithm.

2.2.1 Extremely Randomized Trees (ER).

The Extremely Randomized Trees (ER) algorithm is an ensemble technique that constructs a large number of decision trees and aggregates their outputs for prediction. Unlike conventional tree ensembles such as random forests, where feature splits are chosen based on optimal criteria, ER introduces higher randomness by selecting both the features and the cut points at random during tree construction [32]. This strategy accelerates training and increases model diversity, which can enhance generalization performance. For regression problems, predictions are obtained by averaging the outputs of all trees, while classification relies on majority voting. Owing to its simplicity and resistance to overfitting, ER has been widely applied in various predictive modeling tasks.

2.2.2 K-Nearest Neighbors (KNN).

The K-Nearest Neighbors (KNN) algorithm is a fundamental machine learning method commonly applied in both classification and regression tasks [33,34]. KNN works on the principle of proximity: the similarity between the query instance and existing data points is measured, often using the Euclidean distance. The algorithm then identifies the k closest neighbors from the training set and infers the output by aggregating their values, through majority voting in classification or averaging in regression. Its simplicity and intuitive design make KNN a widely recognized baseline model in predictive analytics.
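The regression variant of this procedure can be illustrated with a minimal sketch in plain Python. The function name `knn_predict`, the two-feature training points, and the target values are hypothetical and chosen only to make the arithmetic easy to follow. Because Euclidean distance is scale-sensitive, features would normally be standardized first; that step is omitted here.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training points."""
    # Pair each training target with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    nearest = distances[:k]
    # Regression: average the targets of the k closest neighbours.
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical training data: two features per sample, one target each.
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
train_y = [100.0, 200.0, 300.0, 900.0]

prediction = knn_predict(train_X, train_y, (0.2, 0.1), k=3)
print(prediction)  # mean of the three closest targets: 200.0
```

The distant sample at (5, 5) does not influence the prediction, which is the "local relationship" behaviour the text attributes to KNN.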

2.2.3 Extreme Gradient Boosting (XGBoost).

Extreme Gradient Boosting (XGBoost) is an optimized implementation of the gradient boosting framework, specifically designed for speed and performance. It builds predictive models by sequentially adding weak learners, typically decision trees, with each new learner focusing on correcting the residuals of the previous ones. XGBoost integrates advanced regularization techniques, such as L1 and L2 penalties, which help control overfitting [35]. It has been extensively adopted for regression and classification tasks in both academia and industry.

XGBoost is suitable for predicting the axial resistance of UHPC-strengthened RC columns because the relationship between the input factors (concrete strength, core area, UHPC jacket area, reinforcement ratio, jacket thickness) and the ultimate load Pu is strongly nonlinear and involves complex interactions. Traditional methods often have difficulty capturing multiple relationships simultaneously. XGBoost is able to learn high-order interactions from data, control complexity through its regularization mechanism, and make good use of second-order gradient information to make more stable and accurate predictions. The model not only predicts Pu effectively but also maintains generality when applied to diverse column configurations.
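For readers who want the regularization and second-order terms mentioned above in symbols, the following is the standard form of the XGBoost objective at boosting round t as it is usually stated for the algorithm in general. It is not reproduced from this paper, and the symbols (l for the loss, f_t for the new tree, T for its number of leaves, w_j for its leaf weights, and γ, λ, α for the penalty coefficients) are notation introduced here for illustration.

```latex
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[ g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t(x_i)^2 \right] + \Omega(f_t),
\qquad
\Omega(f_t) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^2 + \alpha \sum_{j=1}^{T} |w_j|,

g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\qquad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \big(\hat{y}_i^{(t-1)}\big)^2}.
```

For the squared-error loss l = ½(y_i − ŷ_i)², these reduce to g_i = ŷ_i − y_i and h_i = 1, so each new tree is effectively fitted to the current residuals, while the λ (L2) and α (L1) terms shrink the leaf weights.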

2.2.4 CatBoost.

CatBoost [36] is a gradient boosting algorithm that was developed with a particular emphasis on handling categorical variables efficiently. CatBoost applies an innovative technique to encode categorical features and prevents prediction shift during training. It also incorporates ordered boosting and symmetric tree structures, leading to enhanced generalization capability and reduced overfitting. CatBoost has shown competitive performance across various regression and classification problems, especially when datasets contain a large proportion of categorical attributes.

In predicting the axial resistance of UHPC-strengthened RC columns, CatBoost is useful because the data may contain many complex, nonlinearly interacting features (such as core area, UHPC jacket area, material strength, and reinforcement ratio). CatBoost can capture these nonlinear relationships well, while limiting prediction errors and maintaining high generalization ability. This explains why CatBoost often outperforms other boosting models in many applied studies.

2.2.5 Light Gradient Boosting Machine (LightGBM).

LightGBM (Light Gradient Boosting Machine) [37] is an efficient and widely used supervised learning algorithm belonging to the ensemble family. LightGBM builds a series of decision trees sequentially, where each new tree is trained to minimize the residual errors of the existing ensemble. The contribution of each tree is scaled by a learning rate to prevent overfitting. This additive process, guided by gradient-based optimization of a specified loss function, progressively enhances predictive accuracy and enables the model to capture highly nonlinear relationships among variables. LightGBM is flexible, fast, and memory-efficient, making it suitable for regression, classification, and forecasting tasks across various domains.

For predicting the axial resistance of UHPC reinforced RC columns, LightGBM is particularly appropriate because the relationship between material properties, geometric parameters, and reinforcement factors with the ultimate load Pu is inherently nonlinear. Through its mechanism of sequentially learning from errors and refining predictions across many trees, LightGBM effectively models interactions among variables. Although other advanced gradient boosting variants such as XGBoost or CatBoost may offer additional optimizations, LightGBM remains a reliable, high-performance method, often employed as a benchmark in machine learning studies within structural engineering research.
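The sequential residual-fitting idea described in this subsection can be sketched with a deliberately simplified boosting loop in plain Python. This is a generic gradient boosting illustration with one-split trees (stumps) on a single feature and a squared-error loss. It is not LightGBM's actual algorithm, which adds histogram binning and leaf-wise tree growth, and the function names and toy data are hypothetical.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression tree on a 1-D feature by least squares."""
    best = None
    # Candidate thresholds: every distinct value except the largest,
    # so both sides of the split are always non-empty.
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Additive model: each new stump is trained on the current residuals."""
    base = sum(y) / len(y)            # initial prediction: the mean target
    prediction = [base] * len(y)
    stumps, history = [], []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, prediction)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Scale each tree's contribution by the learning rate.
        prediction = [pi + learning_rate * stump(xi)
                      for xi, pi in zip(x, prediction)]
        history.append(sum((yi - pi) ** 2
                           for yi, pi in zip(y, prediction)) / len(y))

    def predict(xi):
        return base + learning_rate * sum(s(xi) for s in stumps)

    return predict, history


# Toy nonlinear relationship (illustrative only).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [xi ** 2 for xi in x]
predict, history = boost(x, y)
print(history[0], history[-1])  # training MSE after the first and last round
```

The training error recorded in `history` never increases from one round to the next, which is the progressive refinement the text describes. A smaller learning rate slows that decrease but typically generalizes better.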

2.2.6 Cascade Forward Neural Networks (CFNN).

Cascade Forward Neural Networks (CFNN) are a variant of multilayer perceptron architectures that extend the conventional feedforward network by introducing additional forward connections [38]. Unlike traditional backpropagation neural networks (BPNNs), where each hidden layer only receives input from the previous layer, CFNNs allow each layer to receive inputs not only from the preceding layer but also directly from the original input layer. This cascaded connection pattern enables the network to propagate raw input information throughout the entire architecture, enhancing its ability to capture both low-level and high-level feature interactions simultaneously.

The structural design of CFNNs often leads to faster learning and improved approximation capabilities, especially when dealing with highly nonlinear and complex datasets. By facilitating richer information flow across layers, CFNNs can achieve more accurate regression and prediction results with fewer hidden units compared to standard MLPs. Beyond theoretical advantages, CFNNs have been successfully applied in engineering and scientific problems, where they demonstrated robust predictive performance in scenarios requiring precise modeling of nonlinear relationships. Recent findings by Nguyen [39] revealed that the CFNN model outperformed both LightGBM and SVR in predicting the compressive strength of geopolymer concrete, particularly when data augmentation techniques were employed.
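The cascaded connection pattern can be made concrete with a small NumPy sketch of a forward pass. The layer sizes, the tanh activation, the random initialization, and the function names are assumptions made for illustration; the paper's tuned CFNN configuration is not reproduced here.

```python
import numpy as np


def init_cfnn(n_inputs, hidden_sizes, n_outputs, rng):
    """Create weights where every layer after the first also sees the raw input."""
    weights, biases = [], []
    previous_size = 0
    for size in list(hidden_sizes) + [n_outputs]:
        # Fan-in = previous layer's outputs + direct connection from the input.
        fan_in = previous_size + n_inputs
        weights.append(rng.normal(scale=0.1, size=(fan_in, size)))
        biases.append(np.zeros(size))
        previous_size = size
    return weights, biases


def cfnn_forward(x, weights, biases):
    """Forward pass of a cascade-forward network for a batch x of shape (n, d)."""
    activation = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        # First layer: raw input only. Later layers: previous output + raw input.
        if i == 0:
            layer_input = activation
        else:
            layer_input = np.concatenate([activation, x], axis=1)
        z = layer_input @ W + b
        is_output_layer = i == len(weights) - 1
        # Linear output for regression, tanh in the hidden layers.
        activation = z if is_output_layer else np.tanh(z)
    return activation


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 17))                 # 5 hypothetical samples, 17 features
weights, biases = init_cfnn(17, [8, 4], 1, rng)
output = cfnn_forward(X, weights, biases)
print(output.shape)                          # (5, 1)
```

Dropping the concatenation step turns this into an ordinary multilayer perceptron, so the only structural difference is the extra input columns appended to each later layer's weight matrix.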

2.2.7 Shapley additive explanation (SHAP).

Machine learning (ML) algorithms are often referred to as “black-box” models, since their internal decision-making processes are not directly visible. For this reason, enhancing model interpretability is crucial to ensure transparency, trustworthiness, and opportunities for improvement. Among the various tools used for this purpose, feature importance techniques are widely applied to evaluate the contribution of individual input variables to the prediction outcome, thereby assisting in the interpretation of ML models.

In recent years, the Shapley Additive Explanations (SHAP) framework has become one of the most widely adopted methods for explaining model behavior. SHAP creates a link between predictive accuracy and interpretability by assigning a quantitative importance score to each feature based on cooperative game theory. This enables the transformation of an opaque ML model into a more transparent, interpretable system. In the present study, SHAP is utilized to examine the most accurate predictive model and to assess the effect of each input variable on the predicted outcomes. This technique is selected because it offers both global perspectives, revealing overall feature importance across the dataset, and local insights, illustrating how features contribute to individual predictions, making it highly effective and applicable for the top-performing machine learning algorithms.
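The game-theoretic attribution behind SHAP can be illustrated by computing exact Shapley values for a tiny model by brute force. This sketch is an assumption-laden simplification: "absent" features are replaced by a fixed baseline value, all feature orderings are enumerated (feasible only for a handful of features), and the toy model is hypothetical. Practical SHAP implementations rely on much faster model-specific or sampling-based estimators.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    n = len(x)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)          # start with every feature "absent"
        previous_output = model(current)
        for feature in order:
            current[feature] = x[feature]  # switch this feature "on"
            output = model(current)
            contributions[feature] += output - previous_output
            previous_output = output
    return [c / len(orderings) for c in contributions]


def toy_model(features):
    """Hypothetical model: one additive term plus one interaction term."""
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)  # [2.0, 3.0, 3.0]
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline (8.0 here), and the interaction term b·c is shared equally between its two features. This additivity is what allows SHAP to provide both local explanations and, when averaged over a dataset, global feature rankings.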

3. Model implementation and evaluation

3.1 Performance criteria

To measure the accuracy and efficiency of the machine learning (ML) models developed in this work, four performance indicators are employed: the coefficient of determination (R²), root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE). These are among the most commonly used metrics in regression-based research within structural engineering. A higher R² value, approaching unity, indicates stronger predictive performance. Conversely, lower values of RMSE, MAPE, and MAE reflect better model precision. The mathematical expressions of these performance measures are presented as follows:

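$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{1}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{2}$$

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{3}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{4}$$

where $y_i$ and $\hat{y}_i$ are the measured and predicted values for sample $i$, $\bar{y}$ is the mean of the measured values, and $n$ is the number of samples.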

The coefficient of determination (R²) measures how well the predicted values explain the variance in the observed data, reflecting overall goodness-of-fit. Mean absolute error (MAE) quantifies the average magnitude of prediction errors without considering their direction, providing a straightforward measure of accuracy. Mean absolute percentage error (MAPE) expresses errors as a percentage of actual values, allowing for comparison across different scales. Root mean square error (RMSE) emphasizes larger errors by squaring the deviations before averaging, offering sensitivity to outliers. By collectively analyzing these metrics, researchers can comprehensively assess predictive precision, consistency, and robustness, enabling the selection of the most reliable algorithm for a given problem.
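As a worked illustration of how the four indicators behave, the sketch below computes them from their standard definitions in plain Python. The function name `regression_metrics` and the four measured/predicted values are hypothetical and do not come from the paper's database.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, MAPE (in percent), and RMSE for paired observations."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e ** 2 for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "RMSE": math.sqrt(ss_res / n),
    }


# Hypothetical measured and predicted capacities in kN.
measured = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3000.0, 4200.0]

metrics = regression_metrics(measured, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For these values R² = 0.988, MAE = 100 kN, MAPE = 5%, and RMSE ≈ 122.5 kN. The RMSE exceeds the MAE because the single 200 kN error is weighted more heavily after squaring, which is the outlier sensitivity noted above.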

3.2 Model training and test procedure

Fig 5 presents the workflow for constructing machine learning models aimed at predicting the load-bearing capacity of reinforced concrete (RC) columns strengthened with UHPC. The process involves several critical stages: assembling the experimental dataset, performing data preprocessing and refinement, training the models, optimizing hyperparameters, validating performance, and interpreting the results. Each step plays a vital role in ensuring the predictive system is both accurate and dependable.

Fig 3. Distribution of 16 numerical input and 01 numerical output features.

https://doi.org/10.1371/journal.pone.0338120.g003

Fig 4. Correlation matrix for the numerical input parameters of the dataset.

https://doi.org/10.1371/journal.pone.0338120.g004

Fig 5. Model training and test procedure.

https://doi.org/10.1371/journal.pone.0338120.g005

The dataset, compiled from prior experimental investigations on the axial capacity of UHPC-strengthened RC columns, forms the basis for model training. The data is split into training and testing subsets, with 80% used for model development and 20% held out for testing and evaluation. During training, algorithms such as ER, KNN, LightGBM, XGBoost, CatBoost, and CFNN are implemented in Python and optimized using Grid Search in conjunction with 10-fold cross-validation, allowing systematic tuning of hyperparameters while maintaining robust generalization.
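The split-and-tune procedure can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' script: the data are synthetic (105 random samples with five made-up features), the hyperparameter grid is a small hypothetical one rather than the grid behind Table 2, and Extremely Randomized Trees is used simply as one representative of the six models.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import GridSearchCV, KFold, train_test_split

# Synthetic stand-in for the database: 105 samples, 5 hypothetical features.
rng = np.random.default_rng(42)
X = rng.uniform(size=(105, 5))
y = (1000.0 * X[:, 0] + 500.0 * X[:, 1] * X[:, 2]
     + rng.normal(scale=10.0, size=105))

# 80% of the samples for model development, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical grid; every combination is scored by 10-fold cross-validation
# on the training subset only, so the test subset stays unseen.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    estimator=ExtraTreesRegressor(random_state=0),
    param_grid=param_grid,
    cv=KFold(n_splits=10, shuffle=True, random_state=0),
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

# The best configuration is refitted on the full training subset by default.
test_r2 = search.best_estimator_.score(X_test, y_test)
print(search.best_params_, round(test_r2, 3))
```

Keeping the grid search inside the training subset means the reported test metrics are not influenced by hyperparameter selection, which matters for a dataset of only about a hundred specimens.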

Performance is measured with multiple statistical metrics, including R², RMSE, MAE, and MAPE, to comprehensively assess predictive accuracy and reliability. The algorithm demonstrating the best performance is then analyzed using SHAP values to determine the contribution of each input variable to the predicted axial load. Finally, predicted outcomes are compared directly with experimental results, providing both interpretability and validation of the machine learning framework.

3.3 Evaluation of model

The optimal hyperparameters obtained for each model are summarized in Table 2, which presents the tuned configurations that yielded the most favorable performance during the Grid Search process. These parameter settings reflect the specific characteristics of each algorithm and highlight the importance of hyperparameter optimization in improving predictive accuracy. Following the optimization stage, the predictive performance of all models is evaluated and benchmarked using four statistical indicators: the coefficient of determination (R²), mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The comparative results, reported in Table 3, provide a comprehensive overview of how each algorithm performs in terms of accuracy, consistency, and generalization capability. This dual presentation of hyperparameter tuning (Table 2) and performance outcomes (Table 3) ensures transparency and allows a fair comparison across all models.

Table 2. Optimal hyperparameters.

https://doi.org/10.1371/journal.pone.0338120.t002

Table 3. Average predictive performance of the model obtained through K-fold cross-validation.

https://doi.org/10.1371/journal.pone.0338120.t003

The comparative evaluation of the six machine learning models shows how well each predicts the axial capacity of UHPC-jacketed RC columns. The same four indicators (R², MAE, MAPE, and RMSE) were calculated separately for the training dataset, the unseen test dataset, and the overall dataset to give a more balanced and transparent performance assessment (Fig 6).

Fig 6. Performance comparison of six models in terms of R², MAE, MAPE, and RMSE.

https://doi.org/10.1371/journal.pone.0338120.g006

On the training set, all six models fitted the data closely, with coefficients of determination (R²) between 0.974 and 0.994, confirming that the selected predictors captured most of the variance in axial load. ER, KNN, CatBoost, and CFNN all achieved R² = 0.994 with small errors (ER: MAE = 50.31 kN, RMSE = 113.05 kN; KNN: MAE = 43.30 kN, RMSE = 112.10 kN; CatBoost: MAE = 66.32 kN, RMSE = 118.44 kN; CFNN: MAE = 56.49 kN, RMSE = 120.55 kN). LightGBM and XGBoost also performed strongly (R² = 0.974 and 0.988, respectively), though with comparatively larger errors.

On the independent test set, differences between the models became more evident. The CatBoost model generalized best, with R² = 0.983, MAE = 177.18 kN, and RMSE = 210.66 kN, outperforming all other approaches. ER (R² = 0.977) and KNN (R² = 0.974) also yielded competitive accuracy but with higher errors (RMSE ≈ 248–262 kN). LightGBM and XGBoost showed substantially larger test errors (R² = 0.929–0.945; RMSE = 384–435 kN), indicating a tendency to overfit despite good training performance. CFNN (R² = 0.972, RMSE = 275.66 kN) achieved acceptable results but remained less accurate than CatBoost.

When evaluated across the entire dataset, CatBoost again gave the most balanced outcome (R² = 0.99, RMSE = 141.76 kN), followed by ER (R² = 0.99, RMSE = 150.22 kN) and KNN (R² = 0.99, RMSE = 154.37 kN). CFNN produced a somewhat higher error (RMSE = 163.78 kN), while LightGBM and XGBoost were the least accurate (R² = 0.97, RMSE = 241–276 kN).

In summary, CatBoost stands out as the most robust and accurate predictor of the axial capacity of UHPC-jacketed RC columns, combining strong generalization with low error. ER and KNN also delivered reliable results, while LightGBM and XGBoost showed potential but require further refinement to reduce their test errors. CFNN, although effective in capturing nonlinear patterns, was less competitive than CatBoost, ER, and KNN.

3.4 Prediction performance of the CatBoost model

Fig 7 compares the ultimate load-carrying capacity (P_u) predicted by the CatBoost model with the experimentally measured values for the training and test data. The x-axis corresponds to the actual P_u, while the y-axis shows the predicted values. Perfect predictions would lie exactly along the diagonal line y = x. To visualize deviations from perfect prediction, shaded bands representing ±10% and ±20% of the actual values are included. Most predicted points cluster close to the diagonal, with the majority falling within the ±10% range, indicating strong agreement between predicted and observed results.
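The share of points inside each band can also be counted directly instead of being read off the plot. The sketch below shows one way to do this; the function name `share_within_band` and the capacities are hypothetical.

```python
def share_within_band(actual, predicted, tolerance):
    """Fraction of samples whose prediction lies within +/- tolerance of the actual value."""
    hits = sum(
        1 for a, p in zip(actual, predicted)
        if abs(p - a) <= tolerance * abs(a)
    )
    return hits / len(actual)

# Hypothetical capacities in kN; relative errors are 5%, 6.7%, 15%, 2% and 23.3%.
actual_kn = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]
predicted_kn = [1050.0, 1400.0, 2300.0, 2450.0, 3700.0]

within_10 = share_within_band(actual_kn, predicted_kn, 0.10)  # 3 of 5 samples
within_20 = share_within_band(actual_kn, predicted_kn, 0.20)  # 4 of 5 samples
```

Reporting these two fractions alongside R² gives a scale-independent view of how often a prediction would be acceptable in practice.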

Fig 7. Relationship between actual and predicted values with the CatBoost model.

https://doi.org/10.1371/journal.pone.0338120.g007

In Fig 8, the horizontal axis represents the sample index, while the vertical axis shows the actual and predicted P_u values for the test data. The predicted and actual values overlap closely, indicating that the model accurately captures the underlying trend of the data.

Fig 8. Comparison of actual and predicted P_u for the 20% test data using the CatBoost model.

https://doi.org/10.1371/journal.pone.0338120.g008

The key performance metrics on the test set (R² = 0.983, MAE = 177.18 kN, MAPE = 19.48%, and RMSE = 210.66 kN) confirm the high accuracy of the predictions. Most predicted points lie very close to the actual points, and deviations are small across the tested samples. Only a few points show larger deviations, which occur at peaks of the P_u values, but these are rare and do not significantly affect overall model performance.

Although the CatBoost model achieved a very high coefficient of determination (R² = 0.983), additional validation analyses were performed to verify that this performance did not result from overfitting. A 10-fold cross-validation confirmed the model’s stability, with minimal variation in performance metrics across folds. Permutation feature importance and SHAP-based sensitivity analyses were carried out to examine the robustness and physical plausibility of the model predictions. The CatBoost model maintained stable accuracy under random feature perturbations, and the parameter influence trends agreed well with mechanical expectations, confirming that the model captures genuine structural relationships rather than memorizing the data.
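Permutation feature importance, mentioned above, measures how much the error grows when the values of one input are shuffled across samples. The sketch below illustrates the idea on a stand-in model; `toy_model`, the three synthetic features, and the repeat count are assumptions for illustration and do not reproduce the study's CatBoost setup.

```python
import random

def rmse(y_true, y_pred):
    """Root mean square error."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def permutation_importance(model, rows, targets, n_features, n_repeats=20, seed=0):
    """Mean increase in RMSE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = rmse(targets, [model(r) for r in rows])
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(rmse(targets, [model(r) for r in shuffled]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def toy_model(row):
    """Stand-in for a fitted model: strong on feature 0, weak on 1, ignores 2."""
    return 5.0 * row[0] + 0.5 * row[1]

data_rng = random.Random(1)
rows = [(data_rng.random(), data_rng.random(), data_rng.random()) for _ in range(200)]
targets = [toy_model(r) for r in rows]

scores = permutation_importance(toy_model, rows, targets, n_features=3)
```

A feature the model does not use yields an importance of zero, which is the behaviour one would look for when checking that a model relies on physically meaningful inputs.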

Fig 8 illustrates that the CatBoost model provides reliable predictions with low error, making it the most effective of the tested algorithms for predicting the axial load capacity of UHPC-strengthened RC columns. The high R² value combined with low MAE, MAPE, and RMSE supports the conclusion of strong predictive capability and robust generalization to unseen samples.

3.5 Comparison with existing calculation methods

At present, no official design code exists for evaluating the load-bearing capacity of reinforced concrete columns strengthened with ultra-high-performance concrete (UHPC). In this section, several international standards are reviewed, and their calculated results are compared with those obtained in Section 3.4. According to the general perspective of these codes, the ultimate axial load capacity, P_u, of a UHPC-strengthened composite column can be determined by summing the individual contributions of each constituent within the cross-section. This relationship can be expressed in the following form:

$$P_u = P_{NC} + P_{UHPC} + P_s$$

where P_NC, P_UHPC and P_s denote the ultimate load capacities contributed by the core concrete, the UHPC jacket, and the longitudinal steel reinforcement, respectively. While the fundamental assumptions of the calculation approaches are generally consistent, significant variations exist in how each standard treats the contributions of the different components within a composite column. The following section reviews the procedures specified in ACI 318 [40] and Eurocode 2 [41], and evaluates their applicability to UHPC-encased reinforced concrete columns.

3.5.1 ACI 318 approach.

Within the framework of the plastic stress distribution method, the axial resistance of a UHPC-encased composite concrete column with a square cross-section is evaluated under certain simplifying assumptions regarding material behavior. Specifically, it is postulated that the longitudinal reinforcing bars have already reached their yield strength in compression at the ultimate limit state. In contrast, both the UHPC encasement and the normal-strength concrete (NSC) core are considered not to fully mobilize their compressive capacities. Instead, consistent with the reduction factors adopted in modern design standards, only 85% of their characteristic compressive strengths are taken into account. This reduction factor reflects the influence of material variability, stress non-uniformity across the section, and long-term effects such as creep and microcracking, which may prevent the section from achieving its theoretical maximum strength in practice.

On this basis, the ACI 318 standard [40] provides a design-oriented expression to quantify the axial load-bearing capacity of such columns. The formulation integrates the contributions of each component—the steel reinforcement, the UHPC jacket, and the inner concrete core—into a unified model. By summing the reduced strengths of the composite materials along with the full contribution of the yielded reinforcement, the resulting equation provides a rational and conservative estimate of the ultimate axial load capacity of UHPC-strengthened reinforced concrete columns:

$$P_u = 0.85\,f'_{c}\,S_{NC} + 0.85\,f'_{c\,UHPC}\,S_{UHPC} + f_{yt}\,A_s$$

where f’_c and f’_{c UHPC} are the compressive strengths of the core concrete and the UHPC, f_yt is the yield strength of the longitudinal bars, and S_NC, S_UHPC and A_s are the cross-sectional areas of the core concrete, the UHPC encasement, and the longitudinal rebars, respectively.
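As a worked illustration of this superposition, the sketch below evaluates the capacity as 85% of each concrete strength times its area plus the full yield force of the bars, which is how the description above is read here. The function name, the column dimensions, and the material strengths are hypothetical, and the bar area is not deducted from the core area for simplicity.

```python
def aci_axial_capacity_kn(fc_core, fc_uhpc, fy_long, area_core, area_uhpc, area_steel):
    """Axial capacity in kN; strengths in MPa (N/mm^2), areas in mm^2."""
    force_n = (
        0.85 * fc_core * area_core      # core concrete at 85% of its strength
        + 0.85 * fc_uhpc * area_uhpc    # UHPC jacket at 85% of its strength
        + fy_long * area_steel          # longitudinal bars at yield
    )
    return force_n / 1000.0

# Hypothetical square column: 150 mm core with a 25 mm UHPC jacket on every side.
core_side_mm = 150.0
outer_side_mm = core_side_mm + 2 * 25.0
area_core = core_side_mm ** 2                  # 22 500 mm^2
area_uhpc = outer_side_mm ** 2 - area_core     # 17 500 mm^2
area_steel = 452.0                             # mm^2, assumed total bar area

capacity_kn = aci_axial_capacity_kn(
    fc_core=30.0, fc_uhpc=120.0, fy_long=400.0,
    area_core=area_core, area_uhpc=area_uhpc, area_steel=area_steel,
)
# 573.75 kN (core) + 1785.00 kN (jacket) + 180.80 kN (steel) = 2539.55 kN
```

In this example the jacket carries most of the load despite its smaller area, because its assumed strength is four times that of the core.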

3.5.2 EC2 approach.

The Eurocode 2 (EC2) [41] provides a systematic framework for assessing the ultimate axial capacity of reinforced concrete columns, including those strengthened with UHPC jackets. Unlike ACI, EC2 explicitly incorporates the role of partial safety factors applied to both steel reinforcement and concrete, thereby reducing the nominal material strengths to design strengths. This approach accounts for uncertainties associated with material properties, construction quality, and structural analysis, ensuring a consistent level of safety and reliability across European practice.

For UHPC-encased columns with square cross-sections, EC2 assumes that the longitudinal reinforcing bars reach their design yield strength under compression, while the UHPC jacket and normal-strength concrete core are considered at their respective design compressive strengths, each modified by the appropriate partial safety factor (γ). This methodology generally produces design capacities that may differ from those obtained using ACI 318, depending on the values specified in the National Annex. By integrating the beneficial effects of UHPC confinement within its reliability-based design framework, Eurocode 2 provides a balanced and rational basis for evaluating the axial load-bearing capacity of strengthened reinforced concrete columns. Accordingly, the design axial resistance in this study is determined following the EC2 formulation (Fig 9).
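A plausible form of this resistance, written here as an assumption based on the description above (characteristic strengths divided by partial safety factors, then summed over the three components), is sketched below. The symbols N_Rd, f_ck, f_ck,UHPC, f_yk, α_cc, γ_c and γ_s are introduced for illustration and are not defined in the source; applying the EC2 concrete factors to UHPC is likewise an assumption.

```latex
N_{Rd} = f_{cd}\,S_{NC} + f_{cd,UHPC}\,S_{UHPC} + f_{yd}\,A_s,
\qquad
f_{cd} = \frac{\alpha_{cc}\,f_{ck}}{\gamma_c},\quad
f_{cd,UHPC} = \frac{\alpha_{cc}\,f_{ck,UHPC}}{\gamma_c},\quad
f_{yd} = \frac{f_{yk}}{\gamma_s}
```

With the EC2 recommended values γ_c = 1.5 and γ_s = 1.15, each term is smaller than its characteristic counterpart, so an estimate of this form would be expected to fall below the measured capacity.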

Fig 9. Performance of EC2 and ACI 318 to predict P_u.

https://doi.org/10.1371/journal.pone.0338120.g009

The predicted load-bearing capacity of reinforced concrete columns strengthened with ultra-high-performance concrete (UHPC) varies considerably depending on the design standard. As shown in Table 4, the EC2 method yields an R² of 0.635, MAE of 648.972 kN, MAPE of 37.55%, and RMSE of 924.358 kN, indicating limited predictive accuracy. ACI 318 improves the predictions (R² = 0.849) but still exhibits considerable errors.

Table 4. Performance of model with existing calculation methods.

https://doi.org/10.1371/journal.pone.0338120.t004

Compared with the results in Table 3, these values show that advanced machine learning algorithms capture the complex, nonlinear behavior of UHPC-strengthened reinforced concrete columns more effectively than conventional code-based approaches. The superior accuracy of CatBoost is attributed to its ordered boosting, its ability to handle categorical variables, and its robust regularization, making it a reliable tool for predicting column load-bearing capacity.

The ML-predicted axial capacities were compared with those obtained from EC2 and ACI 318 design provisions. As expected, the design codes produced more conservative estimates due to the inclusion of global safety factors and simplified confinement models. In contrast, the ML models directly predicted the experimentally measured ultimate strengths, which correspond to mean structural capacities.

As a result, even though the ML predictions are higher than the code-based values, this discrepancy reflects the absence of embedded safety margins rather than unsafe performance. For real-world applications, suitable safety or reduction factors (e.g., φ = 0.75–0.85) can be applied to the ML-predicted strengths to bring them to design level. Such calibration could enable future integration of data-driven approaches into code-based design frameworks. The results thus highlight the potential of ML not as a replacement for current codes, but as a complementary predictive tool for refining design provisions and improving reliability assessment of UHPC-strengthened RC columns.
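The effect of such a reduction factor is simple to illustrate. In the sketch below, the function name and the predicted capacity of 2500 kN are hypothetical; only the φ range comes from the text.

```python
def design_capacity_kn(predicted_mean_kn, phi):
    """Scale a predicted mean capacity to a design-level value using factor phi."""
    if not 0.0 < phi <= 1.0:
        raise ValueError("phi must lie in the interval (0, 1]")
    return phi * predicted_mean_kn

predicted_kn = 2500.0  # hypothetical ML prediction of the mean capacity
design_low = design_capacity_kn(predicted_kn, 0.75)   # lower end of the quoted range
design_high = design_capacity_kn(predicted_kn, 0.85)  # upper end of the quoted range
```

For this example the design-level value would lie between roughly 1875 kN and 2125 kN; choosing a specific φ would require a reliability calibration that is outside the scope of this sketch.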

4. Model explainability

4.1 SHAP-based analysis

The feature importance analysis (Fig 10) highlights the parameters most strongly influencing the ultimate axial load capacity (P_u) of UHPC-strengthened RC columns. The sectional area of the UHPC jacket (S_UHPC) and the sectional area of normal concrete (S_NC) are identified as the most dominant features, contributing 17.24% and 15.63% of the total importance, respectively. This finding is consistent with column confinement theory, as these two parameters directly determine the load-bearing capacity of the composite cross-section. A larger UHPC jacket area and concrete core area provide enhanced confinement and greater axial stiffness, leading to a substantial increase in the ultimate axial load capacity.

Fig 10. Relative importance of input features for UHPC-confined RC columns.

https://doi.org/10.1371/journal.pone.0338120.g010

The compressive strength of normal concrete (f’_c) ranks third (11.25%), followed by UHPC jacket thickness (t_UHPC) with 8.58%. Both features play vital roles in improving the overall confinement and stiffness of the strengthened column. The diameter (D) and column height (h) also exhibit notable contributions (6.96% and 5.76%), reflecting their influence on the slenderness ratio and global stability under axial compression.

The UHPC compressive strength (f’_{c UHPC}) and transverse reinforcement ratio in the UHPC layer (p_v in UHPC) contribute moderately (5.52% and 5.42%), underscoring the importance of material strength and lateral confinement in enhancing ductility and preventing premature failure. Yield strength of transverse reinforcement (f_yv) and fiber content (% fiber) also show measurable effects, indicating that higher reinforcement strength and appropriate fiber dosage improve the ultimate axial load capacity.

Parameters such as the longitudinal reinforcement ratio (p_t) and transverse reinforcement ratio (p_v) in normal concrete, as well as the column length (a), column width (b), and longitudinal reinforcement ratio in UHPC (p_f in UHPC), show lower relative importance (below 4%), suggesting their effects are secondary or act synergistically with other key parameters.

Overall, the analysis demonstrates that the axial capacity of UHPC-confined RC columns is primarily governed by the sectional areas and compressive strengths of both the core and the UHPC jacket, followed by geometric parameters and reinforcement detailing. These insights confirm the strong alignment between data-driven findings and structural confinement mechanisms, reinforcing the physical interpretability.
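Percentage shares of this kind are commonly obtained by averaging the absolute SHAP value of each feature over all samples and normalizing so that the shares sum to 100%. Whether the study used exactly this normalization is not stated, so the sketch below is an assumption; the SHAP values in it are invented for illustration.

```python
def relative_importance(shap_rows, feature_names):
    """Percentage share of each feature based on its mean absolute SHAP value."""
    n_samples = len(shap_rows)
    mean_abs = [
        sum(abs(row[j]) for row in shap_rows) / n_samples
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    return {name: 100.0 * value / total for name, value in zip(feature_names, mean_abs)}

# Hypothetical SHAP values (kN) for four samples and three features.
names = ["S_UHPC", "S_NC", "fiber"]
shap_rows = [
    [300.0, -200.0, 50.0],
    [-500.0, 100.0, -50.0],
    [400.0, -300.0, 100.0],
    [-400.0, 200.0, 0.0],
]
shares = relative_importance(shap_rows, names)
# Mean |SHAP| values are 400, 200 and 50, giving shares of about 61.5%, 30.8% and 7.7%.
```

Because absolute values are averaged, a share reflects how strongly a feature moves the prediction, not whether it raises or lowers it.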

4.2 Feature dependency analysis

The SHAP feature dependency analysis provides further insights into how variations in individual parameters affect the predicted axial capacity of UHPC-jacketed RC columns. Fig 11 illustrates the SHAP values for each feature, with the color scale indicating relative feature magnitudes (red = higher values, blue = lower values). Consistent with the feature importance ranking, the cross-sectional areas of UHPC and normal concrete (S_UHPC, S_NC) exert the most significant influence. Larger section sizes (red points) are associated with strongly positive SHAP values, confirming their direct contribution to axial strength by increasing the effective load-bearing area.

Fig 11. SHAP dependency plot for UHPC-confined RC columns.

https://doi.org/10.1371/journal.pone.0338120.g011

Material properties also show clear trends. Higher compressive strength of normal concrete (f’_c) and greater UHPC thickness (t_UHPC) correspond to positive SHAP contributions, emphasizing the dual role of core quality and jacket confinement in strength enhancement. Similarly, increases in UHPC compressive strength (f’_{c UHPC}) and reinforcement ratios (p_v% in UHPC and NC) generally improve predictions, though some scatter suggests interaction effects with other geometric parameters.

Reinforcement-related factors, such as yield strength of longitudinal bars (f_yt) and transverse reinforcement (f_yv), exhibit moderate but consistent positive impacts, indicating that higher steel strength enhances confinement and load resistance. In contrast, fiber volume fraction (% fiber) shows a more dispersed pattern, with both positive and negative SHAP values, implying that its effect depends on dosage—moderate fiber contents contribute positively by improving crack control, while excessive amounts may reduce workability and efficiency.

Geometric factors such as column diameter (D), height (h), and column length (a) also play secondary but relevant roles, with SHAP values reflecting their influence on load transfer mechanisms. Meanwhile, parameters like p_{t (%)} in UHPC or column width (b) demonstrate relatively limited influence, aligning with their lower overall importance.

The SHAP analysis also showed that the UHPC jacket compressive strength and jacket thickness have a clear positive influence on the predicted axial load capacity. This is consistent with classical confinement theory, where a thicker and stronger jacket provides higher lateral pressure, enhancing the confined core concrete strength. The positive contributions of the core concrete strength and longitudinal reinforcement ratio also align with analytical confinement models, confirming that the ML framework correctly learns the combined effects of material strength and reinforcement on load resistance. Conversely, geometric parameters such as slenderness ratio exhibit negative SHAP values, reflecting the reduction in stability and axial strength observed in structural mechanics. These physically interpretable trends confirm that the proposed ML models, particularly CatBoost, not only provide accurate predictions but also capture the governing mechanical behaviors of UHPC-confined RC columns.

Overall, the SHAP dependency plots confirm that structural capacity is most strongly governed by cross-sectional dimensions and material strength, while reinforcement and fiber characteristics provide additional but more variable contributions. These findings are consistent with engineering mechanics, where geometry and material quality dominate axial resistance, and detailing factors modulate the overall response.

4.3 ICE and PDP

The ICE (Individual Conditional Expectation) and PDP (Partial Dependence Plot) analyses (Fig 12) further validate the interpretability of the CatBoost model by illustrating the nonlinear and, in several cases, quasi-monotonic relationships between the key predictors and the predicted axial load capacity (P_u). Specifically, increases in the cross-sectional areas (S_NC and S_UHPC) consistently lead to higher predicted capacities, confirming their dominant structural contribution. Similarly, enhancements in UHPC compressive strength (f’_{c UHPC}) produce incremental gains, though the overall sensitivity remains moderate. In contrast, parameters such as fiber volume fraction (% fiber) and UHPC jacket thickness (t_UHPC) exhibit relatively minor influence, indicating that their effects are secondary and may depend on interaction with other design variables. Overall, these findings suggest that the reinforcement configuration and column geometry serve as the governing determinants of load capacity, while UHPC primarily functions as a supplementary strengthening layer that improves stiffness and stress distribution efficiency rather than directly governing axial resistance.
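The relationship between the two plot types is easy to state in code: an ICE curve tracks the prediction for one sample as a single feature is swept over a grid, and the PDP is the pointwise average of all ICE curves. The sketch below uses a hypothetical stand-in model with two inputs (an area and a strength) rather than the fitted CatBoost model.

```python
def ice_curves(model, rows, feature_index, grid):
    """One curve per sample: predictions as the chosen feature sweeps the grid."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature_index] = value  # all other features keep their values
            curve.append(model(modified))
        curves.append(curve)
    return curves

def partial_dependence(curves):
    """PDP: pointwise average of the ICE curves."""
    n_curves = len(curves)
    return [sum(c[g] for c in curves) / n_curves for g in range(len(curves[0]))]

def toy_model(row):
    """Hypothetical stand-in model: capacity in kN from area (mm^2) and strength (MPa)."""
    area, strength = row
    return 0.85 * area * strength / 1000.0

rows = [(20000.0, 30.0), (30000.0, 40.0), (40000.0, 50.0)]
area_grid = [10000.0, 20000.0, 30000.0]

curves = ice_curves(toy_model, rows, feature_index=0, grid=area_grid)
pdp = partial_dependence(curves)
```

When the ICE curves are roughly parallel, the PDP summarizes them well; diverging curves, as in this toy model where the slope depends on strength, signal an interaction between features.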

Fig 12. ICE (Individual Conditional Expectation) and PDP (Partial Dependence Plot) analyses.

https://doi.org/10.1371/journal.pone.0338120.g012

5. Limitations and future recommendations

One important limitation of this study concerns the scope and diversity of the dataset employed. The current database does not fully capture the wide variation in column geometries, reinforcement layouts, or concrete strength grades. As outlined in Section 2, because only a small number of column tests are currently available, the investigation was restricted to UHPC–RC composite columns, which narrows the applicability of the model to this structural type. In addition, the distribution of input parameters, shown in Fig 2, is uneven, leaving some ranges underrepresented. This imbalance can reduce predictive reliability when the model is applied to column cases that deviate from the observed dataset.

Overcoming these issues will require extending the dataset to cover a broader set of structural scenarios for UHPC–RC columns. Although incorporating results from previous experimental work, as done in this research, helps to enrich the data, it is both labor-intensive and vulnerable to inconsistencies across testing protocols and reporting practices. A promising solution would be to establish a shared, open-access repository dedicated to UHPC–RC composite column research, enabling standardized data exchange and collaboration across the community. Such a resource would support faster data accumulation and improve the generalizability of predictive models. Additionally, data augmentation methods—such as synthetic data generation, advanced interpolation, or generative modeling—offer a means of filling gaps in underrepresented ranges. Future studies could also integrate metaheuristic optimization techniques—such as Particle Swarm Optimization (PSO) [42], Grey Wolf Optimizer [43], …—to fine-tune model hyperparameters, enhance predictive accuracy, and better capture complex nonlinear behaviors in UHPC–RC columns.

The machine learning models developed in this study were trained exclusively on experimental data of RC columns strengthened with UHPC jackets. As such, their predictive validity is limited to similar structural configurations. Although the input parameters span a wide range of geometric and material properties, applying the model to other strengthening techniques would require retraining with relevant datasets. However, the proposed ML framework is flexible and can be easily extended to such cases once adequate data becomes available. The same feature engineering and interpretability procedure can be utilized to develop specialized predictive tools for various strengthening systems and geometries.

6. Conclusions

This study comprehensively evaluated the predictive performance of several advanced machine learning (ML) models for estimating the ultimate axial load capacity (Pu) of reinforced concrete (RC) columns strengthened with ultra-high-performance concrete (UHPC) jackets. Six ML models — Extremely Randomized Trees (ER), k-Nearest Neighbors (KNN), Light Gradient Boosting Machine (LightGBM), eXtreme Gradient Boosting (XGBoost), CatBoost, and Cascade Forward Neural Network (CFNN) — were developed and validated using an experimental database comprising 105 test results. The findings highlight the capability of data-driven methods to accurately capture the complex nonlinear interactions among geometric, material, and reinforcement parameters governing the axial behavior of UHPC–strengthened columns.

Among the evaluated models, CatBoost demonstrated the best overall performance, achieving an R² of 0.983 on the test set and an overall R² of 0.99 with a remarkably low RMSE of 141.76 kN, MAE of 88.49 kN, and MAPE of 8.06%. This reflects its strong generalization and robustness in modeling nonlinear relationships. The ER and KNN models followed closely with comparable R² values of 0.99, though their prediction errors were slightly higher (RMSE from 150 to 154 kN). On the other hand, LightGBM and XGBoost achieved satisfactory performance on the training data (R² = 0.97–0.99) but exhibited more noticeable drops in testing accuracy (R² = 0.929–0.945, RMSE = 384–435 kN), indicating mild overfitting. The CFNN model, while achieving a respectable R² of 0.972 on the test set, produced higher RMSE values (275.66 kN), suggesting a relatively weaker ability to generalize compared with ensemble-based methods.

The SHAP-based feature interpretation of the CatBoost model provided valuable insights into the governing parameters affecting axial capacity. The results revealed that the cross-sectional areas of normal concrete (NC) and UHPC, along with the compressive strength of NC, were the most influential predictors. Secondary factors such as reinforcement ratio, stirrup strength, and fiber content contributed primarily to ductility and confinement effects rather than peak strength. These findings align with established structural mechanics principles, reinforcing the interpretability and reliability of the ML-based framework.

When compared with traditional design codes, the superiority of the ML approach becomes even more pronounced. The EC2 and ACI 318 equations achieved R² values of 0.635 and 0.849, respectively, with significantly higher RMSE values (924.36 kN and 594.88 kN). In contrast, the CatBoost model reduced the RMSE to just 141.76 kN, representing a reduction in prediction error of approximately 75–85% relative to conventional design methods. This substantial improvement highlights the limitations of current code-based formulations, which rely on simplified assumptions, and underscores the potential of ML models to provide more accurate and generalizable predictions for UHPC–strengthened RC columns.
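The quoted reduction can be checked from the RMSE values given above. The short calculation below uses only those three numbers; the function name is illustrative.

```python
def relative_reduction_percent(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline

rmse_catboost = 141.76  # kN, CatBoost on the overall dataset
rmse_ec2 = 924.36       # kN
rmse_aci = 594.88       # kN

reduction_vs_ec2 = relative_reduction_percent(rmse_ec2, rmse_catboost)  # about 84.7%
reduction_vs_aci = relative_reduction_percent(rmse_aci, rmse_catboost)  # about 76.2%
```

The two values bracket the range stated in the text, with the larger reduction relative to EC2 and the smaller relative to ACI 318.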

Beyond predictive performance, the proposed ML framework offers a data-driven foundation for future code calibration. The model can determine the most important factors influencing load-carrying capacity by examining parameter interactions and feature importance. This information can then be used to improve empirical coefficients or partial safety factors in upcoming updates of ACI and EC2 provisions. Additionally, by emphasizing parameter ranges or combinations that have the biggest impact on structural behavior, the interpretability results can direct focused experimental programs, increasing experimental efficiency.

This study demonstrates that ensemble learning models, particularly CatBoost, offer a reliable and interpretable framework for predicting the axial capacity of UHPC-confined RC columns under limited experimental data conditions. The integration of SHAP analysis further enhances the transparency of the predictions, enabling deeper physical understanding of parameter influence. By combining data-driven intelligence with fundamental mechanics, this approach provides both a scientific foundation and practical guidance for designing and optimizing UHPC strengthening systems. Future work should focus on expanding the experimental database, incorporating additional loading scenarios, and exploring hybrid ML–mechanics-based models to further enhance the robustness and applicability of the proposed framework.

Supporting information

S1 Data. SI data 1.

https://doi.org/10.1371/journal.pone.0338120.s001

(CSV)

References

1. Graybeal B. Material property characterization of ultra-high performance concrete. FHWA-HRT-06-103; 2006. p. 1–176.
2. Brühwiler E, Denarié E. Rehabilitation and strengthening of concrete structures using ultra-high performance fibre reinforced concrete. Struct Eng Int. 2013;23(4):450–7.
3. Hoang VH. Experimental study on flexual behavior of reinforced concrete slabs strengthened ultra-high-performance-concrete. Transp Commun Sci J. 2023;74(9):1100–9.
4. Hoang VH. Experimental and modeling of tensile behavior of ultra high performance concrete. Transp Commun Sci J. 2023;74(6):709–17.
5. Hoang VH, Do TA, Tran AT, Nguyen XH. Flexural capacity of reinforced concrete slabs retrofitted with ultra-high-performance concrete and fiber-reinforced polymer. Innov Infrastruct Solut. 2024;9(4):113.
6. Le BA, Hoang VH. Flexual capacity prediction of hybrid structure consisting of UHPC-NSC using symbolic regression models. Transp Commun Sci J. 2024;75(5):1870–81.
7. Helles ZH. Strengthening of square reinforced concrete columns with fibrous ultra high performance self-compacting concrete jacketing. The Islamic University Gaza; 2014.
8. Lu C, Ouyang K, Guo C, Wang Q, Chen H, Zhu W. Axial compressive performance of RC columns strengthened with prestressed CFRP fabric combined with UHPC jacket. Eng Struct. 2023;275:115113.
9. Chen J, Wang Z, Xu A, Zhou J. Compressive behavior of corroded RC columns strengthened with ultra-high performance jacket. Front Mater. 2022;9.
10. Braveru CS, Zhou W. Experimental study of the axial compressive behavior of square cross-section UHPC-encased composite concrete columns. J Build Eng. 2025;106:112606.
11. de-Prado-Gil J, Palencia C, Jagadesh P, Martínez-García R. A comparison of machine learning tools that model the splitting tensile strength of self-compacting recycled aggregate concrete. Materials (Basel). 2022;15(12):4164. pmid:35744223
12. de-Prado-Gil J, Martínez-García R, Jagadesh P, Juan-Valdés A, Gónzalez-Alonso M-I, Palencia C. To determine the compressive strength of self-compacting recycled aggregate concrete using artificial neural network (ANN). Ain Shams Eng J. 2024;15(2):102548.
13. de-Prado-Gil J, Palencia C, Jagadesh P, Martínez-García R. A study on the prediction of compressive strength of self-compacting recycled aggregate concrete utilizing novel computational approaches. Materials (Basel). 2022;15(15):5232. pmid:35955167
14. Jagadesh P, Khan AH, Shanmuga Priya B, Asheeka A, Zoubir Z, Magbool HM, et al. Correction: artificial neural network, machine learning modelling of compressive strength of recycled coarse aggregate based self-compacting concrete. PLoS One. 2025;20(4):e0322947. pmid:40261853
15. He Y, Gao S, Li Y, Guan Y, Zhang J, Hu D. Adaptive machine learning framework: Predicting UHPC performance from data to modelling. Res Eng. 2025;27:106724.
16. Katlav M, Ergen F. Data-driven moment-carrying capacity prediction of hybrid beams consisting of UHPC-NSC using machine learning-based models. Structures. 2024;59:105733.
17. Taffese WZ, Zhu Y. Explainable machine learning for predicting flexural capacity of reinforced UHPC beams. Eng Struct. 2025;343:121188.
18. Wakjira TG, Abushanab A, Alam MS. Hybrid machine learning model and predictive equations for compressive stress-strain constitutive modelling of confined ultra-high-performance concrete (UHPC) with normal-strength steel and high-strength steel spirals. Eng Struct. 2024;304:117633.
19. Safieh H, Hawileh RA, Assad M, Hajjar R, Shaw SK, Abdalla J. Using multiple machine learning models to predict the strength of UHPC mixes with various FA percentages. Infrastructures. 2024;9(6):92.
20. Enami RM. Reforço de pilares curtos de concreto armado por encamisamento com concreto de ultra-alto desempenho. University of São Paulo; 2017.
21. Farzad M, Rastkar S, Sadeghnejad A, Azizinamini A. Simplified method to estimate the moment capacity of circular columns repaired with UHPC. Infrastructures. 2019;4(3):45.
22. Ali Dadvar S, Mostofinejad D, Bahmani H. Strengthening of RC columns by ultra-high performance fiber reinforced concrete (UHPFRC) jacketing. Constr Build Mater. 2020;235:117485.
23. Susilorini RrMIR, Kusumawardaningsih Y. Advanced study of columns confined by ultra-high-performance concrete and ultra-high-performance fiber-reinforced concrete confinements. Fibers. 2023;11(5):44.
24. Li F, Hexiao Y, Gao H, Deng K, Jiang Y. Axial behavior of reinforced UHPC-NSC composite column under compression. Materials (Basel). 2020;13(13):2905. pmid:32605248
25. Li Y, Du X, Jia J, Ma R, Jia C, Deng H. Experimental studies on axial compressive behavior of reinforced concrete-filled UHPC tube composite column. Structures. 2025;76:108953.
26. Alamoodi MA, Zahid M, Bakar BHA, Tayeh BA, Zeyad AM. Behavior of damaged reinforced concrete columns retrofitted with ultra-high performance fiber reinforced concrete jackets under uniaxial loading. J Build Eng. 2025;108:112837.
27. Tian H-W, Ma X-J, Li B, Zhou Z. Experimental and numerical investigation on square concrete-filled UHPC tubular columns under axial compression. Structures. 2024;70:107655.
28. Yang X, Zhang B, Zhou A, Wei H, Liu T. Axial compressive behaviour of corroded steel reinforced concrete columns retrofitted with a basalt fibre reinforced polymer-ultrahigh performance concrete jacket. Compos Struct. 2023;304:116447.
29. Le HA, Ho VH, Nguyen P-C, Le TP, Lam MN-T. Axial compressive behavior of circular RC stub columns jacketed by UHPC and UHPFRC. Case Stud Constr Mater. 2024;21:e03761.
30. Farouk AIB, Rong W, Zhu J. Compressive behavior of ultra-high-performance-normal strength concrete (UHPC-NSC) column with the longitudinal grooved contact surface. J Build Eng. 2023;68:106074.
31. Shehab H, Eisa A, Wahba AM, Sabol P, Katunský D. Strengthening of reinforced concrete columns using ultra-high performance fiber-reinforced concrete jacket. Buildings. 2023;13(8):2036.
32. Geurts P, Ernst D, Wehenkel L. Extremely randomized trees. Mach Learn. 2006;63(1):3–42.
33. Huang S, Huang M, Lyu Y. A novel approach for sand liquefaction prediction via local mean-based pseudo nearest neighbor algorithm and its engineering application. Adv Eng Inform. 2019;41:100918.
34. Hoang VH, Nguyen NL, Bui TT, Tran NH. A two-stage method for damage detection in Z24 bridge based on k-nearest neighbor and artificial neural network. Period Polytech Civ Eng. 2024;68(3):892–902.
35. Brownlee J. XGBoost with Python: gradient boosted trees with XGBoost and scikit-learn. Machine Learning Mastery; 2016.
36. Dorogush AV, Ershov V, Gulin A. CatBoost: gradient boosting with categorical features support; 2018.
37. Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, et al. LightGBM: a highly efficient gradient boosting decision tree. 31st Conference on Neural Information Processing Systems (NIPS 2017); 2017.
38. Pwasong A, Sathasivam S. A new hybrid quadratic regression and cascade forward backpropagation neural network. Neurocomputing. 2016;182:197–209.
39. Nguyen HAT, Pham DH, Ahn Y. Effect of data augmentation using deep learning on predictive models for geopolymer compressive strength. Appl Sci. 2024;14(9):3601.
40. American Concrete Institute. ACI 318-08: building code requirements for structural concrete and commentary. Farmington Hills (MI): American Concrete Institute; 2008.
41. EN 1992-1-1: Eurocode 2: design of concrete structures - Part 1: general rules and rules for buildings. Brussels, Belgium; 1992.
42. Nguyen-Ngoc L, Do TA, Hoang VH, Hoang TT, Tran TD. Equivalent convective heat transfer coefficient for boundary conditions in temperature prediction of early-age concrete elements using FD and PSO. KSCE J Civ Eng. 2023;27(6):2546–58.
43. Shokrnia H, KhodabandehLou A, Hamidi P, Ashrafzadeh F. Prediction of compressive strength of fiber-reinforced concrete containing silica (SiO2) based on metaheuristic optimization algorithms and machine learning techniques. Sci Rep. 2025;15(1):19671. pmid:40467780
