Construction project risk prediction model based on EW-FAHP and one dimensional convolution neural network

To address the low accuracy of traditional construction project risk prediction, a project risk prediction model based on EW-FAHP and a one-dimensional convolutional neural network (1D-CNN) is proposed. First, the risk evaluation indexes of the construction project are selected by literature analysis, and the comprehensive weight of each risk index is obtained by combining the entropy weight method (EW) and the fuzzy analytic hierarchy process (FAHP). The risk weights are then input into the 1D-CNN model for training, and the predicted values of construction period risk and cost risk are output. The experimental results show that the mean absolute error of the predicted construction period risk and cost risk is below 0.1%, so the model can predict construction project risk with high accuracy.


Introduction
With the continuous development of science and technology, construction projects are becoming more complex, construction periods are growing longer, and many uncertain factors are involved [1]. To reduce the probability of risk occurrence and avoid potential risks to the entire project, it is necessary to predict the risks of construction projects.
Reference [2] bases its risk management method for subway construction on a Bayesian network. Reference [3] combines Fault Tree Analysis (FTA) with a Bayesian network and proposes a Bayesian network-based risk analysis method for shale gas well blowouts. However, Bayesian networks rely on prior probabilities, which in many cases depend on assumptions, and this can lead to poor prediction results. Other work combines rough set theory with the analytic hierarchy process (AHP): rough sets are used to reduce attributes in project risk group decisions, and AHP is used to quantify and analyze project risk. Literature [4] uses AHP to evaluate solid waste treatment methods in Libya, but with AHP it is difficult to check and adjust the consistency of the judgment matrix. Literature [5] proposes using fuzzy AHP and rough AHP to evaluate traffic accessibility. Literature [6] uses IT2FS-DEMATEL to eliminate less important indicators, applies the IT2FS-AHP method to rank the remaining indicators, and establishes a multi-index decision-making model. However, when the approaches of literature [5] and literature [6] are applied to risk prediction, removing redundant attributes with rough sets or IT2FS-DEMATEL may have a considerable impact on the final prediction results.
With the development of artificial intelligence and big data, neural networks have attracted more and more attention from researchers. Because neural networks have a strong nonlinear fitting ability and map nonlinear relations well, scholars have combined them with engineering project risk research in recent years and achieved certain results [6,7]. Literature [8] proposed a railway construction risk assessment algorithm based on the BP neural network. The expert scoring method was used to establish the initial sample data, and the BP neural network prediction model was used to learn from and predict the samples to obtain the risk score of each construction project. However, the BP neural network has some problems: it is not the best approximator of continuous functions and its training time is long. Literature [9] proposed a project risk evaluation algorithm based on a PCA (principal component analysis)-RBF neural network, which addresses the difficulty of obtaining the optimal network with the BP model. However, the centers of the hidden basis functions of the RBF neural network are selected from the input sample set, which in many cases can hardly reflect the real input-output relationship of the system. To solve these problems, literature [10] proposed a method for predicting rockburst risk in underground engineering based on an artificial neural network (ANN) and the artificial bee colony (ABC) algorithm. To further improve the prediction accuracy, that work uses the ABC algorithm to optimize the neural network, but the ABC algorithm has weak search ability and relatively slow convergence.
As construction projects grow in scale and risk factors multiply, traditional risk prediction mostly relies on regular event analysis, correlation analysis and similar methods to analyze key indicators and detailed records, which depend heavily on manual extraction by professionals. Current risk assessment of construction projects typically adopts a single method, such as expert scoring, the entropy weight method, the analytic hierarchy process or the fuzzy analytic hierarchy process, rather than combining multiple evaluation methods. As a result, the factors affecting project risk are captured incompletely, and the evaluation results lack objectivity and accuracy. At present, the Analytic Hierarchy Process (AHP) and its derivative methods are still the most widely used and most effective risk assessment methods for complex large systems. Among them, the Fuzzy Analytic Hierarchy Process (FAHP), which integrates fuzzy theory, improves weight determination in AHP and, being practical and simple, is applied more and more widely [11]. To bring the weights of the evaluation indexes closer to reality, this paper adopts the entropy weight-fuzzy analytic hierarchy process (EW-FAHP) to determine the weights. Risk prediction based on traditional neural networks requires too many weights, which reduces prediction accuracy to a certain extent [12]. A CNN, in contrast, applies different convolution kernels to the input data to obtain global features of the data and then down-samples the extracted features through pooling operations, which reduces the amount of computation and, to a certain extent, also suppresses overfitting.
Therefore, this paper proposes a construction project risk prediction model based on EW-FAHP and 1D-CNN. The existing risks of the construction project are identified through literature analysis, and a risk evaluation index system covering construction period and cost is constructed. Using the entropy weight method (EW) and the fuzzy analytic hierarchy process (FAHP), the risk weight of each evaluation index is determined by combining subjective and objective evaluation. A one-dimensional convolutional neural network model is then constructed and trained on the risk weights of construction projects, with duration risk and cost risk as its output units. The mean absolute error between the predicted and actual values of duration risk and cost risk is analyzed to assess the risk prediction.
The prediction results of the risk prediction model proposed in this paper show that the method has strong practicability in the early stage of the project and in the construction process. Compared with other commonly used forecasting algorithms, the forecasting accuracy has been significantly improved, which is of greater reference value for project decision-makers.

Project risk identification
Starting from the decision-making stage, various risks affecting the construction project duration and cost arise as the project progresses. For construction project management and construction parties, it is necessary to accurately identify, under limited resources and across the entire implementation process of the project, the risk factors that have a greater impact on the construction period and cost [13][14][15].
Project risk identification methods mainly include the brainstorming method, the literature research method and rough set theory. Each method has an environment in which it works best, and a suitable identification method can be selected according to the analysis angle, route and focus [16]. Compared with other identification methods, the literature research method is not limited by time and space and can identify risks even with few resources. It has been widely applied in intelligent algorithms, big data analysis, fault diagnosis, etc. [17,18]. This paper analyzes construction projects that are invested in by state-owned assets, controlled by state-owned assets or directly managed by government departments. Therefore, literature analysis is adopted to identify risk factors and summarize project risk evaluation indexes. Combined with the attributes of construction project risk, the preliminary project risk evaluation index system is obtained.

Project risk assessment
In project management, project risk evaluation refers to the process of analyzing, estimating and quantifying the impact of risks on the project. Establishing a scientific and effective risk assessment method is the prerequisite for risk research. The flowchart of the risk assessment process is shown in Fig 1. There are many methods for the quantitative analysis of risk, but some risks in large construction projects cannot be evaluated by statistical or mathematical modeling methods. To improve the consistency of the analytic hierarchy process (AHP), this paper selects the expert scoring method and the fuzzy comprehensive evaluation method to evaluate the risk factors.

Construction of project risk evaluation indicators
In construction project management, choosing an appropriate risk evaluation index system is the prerequisite for controlling project risks. The selection of risk assessment indicators should meet the requirements of representativeness, diversity, conciseness and comprehensiveness [19,20]. The hierarchy of the system determines whether the evaluation index system is scientific and reasonable. Therefore, when constructing the project risk evaluation index system, the evaluation indexes are divided according to the defined grade categories, and finally a multi-level index system is constructed to help risk managers understand the specific conditions of the risks in the project more comprehensively [21,22]. As shown in Fig 2, the primary indicators are $R_i$ ($i = 1, 2, \cdots, 6$), and the secondary indicators corresponding to the primary indicators are $U_i$ ($i = 1, 2, \cdots, 25$).

Risk assessment of construction project
Risk is evaluated by the degree of deviation between the final result of the project and the original target. The degree of deviation is positively correlated with the risk: the greater the deviation, the greater the risk. Research on interval fuzzy numbers shows that the existing processing method models and predicts the two boundary points directly. This often fails to describe the overall development trend of the sequence and makes the predicted results prone to confusion, which causes the predictions to fail. The Z-number is a more human-like way of representing uncertain information, but research on Z-numbers, especially theoretical research, is still in its infancy. A prominent feature of the mainstream theoretical work is that the amount of computation is relatively large and the methods are not easy to understand, which hinders practical engineering application and is particularly inconvenient for emergency management, where response time is critical.
Although the fuzzy analytic hierarchy process overcomes the computational defects of the analytic hierarchy process, its evaluation results are still calculated from the experts' scores, so they are inevitably mixed with some of the experts' personal views. The entropy weight method, by contrast, is based mainly on actual data and does not take special cases into account, so its evaluation results are relatively objective. Therefore, this paper obtains the subjective weight and the objective weight of each factor through the fuzzy analytic hierarchy process and the entropy weight method respectively, and then combines the two to obtain the comprehensive weight.
In practical applications, subjective and objective methods are combined in different ways, mainly the mean value method, the product method, the gray correlation method, and so on. However, these combination methods only combine the subjective and objective weights of the lowest-level indicators in a relatively simple way, ignoring the effective integration of the intermediate steps of the two methods. As a result, the calculated weights differ from the true weights in the evaluation process and deviate from the actual situation. Therefore, a new combination method is adopted, which considers not only the combination of the lowest-level index weights but also the organic integration of the intermediate processes.

Definition of construction project risk
In the early stage of decision making, in order to avoid losses, the risks of construction period and cost can be used to evaluate the risk of the whole project before deciding whether to bid. Construction period risk and cost risk are expressed by Eqs (1) and (2), where $R_T$ and $R_C$ represent the construction period risk and cost risk respectively, $T_\Delta$ and $T_0$ represent the actual construction period and the target construction period respectively, and $C_\Delta$ and $C_0$ represent the actual cost and the target cost respectively.
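As a minimal sketch of how such deviation-based risk values could be computed, the snippet below assumes that each risk is the relative deviation of the actual value from its target, (actual - target) / target. This form is an assumption suggested by the symbol definitions above, not a confirmed reproduction of Eqs (1) and (2); the function name and the sample figures are hypothetical.

```python
def relative_deviation(actual: float, target: float) -> float:
    """Relative deviation of an actual value from its target.

    Assumed form: (actual - target) / target. The paper's Eqs (1) and (2)
    may differ, for example by taking an absolute value.
    """
    if target == 0:
        raise ValueError("target must be non-zero")
    return (actual - target) / target


# Hypothetical project: 400-day target finished in 430 days,
# target cost 10.0 (arbitrary units) with an actual cost of 10.8.
r_t = relative_deviation(430.0, 400.0)  # construction period risk R_T
r_c = relative_deviation(10.8, 10.0)    # cost risk R_C
print(f"R_T = {r_t:.4f}, R_C = {r_c:.4f}")
```

A positive value indicates an overrun relative to the target, which matches the statement that a larger deviation means a larger risk.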

Fuzzy hierarchy comprehensive weighting method
Fuzzy consistent judgment matrix. To quantify the decision judgments and form a numerical judgment matrix, relative importance is scored with the 0.1~0.9 scale method [23]. With the number of indexes in the layer set as n, the initial matrix of Eq (3) is constructed. From Eq (3), the fuzzy consistent judgment matrix of Eq (4) is constructed, and from Eq (4), the normalized weight of Eq (5) is constructed.
Iterative weight of power method. According to the definition of the power method, the reciprocal matrix of Eq (3) is obtained, with entries $e_{ij} = r_{ij}/r_{ji}$. The weight vector $W_R$ is taken as the initial vector $V^{(0)}$ and iterated. If $\|V^{(k+1)}\|_\infty - \|V^{(k)}\|_\infty \le \varepsilon$ is satisfied during the iteration, the iteration is stopped, where $\varepsilon$ is the error tolerance, taken here as $\varepsilon = 0.0001$. Based on the above derivation, $V^{(k+1)}$ is normalized to obtain the FAHP weight vector.
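The FAHP steps can be sketched as follows. Because the displayed equations are not available here, the sketch assumes the forms commonly used in the FAHP literature: row sums $r_i$ of the initial matrix, the transformation $r_{ij} = (r_i - r_j)/(2n) + 0.5$, the initial weights $w_i = (\sum_j r_{ij} + n/2 - 1)/(n(n-1))$, and a power iteration on $e_{ij} = r_{ij}/r_{ji}$ with an infinity-norm stopping test. The function name `fahp_weights` and the sample matrix are hypothetical.

```python
import numpy as np


def fahp_weights(F, eps=1e-4, max_iter=1000):
    """FAHP weights from a 0.1~0.9 scale judgment matrix F (assumed formulas)."""
    F = np.asarray(F, dtype=float)
    n = F.shape[0]
    row_sums = F.sum(axis=1)
    # Fuzzy consistent matrix: r_ij = (r_i - r_j) / (2n) + 0.5
    R = (row_sums[:, None] - row_sums[None, :]) / (2 * n) + 0.5
    # Initial weights by row-sum normalisation; they sum to 1
    w0 = (R.sum(axis=1) + n / 2 - 1) / (n * (n - 1))
    # Reciprocal matrix: e_ij = r_ij / r_ji
    E = R / R.T
    v = w0.copy()
    prev_norm = np.max(np.abs(v))
    for _ in range(max_iter):
        v_next = E @ v
        norm = np.max(np.abs(v_next))      # infinity norm
        v = v_next / norm                  # rescale to keep values bounded
        if abs(norm - prev_norm) <= eps:   # stopping test with tolerance eps
            break
        prev_norm = norm
    return v / v.sum()                     # normalise to a weight vector


# Hypothetical 3x3 judgment matrix (f_ij + f_ji = 1, diagonal 0.5)
F = [[0.5, 0.7, 0.8],
     [0.3, 0.5, 0.6],
     [0.2, 0.4, 0.5]]
print(fahp_weights(F))
```

Rescaling by the infinity norm at each step keeps the iterate bounded, and the change in that norm between steps serves as the convergence measure.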

Entropy weighting method
The data in Eq (3) are normalized [24] and the entropy of each indicator is defined, where $H_j$ is the entropy of the $j$-th secondary indicator, $d_{ij} = r_{ij} / \sum_{i=1}^{n} r_{ij}$ and $k = 1/\ln n$. The entropy weight of the $j$-th secondary indicator is then computed, and according to Eq (10) the weights of the entropy weight method are obtained. Integrating the weights $W$ obtained by FAHP with the weights obtained by the entropy weight method gives the EW-FAHP comprehensive weights.
EW-FAHP combines the Fuzzy Analytic Hierarchy Process (FAHP) and the Entropy Weight method (EW), and calculates risk weights with a combination of subjective and objective methods. Compared with a single method, the evaluation result is closer to the real situation.
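A minimal sketch of the entropy weighting step follows, using the definitions of $d_{ij}$ and $k$ given above together with the standard entropy $H_j = -k \sum_i d_{ij} \ln d_{ij}$ and weight $(1 - H_j)/\sum_j (1 - H_j)$, which are assumed here because the displayed equations are not available. The combination step uses the simple product method mentioned earlier in the text as a stand-in; it is not the paper's own combination formula. All function names and sample values are hypothetical.

```python
import numpy as np


def entropy_weights(R):
    """Entropy weights of the columns (indicators) of a positive matrix R."""
    R = np.asarray(R, dtype=float)
    n = R.shape[0]
    D = R / R.sum(axis=0, keepdims=True)      # d_ij = r_ij / sum_i r_ij
    k = 1.0 / np.log(n)                       # k = 1 / ln n
    log_d = np.log(np.where(D > 0, D, 1.0))   # treat 0 * ln 0 as 0
    H = -k * (D * log_d).sum(axis=0)          # entropy of each indicator
    diversity = 1.0 - H                       # more dispersion, more weight
    return diversity / diversity.sum()


def combine_weights(w_fahp, w_entropy):
    """Product-method combination (a stand-in, not the paper's formula)."""
    prod = np.asarray(w_fahp, dtype=float) * np.asarray(w_entropy, dtype=float)
    return prod / prod.sum()


# Hypothetical data: the matrix and the subjective weights are illustrative
R = [[0.5, 0.7, 0.8],
     [0.3, 0.5, 0.6],
     [0.2, 0.4, 0.5]]
w_e = entropy_weights(R)
w_c = combine_weights([0.4, 0.35, 0.25], w_e)
print(w_e, w_c)
```

An indicator whose values are nearly uniform has entropy close to 1 and therefore receives a weight close to 0, which is why the entropy weight is regarded as the objective component.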

EW-FAHP weight calculation example analysis
This paper presents the process of determining the weight of each index of public relations risk, and the weights of other risk factors can be determined sequentially.
The relevant data come from an entire construction project carried out by a Sichuan group in a community in Chengdu. First, the expert scoring method is used to fill in the proportional scale table for the public relations risk factors of the construction project, which gives the judgment matrix A. Applying formula (4) to A then gives the corresponding fuzzy consistent judgment matrix. Table 1 shows the construction period and cost information of the Sichuan group's project in a community in Chengdu. According to Eqs (1) and (2), the construction period risk and cost risk values are obtained. Table 2 shows the weights of the relevant indicators at all levels.
To sum up, the risk assessment values of other projects can be obtained by the above methods. For the risk prediction of construction projects, the sample data should reflect the internal laws of the projects as much as possible while taking into account their own characteristics. After obtaining the actual conditions of 40 construction projects, 10 experts evaluated the risk factors of the projects against the evaluation indicators, and a total of 40 sets of data were obtained. For brevity, Table 3 gives the EW-FAHP comprehensive weights of the indicators at all levels for 15 groups of construction projects.

The basic principles of convolutional neural networks
With the development of neural networks, convolutional neural networks have been applied in more and more fields. They are currently widely used in visual image analysis, natural language processing and recommendation systems [25], but they have not yet been effectively applied to the risk prediction of construction engineering projects. The CNN (Convolutional Neural Network) is an extension of the DNN (Deep Neural Network). It mainly includes an input layer, an output layer, convolutional layers and pooling layers. Each convolution kernel of a CNN extracts only one feature; multiple features are extracted by multiple convolution kernels and then integrated in the fully connected layer [11,12]. The 1D-CNN network structure mainly includes five parts: input layer, convolutional layer, pooling layer, fully connected layer and output layer. The input one-dimensional information vector passes through the convolutional layer and the pooling layer, and the corresponding output is finally obtained through the fully connected layer.
(1) Convolutional layer: Suppose the input signal of the 1D-CNN model is x with length N, and a convolution kernel performs the convolution operation on local regions of the input signal. In the convolution formula, $\mathrm{Ker}^{k}_{L_1}$ represents the convolution kernel of the $k$-th layer, whose length is $L_1$; $*$ denotes the convolution operation; $x^k_i$ is the $i$-th input sub-segment (which has the same length as the convolution kernel); $b^k_i$ is the offset of the $i$-th convolution output of the $k$-th layer; $y^k_i$ is the convolution output of the $k$-th layer; and $st_1$ is the convolution step length, with $y^k = [y^k_1, y^k_2, \cdots, y^k_{N/st_1}]$. After the convolution operation the data are processed non-linearly by an activation function $s$ applied to $y^k$. This article uses ReLU, the mainstream activation function in deep learning, which can accelerate model convergence and overcome gradient dispersion.
(2) Pooling layer: The pooling layer reduces the amount of computation and the risk of overfitting by reducing the number of parameters of the neural network. The pooling operation is usually maximum pooling (max-pooling), as shown in Formula (14), which yields position-independent features and reduces the sequence length.
In the equation, $j = 1, 2, \cdots, \frac{N}{L_2\, st_1}$, $s^j_t$ represents the $t$-th value of the $j$-th pooling segment, $a^j$ represents the maximum value of the $j$-th pooling segment, and $L_2$ represents the length of the pooling segment. The values $a^j$ together form the output vector $a$ of the pooling layer.
(3) Fully connected layer: The fully connected layer has the same structure as the traditional neural network and is composed of multiple hidden layers. The fully connected layer further abstracts and combines the global timing features to produce the output, where $w_o$ and $b_o$ are the weight and bias of the fully connected layer, respectively.
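The three layers described above can be sketched as a single forward pass in NumPy. This is an illustrative sketch rather than the paper's Keras model: it assumes one convolution kernel, no padding (so the convolution output has floor((N - L1) / stride) + 1 values instead of N / st_1), non-overlapping pooling segments, and a single linear fully connected layer. As in most deep learning frameworks, the "convolution" is computed as a sliding dot product. All names and sample values are hypothetical.

```python
import numpy as np


def conv1d(x, kernel, bias, stride):
    """Slide the kernel over x and take a dot product at each position."""
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    l1 = len(kernel)
    starts = range(0, len(x) - l1 + 1, stride)
    return np.array([np.dot(kernel, x[i:i + l1]) + bias for i in starts])


def relu(y):
    """ReLU activation: negative values are set to zero."""
    return np.maximum(y, 0.0)


def max_pool1d(y, size):
    """Maximum of each non-overlapping segment of length `size`."""
    n_seg = len(y) // size            # trailing values that do not fit are dropped
    return y[:n_seg * size].reshape(n_seg, size).max(axis=1)


def forward(x, kernel, bias, stride, pool, w_o, b_o):
    """Convolution -> ReLU -> max pooling -> fully connected output."""
    a = max_pool1d(relu(conv1d(x, kernel, bias, stride)), pool)
    return a @ w_o + b_o


# Hypothetical input of 8 index weights and two outputs (period risk, cost risk)
x = np.arange(1.0, 9.0)
w_o = np.array([[0.1, 0.0], [0.2, 0.1], [0.3, 0.2]])
out = forward(x, [-1.0, 0.0, 1.0], 0.0, 1, 2, w_o, np.zeros(2))
print(out)
```

With this kernel every convolution output equals x[i+2] - x[i] = 2, so the pooled vector is [2, 2, 2] and the two outputs are weighted sums of it.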

Parameter training of 1D-CNN
Similar to the traditional artificial neural network, the parameter training process of the CNN uses the back propagation algorithm [25], and the training process is shown in Fig 4.
1) Initialize the network parameters: construct a network model with appropriate unit depth according to the actual requirements of the construction project risk value samples and predictions, and determine the network parameters (such as learning rate, number of iterations, step length, etc.);
2) Input the risk value samples of construction projects into the network, and obtain the error between the network output and the expected target through forward propagation;
3) Determine whether the network converges: if the network does not converge, go to step 4, otherwise go to step 5;
4) Back propagation and weight modification: using the BP (Back Propagation) algorithm, the error obtained in step 2 is propagated backward layer by layer to each node, and the weights are updated. Repeat steps 2-4 until the network converges;
5) Determine whether the current network meets the actual requirements according to the recognition accuracy on the test samples: if it meets the requirements, perform step 6, otherwise return to step 1 and modify the network parameters;
6) Output the average absolute error of the construction project risk and the predicted values of construction period risk and cost risk.
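Steps 1 to 4 of the procedure above can be sketched as a plain gradient descent loop with back propagation. To stay short, the sketch trains a small fully connected network with one ReLU hidden layer instead of a CNN, uses mean squared error as the loss, and treats "loss below a tolerance" as the convergence test; all of these are simplifying assumptions, and the function name `train`, the hyperparameters and the synthetic data are hypothetical.

```python
import numpy as np


def train(X, y, hidden=8, lr=0.05, max_epochs=2000, tol=1e-4, seed=0):
    """Gradient descent with back propagation for a one-hidden-layer network."""
    rng = np.random.default_rng(seed)
    # Step 1: initialise the network parameters
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    losses = []
    for _ in range(max_epochs):
        # Step 2: forward propagation and error against the target
        z = X @ W1 + b1
        h = np.maximum(z, 0.0)
        err = (h @ W2 + b2) - y
        loss = float(np.mean(err ** 2))
        losses.append(loss)
        # Step 3: convergence check
        if loss < tol:
            break
        # Step 4: propagate the error backward and update the weights
        g_pred = 2.0 * err / err.size
        g_w2, g_b2 = h.T @ g_pred, g_pred.sum(axis=0)
        g_z = (g_pred @ W2.T) * (z > 0)
        g_w1, g_b1 = X.T @ g_z, g_z.sum(axis=0)
        W1 -= lr * g_w1
        b1 -= lr * g_b1
        W2 -= lr * g_w2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), losses


# Synthetic stand-in data: 30 samples, 6 inputs, 2 outputs
rng = np.random.default_rng(1)
X = rng.random((30, 6))
y = X @ np.linspace(0.05, 0.3, 12).reshape(6, 2)
params, losses = train(X, y)
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.6f}")
```

Steps 5 and 6 correspond to evaluating the trained parameters on held-out samples and, if the accuracy is insufficient, calling `train` again with different settings.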

Simulation analysis
Operation process of prediction model
The corresponding model parameters are changed to make the output prediction result converge. Finally, the corresponding prediction results are obtained.

Simulation conditions
The experimental test platform in this article runs Windows 10 Professional 64-bit with an i7 9850H processor (CPU) at a main frequency of 2.6 GHz and 2×8 GB of memory (RAM). The software framework is the Keras deep learning tool backed by TensorFlow, which enables rapid model construction and experimental development. The number of simulation samples is 40, of which 30 are training samples and 10 are verification samples. Table 4 gives the parameter settings of the engineering project risk prediction model based on the one-dimensional convolutional neural network (taking cost risk as an example); the settings for construction period risk are similar and are not listed.

Analysis of experimental results
Through the prediction of construction project period risk and cost risk, the construction unit can measure the risk of the project and make reasonable decisions based on the predicted value to ensure that the project will not suffer losses.

PLOS ONE
The loss values of the training set and the verification set keep decreasing and gradually approach each other, indicating that the learning process of the CNN converges. By performing steps 5 and 6, the predicted values of project duration risk and cost risk can be obtained.
In the experimental results, the mean absolute errors (MAE) of construction project duration risk and cost risk are shown in Table 5. The MAE describes the average deviation between the predicted and actual risk values: the smaller the value, the higher the prediction accuracy of the model and the more effectively the risk can be predicted.
According to the results in Table 5, the training set MAE of the final results obtained by the CNN prediction model is below 0.002, and the verification set MAE is below 0.001. The proposed model converges and is able to predict risks with high accuracy, meeting the engineering application requirements of construction project risk prediction. Table 6 compares the actual and predicted risk values for the 10 projects No. 6-15 in Table 3.
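For reference, the mean absolute error used as the accuracy index can be computed as in the short sketch below. The function name and the sample values are hypothetical and are not taken from Table 5 or Table 6.

```python
import numpy as np


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))


# Hypothetical risk values for three projects
actual_risk = [0.10, 0.20, 0.30]
predicted_risk = [0.11, 0.19, 0.30]
mae = mean_absolute_error(actual_risk, predicted_risk)
print(f"MAE = {mae:.6f}")
```

Because the MAE is expressed in the same units as the risk values themselves, a threshold such as 0.001 can be read directly as an average error in the predicted risk.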

Sensitivity analysis based on input variable disturbance
To identify, among the many uncertain factors, the sensitive factors that have an important impact on the construction period and cost indicators, and to measure the degree of that impact, this article further analyzes the influence of six sensitive factors (economic, environmental, technical, social, public relations and natural risks) on the construction period and cost indicators. A neural network model needs only the input and output data and no prior knowledge. By training on the data set, it simulates the nonlinear relationships in the data with a large number of simple artificial neurons and adaptively adjusts the connection weights between them, establishing a network structure that better reflects the true situation of the data. Therefore, this paper changes a single sensitive factor by +5% and -5% while keeping the other five factors unchanged, inputs the result into the prediction model, and ranks the factors by the size of the change in the output index. Table 7 shows the prediction results when a single sensitive factor changes by +5% and -5%. As can be seen from Table 7, the sensitivity coefficients of the six factors, in the order listed above, are 0.1404, 0.0067, 0.1284, 0.0906, 0.1274 and 0.0636 for construction period risk, and 0.0596, 0.0088, 0.0442, 0.0255, 0.0433 and 0.0016 for cost risk. For cost risk, the order of sensitivity is therefore economic, technical, public relations, social, environmental and natural risk. For construction period risk the first four places are the same, but natural risk (0.0636) ranks ahead of environmental risk (0.0067).
Therefore, comprehensive prediction results and analysis of sensitivity factors show that, for a Sichuan group's construction project in a community in Chengdu, efforts should be made to resolve economic risks, technical risks, and public relations risks before the project starts, so as to avoid project delays and economic losses.
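The one-factor-at-a-time perturbation described in this section can be sketched as follows. The sensitivity coefficient is assumed here to be the average relative change of the output under a +5% and a -5% change of one input, divided by the 5% input change; the paper's exact definition is not stated, so this is only one plausible choice. The stand-in linear model, the function name and the input values are hypothetical.

```python
import numpy as np

FACTORS = ["economic", "environmental", "technical",
           "social", "public relations", "natural"]


def sensitivity(model, x, delta=0.05):
    """Perturb each input by +/-delta while holding the others fixed."""
    x = np.asarray(x, dtype=float)
    base = model(x)
    coeffs = []
    for i in range(len(x)):
        up, down = x.copy(), x.copy()
        up[i] *= 1 + delta
        down[i] *= 1 - delta
        change = (abs(model(up) - base) + abs(model(down) - base)) / 2
        coeffs.append(change / abs(base) / delta)   # relative output / relative input
    return np.array(coeffs)


def stand_in_model(x):
    """Hypothetical linear predictor used in place of the trained 1D-CNN."""
    return float(np.dot([0.30, 0.02, 0.25, 0.15, 0.20, 0.08], x))


coeffs = sensitivity(stand_in_model, np.ones(6))
ranking = [FACTORS[i] for i in np.argsort(-coeffs)]
print(dict(zip(FACTORS, np.round(coeffs, 4))))
print(ranking)
```

Replacing `stand_in_model` with the trained predictor and calling `sensitivity` once per output would give separate rankings for construction period risk and cost risk.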

Comparative analysis of predictive model performance
To verify the effect of the 1D-CNN risk prediction model proposed in this article, BP (back propagation), SVM (Support Vector Machine) and ELM (Extreme Learning Machine) models were selected to predict the construction project duration risk and cost risk for comparison. The sample data are the 10 sets of data for projects No. 6-15 in Table 3, and the predictions are compared with the real values. Table 8 shows the relevant information of a Sichuan group company in the construction of a residential district in Chengdu, as well as the prediction results of the various prediction models. It can be seen from Table 8 and Fig 8 that the CNN risk prediction curve has the smallest error and lies closest to the true values, indicating that the CNN risk prediction model proposed in this paper has better prediction accuracy than the other risk prediction models.
The prediction model proposed in this paper can be extended in future research. Its prediction results demonstrate that the method has strong practicability in the early stage of the project and during construction. Compared with other commonly used prediction algorithms, the prediction accuracy is significantly improved, which has great reference value for project decision-makers.

Conclusion
This paper proposes a project risk prediction model based on EW-FAHP and one-dimensional convolutional neural network. By selecting the risk evaluation index of construction project, the corresponding risk value is established by combining the EW-FAHP, and then the risk value of construction project is input into the established one-dimensional convolution neural network model for training and learning. The construction project duration risk and cost risk are selected as the output units of the neural network risk prediction. The experimental results show that: (1) The EW-FAHP weight calculation method proposed in this paper realizes the combination of subjective and objective weighting method and reduces the influence of human factors on the weight. At the same time, the one-dimensional convolutional neural network has strong reliability and high accuracy in the prediction of construction project duration risk and cost risk, which can meet the engineering application conditions.
(2) Given a certain number of samples, the loss value decreases with the number of iterations during neural network training, and the network converges. This verifies that the risk prediction model has high stability, can provide a reasonable basis for project managers' early decision-making and can effectively reduce risks. In addition, sample data are difficult to obtain; if more construction project data can be obtained by drawing on relevant big data, the prediction accuracy of the 1D-CNN risk prediction model will be further improved, and the prediction results will be more convincing.
In practical engineering applications, the approach can be aimed at different types of risk prediction requirements, such as investment risk, traffic risk, coal mine safety and disease risk. After the relevant data are collected and sorted, EW-FAHP or another combined subjective and objective weighting method is applied to determine the comprehensive weights of the risks affecting the prediction results. The comprehensive weight data are then input into the 1D-CNN prediction model for learning and prediction, and the prediction results are likewise of great reference significance.