Risk assessment of earthquake network public opinion based on global search BP neural network

Background: The article proposes a network public opinion risk assessment model for earthquake disasters, which can provide effective support for emergency departments in China. Method: It uses the accelerated genetic algorithm (AGA) to improve the BP neural network. The article selects 10 indexes for assessing the risk of earthquake network public opinion by means of principal component analysis (PCA) and cumulative contribution (CC). It designs a BP algorithm to measure the risk degree of earthquake network public opinion and uses the AGA to optimize the parameters of the BP model. Results: The experimental results show that the global error of the improved BP model is 7.12×10^-4, a reduction of 22.35% relative to the conventional BP model, which indicates that the improved BP model has advantages in convergence speed and evaluation accuracy. Conclusion: The risk assessment method of network public opinion can be used in the practice of earthquake disaster decision-making.


Introduction
By June 2018, China's internet population had reached 802 million, and the internet penetration rate was 54.3%. The popularity of the internet makes it convenient for users to express their attitudes and views, and these attitudes and views may develop into network public opinion. Such opinion spreads through network media and can, to a greater or lesser degree, promote the event. In existing studies, the physical attributes of the event itself, including motivation, sensitivity and promotion factors, were often overlooked. The external factor indexes were mostly derived from quantitative indexes, while some kinds of qualitative indexes were ignored; for example, the key uncertain indexes of network public opinion in an emergency. The other problem was the lack of research on comprehensive risk evaluation of network public opinion under earthquake disasters: current studies mostly concern the distribution, linear fitting and prediction of hot spots. A few studies focused on monitoring indexes and evaluation approaches, but these monitoring indexes were mostly derived from social attributes and barely involved physical attribute indexes. Furthermore, the evaluation method in many articles is relatively simple, and many researchers neglected the nonlinear, high-dimensional and non-normal character of the index data, which may cause a significant deviation between the result and the actual situation. Given the characteristics of network public opinion, the BP method is suited to the problem of risk assessment. The BP neural network is trained by a negative-gradient optimization algorithm; it is adaptive, self-organizing, fault-tolerant and robust, and it is easy to implement on a computer.
It can effectively overcome the limitations that multi-index, nonlinear, high-dimensional and non-normal data impose on the risk evaluation of earthquake network public opinion; moreover, it can ensure the reliability of the assessment results because it can approximate any square-integrable nonlinear continuous function to arbitrary mean-squared accuracy. This article proposes a set of risk assessment indexes that covers the whole public opinion assessment cycle from the aspects of both physical and social attributes. It adopts the BP neural network to evaluate the risk of network public opinion against the background of earthquake disasters, and uses the AGA to improve the BP neural network in order to increase the reliability of the assessment results.

Determination of index
The spread of network public opinion usually requires a certain process, and the occurrence and expansion of risk are mainly driven by the carrier of public sentiment, its subject, and the event itself. Network media is the carrier of earthquake public opinion and is the condition factor. Network users are the subject of earthquake public opinion and are the promoting factor. The earthquake disaster is the induction factor. The condition factor and the promoting factor belong to the social attributes of the diffusion and propagation of earthquake network public opinion, while the induction factor belongs to the physical attributes; the sensitivity factor can then have a driving effect on network public opinion. In designing the risk indexes, this study focused on both the physical and the social attributes of earthquake network public opinion.
(1) The physical attributes of earthquake network public opinion. These are determined by the damage degree of the earthquake. Generally speaking, the heavier the damage, the higher the social influence, and the stronger the diffusion ability of earthquake network public opinion. There are two types of indexes for assessing the physical attributes: damage level and disaster degree. The damage level can be measured by the indexes of earthquake magnitude and central intensity, and the disaster degree can be measured by fatalities, affected area, property loss, etc.
(2) The social attributes of earthquake network public opinion. These can be measured by indexes of emergency response ability, network media force and diffusion ability. Emergency response ability has a direct impact on relieving disaster losses and the negative emotions of network users; it mainly includes emergency resources, the reaction ability of the government, the ability of monitoring and early warning, etc. Network media force mainly consists of the numbers of original topics, comments and forwarding, etc. Diffusion ability consists of the ratio of clicks to responses and the change rate of original topics.
This article first put forward 13 indexes drawn from physical and social attributes. To verify the independence, correlation and representativeness of the indexes, the article adopted expert scoring to obtain the initial data, used Cronbach's alpha to measure reliability, and used the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's sphericity test to check the validity of the questionnaire. Cronbach's alpha was 0.871, so the questionnaire is reliable; the KMO value was 0.743 and the significance of Bartlett's sphericity test was 0.000 (P < 0.001), which indicated that the questionnaire was suitable for analysis. The Bartlett result also suggested that some of the 13 indexes overlapped in what they explained, so the article deleted the index "the affected scope", whose relevancy was > 0.6. The correlation of the indexes was then tested, and indexes with correlation < 0.651, such as the central intensity of the earthquake, were deleted. Secondly, principal component analysis was used to determine the explanatory power of the indexes. Two results were needed: the contribution rate and the cumulative contribution rate of each component. The criterion was that the components whose cumulative contribution rate reached 85% or more were taken as the final evaluation indexes (as shown in Table 1).
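The reliability check and the 85% cumulative contribution criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the score matrix is assumed to be laid out as experts by indexes, the principal components are taken from the correlation matrix, and the KMO and Bartlett tests are not covered. How the article maps retained components back to individual indexes is not stated, so the sketch only counts components.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) score matrix."""
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances / total_variance)

def cumulative_contribution(scores: np.ndarray):
    """Contribution rate and cumulative contribution rate of each component."""
    corr = np.corrcoef(scores, rowvar=False)
    # eigvalsh returns ascending eigenvalues; reverse to descending order
    eigvals = np.clip(np.linalg.eigvalsh(corr)[::-1], 0.0, None)
    contribution = eigvals / eigvals.sum()
    return contribution, np.cumsum(contribution)

def components_needed(cumulative: np.ndarray, threshold: float = 0.85) -> int:
    """Smallest number of components whose cumulative rate reaches the threshold."""
    return int(np.searchsorted(cumulative, threshold) + 1)

if __name__ == "__main__":
    # Made-up expert scores (30 experts x 13 indexes), for illustration only
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(30, 3))
    scores = latent @ rng.normal(size=(3, 13)) + 0.3 * rng.normal(size=(30, 13))
    _, cumulative = cumulative_contribution(scores)
    print(cronbach_alpha(scores), components_needed(cumulative))
```

The demo data are synthetic; with real questionnaire scores the same two functions would give the alpha value and the number of components to retain.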
In Table 1, the data on earthquake emergency response ability were determined by experts using the Delphi method. The data on network media force came from Beijing Qingbo Big Data Technology Co., Ltd., which collected posts, comments and forwarding across 6 network media, such as WeChat, microblogs, web pages, news articles, the client and the BBS [21]. The remaining data came from the qualitative scores of experts, the public opinion data system and official websites in China. The amounts of original topics, forwarding and comments were counted within 30 days after the earthquake. In addition, the raw data for this manuscript come from the research team's questionnaires, interviews and measurements.

The risk level
The risk level standard of earthquake network public opinion consists of two parts: setting the risk levels and confirming the interval value of each index. This article referred to international conventions and Chinese studies, and divided the risk level standard of earthquake network public opinion into 4 levels: v = {Ⅳ, Ⅲ, Ⅱ, Ⅰ} = {higher risk, high risk, general risk, low risk}, valued as 4 for "higher risk", 3 for "high risk", 2 for "general risk" and 1 for "low risk".

Table 1. Network public opinion risk indexes.
Physical attributes
  Damage level
    Earthquake magnitude: according to the official data.
    Incidence of secondary disaster: estimated from historical data as the occurrence probability of M4+ aftershocks: 30% for a main shock of M6, 60% for a main shock of M7, and 90% for a main shock above M8.
  Disaster degree
    Casualties: according to the official data.
    Property loss: according to the official data.
Social attributes

To confirm the interval value of each index, the article chose "The Chinese public events database" built by the public opinion research laboratory of Shanghai Jiaotong University as the data source [22]. The study selected 150 cases that occurred in the 21st century with magnitudes between 3 and 9. The qualitative indicators were assigned according to Table 1 with data obtained through questionnaires, and the quantitative indexes were assigned according to objective data for the specific period; among them, the risk standards for earthquake magnitude, casualties and property loss follow the "act of China earthquake emergency preparedness". The risk levels of the remaining indexes were determined on the basis of the research result of Jiang J C (1996) [23]. In addition, the data on earthquake network public opinion indexes are usually uncertain, and the risk level standard of each index is difficult to determine, so it is more reasonable to express the risk level of each index with fuzzy interval numbers. For that reason, the four-point risk partition method was used to determine the risk level standard, as follows. Firstly, the values of $z(i)$ are arranged in ascending order; $z(i)_{max}$ is taken as the upper limit value, $z(i)_{min}$ as the lower limit value, and the median $z(i)_{med}$ as the general risk value. Secondly, the median $z'(i)_{med}$ between $z(i)_{min}$ and $z(i)_{med}$ and the median $z''(i)_{med}$ between $z(i)_{med}$ and $z(i)_{max}$ are found, as shown in Table 2.
Furthermore, according to Table 2, we calculated the standard interval value of each indicator at each risk level, as shown in Table 3.
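The four-point partition can be illustrated with a short sketch. Two points are assumptions rather than statements from the article: the "median between" two values is read here as the median of the observed values lying in that range, and the four resulting intervals are mapped to levels I to IV in ascending order, which only makes sense for indexes where a larger value means a higher risk. The function names are hypothetical.

```python
import statistics

def four_point_partition(values):
    """Return (z_min, z'_med, z_med, z''_med, z_max) for one index."""
    z = sorted(values)
    z_min, z_max = z[0], z[-1]
    z_med = statistics.median(z)
    # Assumed reading: medians of the lower and upper halves of the data
    lower_half = [v for v in z if v <= z_med]
    upper_half = [v for v in z if v >= z_med]
    return (z_min, statistics.median(lower_half), z_med,
            statistics.median(upper_half), z_max)

def risk_intervals(values):
    """Map the five breakpoints to four risk-level intervals (assumed order)."""
    b = four_point_partition(values)
    levels = ["I (low risk)", "II (general risk)",
              "III (high risk)", "IV (higher risk)"]
    return {level: (b[k], b[k + 1]) for k, level in enumerate(levels)}

if __name__ == "__main__":
    # Illustrative values only, not data from the article
    print(risk_intervals([3.1, 4.0, 4.7, 5.2, 5.9, 6.4, 7.0, 7.8, 8.0]))
```

For indexes where a larger value lowers the risk, such as emergency response ability, the level order would need to be reversed.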

The AGA-BP assessment model
The slow learning speed of the BP neural network and the existence of local minima affect the network's prediction ability [24][25][26]. To solve these problems, the AGA was used to improve the conventional BP neural network, because the AGA can optimize the network parameters of BP. The optimized parameters are then used as the initial values of the BP algorithm. This can effectively enhance the BP neural network's extrapolation ability and prevent the network from becoming trapped in a local minimum.

Steps to build the AGA-BP model
In this article, the BP neural network consists of three layers (Liu C X et al, 2017) [27]: the first layer is the data input layer, the second is the hidden layer, and the third is the data output layer. The layers are connected through nodes, which are where data is input and output (as shown in Fig 1).
There is no strict restriction on the number of output layer nodes n or hidden layer nodes m, since a three-layer BP neural network can approximate the mapping between the input layer and the output layer to arbitrary precision. According to experience, the number of hidden layer nodes m should be relatively small; if m is relatively large, the generalization capability and training speed of the BP network are affected. Generally speaking, m is kept within [n, 2n+1] and as small as possible given the permissible precision. In this study, a three-layer BP neural network was selected.
Table 2. Earthquake network public opinion risk rating standard.
Notation: $i$ is the hidden neuron; $\theta_j$ is the threshold of the output neuron; $w_{hi}$ is the connection weight between the input layer neuron and the hidden neuron; $w_{ij}$ is the connection weight between the hidden neuron and the output neuron; $x$ is the input point; $y$ is the output point.
Step 1: Acquire random sample points of the indexes. To train the BP neural network, it is first necessary to generate dimensionless random sample data for the risk grades of network public opinion under earthquake disaster. After the optimal parameters are trained on the random sample data, the BP model is tested again with the actual sample data. To acquire the random sample data, this study followed the research results of Wang S (2006) and used uniform random numbers to generate sample values within the range of variation of each index, expressed as $x'(k,j)$, while the standard value of the risk level is expressed as $y(k) = i$. In order to fully reflect the information of the boundary values, the boundary value of every index is used only once, and the risk grade value assigned to it is the arithmetic mean of the two risk grade values adjacent to that boundary. In this way the sample series $\{x'(k,j), y(k)\}$, $k = 1 \sim n_k$, $j = 1 \sim n_j$, of the risk assessment standard of earthquake network public opinion is obtained, where $n_k$ is the number of samples. Further, the dimension of the indicators must be eliminated to make the evaluation model general; the indexes are made dimensionless by formulas (1)~(3).
Step 2: Initialize the BP model. The input and output sample values of earthquake network public opinion used for machine learning are set as $\{x_{hk}, d_k \mid h = 1,2,\ldots,n;\ k = 1,2,\ldots,N\}$. The connection weights and threshold values between the nodes are initialized on the interval (-1, 1).
Step 3: Set k = 1. The inputs and outputs $\{x_{hk}, d_k\}$ are provided to the network, where $h = 1,2,\ldots,n$ and $k = 1,2,\ldots,N$.
Step 4: Calculate the input $x_i$ and output $y_i$ of each node of the hidden layer, and then the input $x_j$ and output $y_j$ of the output layer node:
$$x_i = \sum_{h=1}^{n} w_{hi} x_{hk} + \theta_i, \qquad y_i = \frac{1}{1 + e^{-x_i}}, \qquad (i = 1, 2, \ldots, n) \tag{4}$$
Step 5: Calculate the rate of change of the single-sample error $E_k = 0.5(y_j - d_k)^2$ with respect to the total input received by the output layer node, and the rate of change of $E_k$ with respect to the total input received by each hidden layer node.
Step 6: Amend the weight and threshold of each connection, where $m$ is the number of corrections, $\eta$ is the learning rate with $\eta \in (0,1)$, and $\alpha$ is the momentum factor with $\alpha \in (0,1)$.
Step 7: Set k = k+1 and go to step 3 until all N sample points have been trained; after that, go to step 9.
Step 8: Go to step 2 and perform a new round of learning until the network global error function is less than a pre-set value or the number of learning rounds exceeds a pre-set value. The optimal values of $\theta_i$, $\theta_j$, $w_{hi}$ and $w_{ij}$ of the BP network must be determined so as to minimize the global error function of Eq (7) and, at the same time, to stabilize the weights and thresholds of the network connections at all levels. However, the slow learning speed of the BP neural network and the existence of local minima affect the network's extrapolation ability to a great extent. This study therefore uses the AGA to optimize the BP neural network parameters, which are then taken as the initial values of the BP neural network to overcome this shortcoming. The optimization steps are as follows:
1. Construct the change interval of the BP neural network parameters. Let $c_j$ be the value of any network parameter at the point where training with the BP algorithm has slowed in convergence. The change interval of $c_j$ is $[a_j, b_j]$, where $a_j = c_j - d|c_j|$, $b_j = c_j + d|c_j|$, and $d$ is a positive constant.
2. Code the BP neural network parameters. Let $e$ be the code length, which divides the interval $[a_j, b_j]$ into $2^e - 1$ subintervals, so that the entire network parameter space is discretized into $(2^e)^p$ grid points, where $p = 2n^2 + n + 1$. Each grid point is regarded as an individual and corresponds to one possible state of the $p$ parameters, represented by $p$ binary numbers of $e$ bits each. The network parameters, grid points, individuals and the $p$ binary numbers therefore correspond to one another.
3. Randomly generate the initial parent group and evaluate the fitness of each parent individual. From the $(2^e)^p$ grid points above, $n$ points are randomly selected as the initial parent group. The global error function value $E_i$ of the network is obtained by substituting the i-th individual into Eq (7). The smaller $E_i$ is, the fitter the individual.
4. Selection and hybridization of the parents. The parents are ranked by their objective function values, and the first few individuals in the ranking are called excellent individuals. Construct a function $p_i$ that is inversely related to $E_i$, with $p_i > 0$ and $p_1 + p_2 + \cdots + p_n = 1$; the i-th individual is selected with probability $p_i$. In this way two groups, each containing $n$ individuals, are selected; individuals from the two groups are randomly paired into $n$ pairs, the binary arrays of each pair are exchanged at an arbitrary position, and two groups of offspring individuals are obtained.

5. The variation of offspring individuals. One group of offspring individuals is randomly selected from the hybrids of the parents. Two arbitrary values of each binary array are flipped with the mutation rate, that is, the original value changes from 0 to 1 or vice versa.
6. The iteration. The $n$ offspring individuals from step 5 are taken as the new parents. The algorithm returns to the parent fitness evaluation step (step 3) and enters the next generation of evolution.
7. Cyclical acceleration. The range of variation of the parameters of the excellent individuals produced by the first and second evolutionary iterations is taken as the new initial range of the parameters; the algorithm then returns to the network parameter coding step (step 2), and the process is repeated. In this way the parameter change interval of the excellent individuals gradually shrinks and the distance to the optimum becomes smaller and smaller, until the given number of accelerations is reached.
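The seven optimization steps above can be sketched compactly as follows. This is an illustrative simplification under several assumptions, not the authors' code: `error_fn` stands in for the global error function of Eq (7) and is assumed non-negative, the selection probability is taken as the normalized reciprocal of the error, crossover is one-point and keeps a single child per pair rather than two offspring groups, and the names `decode` and `aga_minimize` are hypothetical. The defaults mirror the control parameters reported in the article (code length 10, 300 parents, 10 excellent individuals, mutation rate 1.0, 4 accelerations).

```python
import numpy as np

def decode(bits, low, high):
    """Map e-bit binary codes, shape (..., p, e), to grid points in [low, high]."""
    e = bits.shape[-1]
    place = 2 ** np.arange(e - 1, -1, -1)
    return low + (high - low) * (bits @ place) / (2 ** e - 1)

def aga_minimize(error_fn, low, high, e=10, q=300, s=10,
                 p_m=1.0, n_accel=4, seed=0):
    rng = np.random.default_rng(seed)
    low, high = np.array(low, float), np.array(high, float)
    p = low.size
    best_x, best_err = None, np.inf
    for _cycle in range(n_accel):
        bits = rng.integers(0, 2, size=(q, p, e))      # random initial parents
        excellent = []
        for _generation in range(2):                   # two iterations per cycle
            x = decode(bits, low, high)
            err = np.array([error_fn(xi) for xi in x])
            order = np.argsort(err)
            excellent.append(x[order[:s]])             # keep excellent individuals
            if err[order[0]] < best_err:
                best_err, best_x = float(err[order[0]]), x[order[0]].copy()
            prob = 1.0 / (err + 1e-12)                 # inversely related to error
            prob /= prob.sum()
            group_a = bits[rng.choice(q, size=q, p=prob)].reshape(q, -1)
            group_b = bits[rng.choice(q, size=q, p=prob)].reshape(q, -1)
            cut = rng.integers(1, p * e, size=q)       # one-point crossover
            child = np.where(np.arange(p * e) < cut[:, None], group_a, group_b)
            for row in np.flatnonzero(rng.random(q) < p_m):
                pos = rng.choice(p * e, size=2, replace=False)
                child[row, pos] ^= 1                   # flip two bits
            bits = child.reshape(q, p, e)
        pool = np.vstack(excellent)
        low, high = pool.min(axis=0), pool.max(axis=0)  # shrink the interval
    return best_x, best_err

if __name__ == "__main__":
    target = np.array([0.3, -0.2])
    print(aga_minimize(lambda v: float(np.sum((v - target) ** 2)),
                       [-1, -1], [1, 1]))
```

In the setting of the article, the returned parameter vector would be unpacked into the weights and thresholds used to initialize the BP network.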
Step 9: Take the untested evaluation values of each single evaluation model in period K as the input sample and feed them into the trained network. The output value of the network, after the inverse of the normalization, is taken as the combined evaluation value $F_k$.
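The BP training loop in steps 2 to 8 can be sketched as below. It is a minimal illustration using the textbook delta rule with momentum, which is an assumption: the article's own update formulas and its global error function Eq (7) are not reproduced here, so the global error is taken as the sum of the single-sample errors $E_k$. The function names are hypothetical, the network has a single output node, and the targets are assumed to be already normalized into (0, 1).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_bp(X, d, n_hidden, eta=0.1, alpha=0.1, epochs=1000, tol=1e-3, seed=0):
    """Train a three-layer BP network sample by sample; return parameters and error."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    # Step 2: weights and thresholds initialized on (-1, 1)
    w_hi = rng.uniform(-1, 1, (n_in, n_hidden))
    th_i = rng.uniform(-1, 1, n_hidden)
    w_ij = rng.uniform(-1, 1, n_hidden)
    th_j = rng.uniform(-1, 1)
    dw_hi, dth_i = np.zeros_like(w_hi), np.zeros_like(th_i)
    dw_ij, dth_j = np.zeros_like(w_ij), 0.0
    E = np.inf
    for _ in range(epochs):
        E = 0.0
        for x, dk in zip(X, d):                        # Steps 3 and 7
            y_i = sigmoid(x @ w_hi + th_i)             # Step 4, hidden layer
            y_j = sigmoid(y_i @ w_ij + th_j)           # Step 4, output node
            E += 0.5 * (y_j - dk) ** 2
            delta_j = (y_j - dk) * y_j * (1 - y_j)     # Step 5: dE_k/dx_j
            delta_i = delta_j * w_ij * y_i * (1 - y_i) # Step 5: dE_k/dx_i
            # Step 6: gradient step plus momentum term
            dw_ij = -eta * delta_j * y_i + alpha * dw_ij
            dth_j = -eta * delta_j + alpha * dth_j
            dw_hi = -eta * np.outer(x, delta_i) + alpha * dw_hi
            dth_i = -eta * delta_i + alpha * dth_i
            w_ij += dw_ij
            th_j += dth_j
            w_hi += dw_hi
            th_i += dth_i
        if E < tol:                                    # Step 8: stop criterion
            break
    return (w_hi, th_i, w_ij, th_j), E

def predict(params, X):
    w_hi, th_i, w_ij, th_j = params
    return sigmoid(sigmoid(X @ w_hi + th_i) @ w_ij + th_j)
```

In an AGA-BP setup the random initialization in step 2 would be replaced by the parameter vector found by the genetic search, and the output of `predict` would be mapped back to the 1 to 4 risk scale by inverting the normalization.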

Control parameter setting
In the AGA-BP model algorithm, the control parameters are set as follows:
The learning rate of the BP neural network: η = 0.1.
The momentum coefficient of the BP neural network: α = 0.1.
The coding length of the AGA: e = 10.
The mutation rate of the AGA: p_m = 1.0.
The number of parent individuals: q = 300.
The number of excellent individuals: s = 10.
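For reproducibility, these settings can be collected in one place. The sketch below is only a convenience container; the class and field names are hypothetical, and the values are the ones listed above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgaBpConfig:
    learning_rate: float = 0.1   # eta, BP learning rate
    momentum: float = 0.1        # BP momentum coefficient
    code_length: int = 10        # e, bits per encoded parameter
    mutation_rate: float = 1.0   # p_m, AGA mutation rate
    n_parents: int = 300         # q, parent individuals per generation
    n_excellent: int = 10        # s, excellent individuals kept

    def grid_points_per_parameter(self) -> int:
        """An e-bit code splits each interval into 2^e - 1 subintervals, i.e. 2^e points."""
        return 2 ** self.code_length

if __name__ == "__main__":
    config = AgaBpConfig()
    print(config, config.grid_points_per_parameter())
```

With a code length of 10, each parameter interval is sampled on 1,024 grid points.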

Selection of sample data
The samples of this study came from "earthquake cases in China", from which 6 earthquake cases were selected (as shown in Table 4). Earthquake magnitude, casualties and property loss came from official websites in China. The validation has two stages. In the first stage, the study uses random data to train the AGA-BP model and obtain the optimal parameters, and compares the AGA-BP model with the BP neural network in terms of training precision. In the second stage, the 6 samples are loaded into the AGA-BP model to evaluate their risk degree.

Parameters training and assessment of AGA-BP model
The training of random sample data in the AGA-BP model. According to the earthquake network public opinion risk standard in Table 3 and step 1, 31 groups of sample data (groups 1~31) were randomly generated, as shown in Table 5.
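The random sample generation described in step 1 can be sketched as follows. Several points are assumptions: the interval array is a hypothetical stand-in for Table 3, adjacent levels are assumed to share a boundary value, and min-max scaling is used as a placeholder for formulas (1)~(3), which may differ. With 7 random samples per level plus 3 boundary samples, the sketch happens to yield 31 groups, although the article does not state how its 31 groups were composed.

```python
import numpy as np

def generate_samples(intervals, per_level, seed=0):
    """intervals: shape (n_levels, n_indexes, 2), holding [low, high] per level and index."""
    rng = np.random.default_rng(seed)
    intervals = np.asarray(intervals, float)
    n_levels, n_indexes, _ = intervals.shape
    xs, ys = [], []
    for level in range(n_levels):
        low, high = intervals[level, :, 0], intervals[level, :, 1]
        # Uniform random values inside the interval of this risk level
        xs.append(rng.uniform(low, high, size=(per_level, n_indexes)))
        ys.append(np.full(per_level, level + 1.0))
    for level in range(n_levels - 1):
        # Each boundary value is used once, labeled with the mean of the
        # two adjacent risk grades (for example 1.5 between levels 1 and 2)
        xs.append(intervals[level, :, 1][None, :])
        ys.append(np.array([level + 1.5]))
    return np.vstack(xs), np.concatenate(ys)

def min_max_normalize(x, x_min, x_max):
    """Placeholder dimensionless transform; the article's formulas may differ."""
    return (x - x_min) / (x_max - x_min)

if __name__ == "__main__":
    # Made-up intervals for two indexes and four levels, for illustration only
    demo = np.array([[[l, l + 1], [10 * l, 10 * (l + 1)]] for l in range(4)], float)
    X, y = generate_samples(demo, per_level=7)
    print(X.shape, np.unique(y))
```

The normalized inputs and the level labels, scaled into the output range of the network, would then serve as the training set.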
Formulas (1)~(3) were applied to the sample data to eliminate the differences in the measurement units of the indexes, and the normalized sample data were then input into the AGA-BP model for learning and training. The number of training iterations was 1,000, and the number of optimization accelerations of the AGA-BP model was 4. The calculation results are shown in Table 5. The random sample data were also input into the BP neural network, and the training accuracy is shown in Fig 2. Fig 2 shows that the global error of the AGA-BP evaluation model after 1,000 training iterations is 0.000712, which meets the convergence requirement, while the global error of the BP neural network is 0.000917. Compared with the actual risk level, the accuracy of the AGA-BP model is higher than that of the BP neural network. At this point, the stable weight values and thresholds of the AGA-BP model are shown in Table 6.
Case assessment. After 1,000 training iterations, the thresholds and weights of each layer of the AGA-BP model tend to be stable and the accuracy nearly satisfies the requirements, which shows that the AGA-BP model can evaluate the risk of earthquake network public opinion. The article then evaluates the 6 cases in Table 4 with the AGA-BP model, the BP neural network and the logistic curve to show the advantage of the AGA-BP model. Some parameters are set as follows: the number of training iterations is 15,000 for both the AGA-BP model and the BP neural network, the number of optimization accelerations is 4, and the other parameters are unchanged. The logistic curve still uses the fitting method for risk assessment. The results are shown in Table 7.
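As a quick arithmetic check on the two global errors reported for the training stage, the relative reduction can be computed directly. The helper name is hypothetical; the two error values are the ones stated for Fig 2.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline

bp_error = 0.000917       # global error of the BP neural network
aga_bp_error = 0.000712   # global error of the AGA-BP model

print(f"{relative_reduction(bp_error, aga_bp_error):.2%}")
```

The result is roughly 22%, in line with the error reduction figure given in the abstract.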

Result analysis
The results of the AGA-BP model on the risk assessment show that: 1. Earthquake magnitude, fatalities, property loss, posts, comments and forwarding are the key indicators that affect the risk level of earthquake network public opinion. The training results on the random data show that 4 indexes, including emergency resources, the capability of monitoring and early warning, and the comprehensive satisfaction degree of the victims, have a low influence on the risk level of earthquake network public opinion.
2. The evaluation results of the AGA-BP model are consistent with the actual results, and the model is more feasible than the BP neural network. In Table 7, the BP neural network evaluates the Wenchuan earthquake as high risk, which does not match the actual situation: although the network public opinion of the Wenchuan earthquake spread rapidly, the numerous negative public opinions were contained effectively and in time by the government and experts.

Conclusions
Establishing a risk evaluation system for earthquake network public opinion is an essential task for improving the efficiency and capacity of emergency response. This study leads to the following conclusions.
(1) This article put forward risk monitoring indexes for earthquake network public opinion. According to the emergence features and propagation law of earthquake network public opinion, the primary risk monitoring indexes were set around the physical and social attributes. After all indexes were tested for reliability and validity, the primary indexes were filtered through principal component analysis (PCA) and the cumulative contribution rate; finally, the article put forward 10 risk evaluation indexes for earthquake network public opinion. The verification results show that the 10 indexes can effectively evaluate the risk of earthquake network public opinion.
(2) The AGA-BP model is superior to the BP neural network. The verification results show that the AGA-BP model is superior to the BP model in convergence speed, parameter optimization and preventing premature convergence. The AGA-BP model can therefore be used in the practice of earthquake network public opinion risk management.
(3) The accuracy of the AGA-BP model is higher than that of the BP neural network. Training samples and example verification show that the BP neural network easily falls into a local optimum, and its accuracy is lower than that of the AGA-BP model. In addition, the training time of the BP model is longer than that of the AGA-BP model.
In the future, the authors will carry out research on network public opinion risk prediction.