Fe-based superconducting transition temperature modeling by machine learning: A computer science method

Searching for new high-temperature superconductors has long been a key research issue. Fe-based superconductors attract researchers' attention because of their high transition temperature, strong irreversibility field, and excellent crystallographic symmetry. New types of Fe-based superconductors are synthesized by varying doping methods and dopant levels. The transition temperature is the key indicator of whether a new superconductor is a high-temperature superconductor. However, the conditions for measuring the transition temperature are strict, and the measurement process is hazardous. There is a strong relationship between the lattice parameters and the transition temperature of Fe-based superconductors. To avoid the difficulties of measuring the transition temperature, in this paper we adopt a machine learning method to build a model that predicts the transition temperature of Fe-based superconductors from the lattice parameters. The model results agree with the available transition temperatures, with an accuracy of 91.181%. The proposed model can therefore be used to predict unknown transition temperatures of Fe-based superconductors.


Introduction
Superconductors, which exhibit zero resistance and the Meissner effect, have significant practical applications [1]. The best-known application is in magnetic resonance imaging (MRI) systems, which are widely used by health care professionals for detailed internal body imaging. Other prominent applications include frictionless magnetically levitated trains and electrical power transmission with no energy loss [2][3][4][5]. However, superconductors are superconducting only at or below their transition temperature [6], which holds back their widespread application.
Researchers have been conducting an extensive search for novel superconductors, especially those with a high transition temperature. Known high-temperature superconductors include cuprate superconductors containing CuO2 planes [7][8][9][10], MgB2 [11], hydride superconductors under extreme pressure [12][13][14][15][16][17][18], and Fe-based superconductors [19]. In particular, Fe-based superconductors have a high transition temperature, next only to that of cuprates, an upper critical field above 50 T, a relatively strong irreversibility field, and a high crystallographic symmetry [20], which attract the attention of researchers. In exploring the factors that influence the transition temperature of Fe-based superconductors, a strong relationship between the transition temperature and the lattice parameters has been found [21][22][23][24][25][26]. According to composition and crystal structure, Fe-based superconductors are divided into four categories: ReFeAsO (Re = rare earth elements) (1111 system), AFe2As2 (A = K, Sr, Ba, etc.) (122 system), LiFeAs (111 system), and FeSe (11 system). At present, one of the main research directions for Fe-based superconductors is to improve their transition temperature via various doping methods and dopant levels [27,28]. The transition temperature is the key indicator of whether a new superconductor is a high-temperature superconductor. However, measuring the transition temperature requires high-precision devices, including temperature controllers, constant current sources, and voltmeters, which ordinary laboratories cannot provide. In addition, liquid nitrogen (77 K) must be handled manually during the measurement, which poses safety risks. Superconductors with stricter temperature requirements mainly depend on liquid helium (4.2 K) as the refrigerant. Because the equipment for liquefying helium is very complicated, and the temperature of liquid helium is close to absolute zero, measuring the transition temperature is very difficult.
Machine learning (ML) is a branch of artificial intelligence that is still growing and evolving, and it is an active field in data science. One application of ML is data mining. In past decades, ML algorithms and theory have advanced considerably, aided by the availability of useful data and robust computing infrastructure. Data mining is now being applied rapidly in superconducting materials science. Examples include using a Gaussian regression algorithm to predict physical parameters of superconductors [29][30][31][32][33][34][35][36][37][38]; using support vector regression [39], the random forest algorithm [40], and the XGBoost model [41] to predict high-temperature superconductor candidates; and using a GMDH-type neural network [42] to predict hysteresis loops of superconductors. The back propagation (BP) algorithm has excellent complex pattern classification and multi-dimensional function mapping capabilities, and it is applied in function fitting, data analysis, and prediction. To avoid the strict measurement conditions and the risks of the transition temperature measurement process, in this paper we adopt a machine learning method to build a model that predicts the transition temperature of Fe-based superconductors from the lattice parameters.

Description of BP algorithm
The BP algorithm is a type of error back propagation algorithm. Error back propagation consists of forward propagation and feedback based on error signals. The forward propagation direction is input layer → hidden layer → output layer. The state of each layer of neurons only affects the state of the next layer of neurons. If the expected value is not obtained in the output layer, error signals propagate back. The back propagation direction is output layer → hidden layer → input layer. By adjusting the weights and thresholds of each layer, the error decreases along the negative gradient direction. The weights and thresholds are iterated continuously until the error meets the precision requirements. The algorithm process is shown in Fig 1.
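The forward direction described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the sigmoid activation, the layer sizes, the input values, and the random weights are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation (an assumption; the paper does not name the activation)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, v, gamma, w, theta):
    """Propagate one input vector: input layer -> hidden layer -> output layer."""
    b = sigmoid(x @ v - gamma)       # hidden state depends only on the input layer
    y_hat = sigmoid(b @ w - theta)   # output state depends only on the hidden layer
    return b, y_hat

rng = np.random.default_rng(0)
d, q, l = 3, 5, 1                    # hypothetical layer sizes
v, gamma = rng.normal(size=(d, q)), rng.normal(size=q)   # input->hidden weights, thresholds
w, theta = rng.normal(size=(q, l)), rng.normal(size=l)   # hidden->output weights, thresholds

b, y_hat = forward(np.array([0.2, 0.5, 0.9]), v, gamma, w, theta)
print(b.shape, y_hat.shape)
```

Each layer is computed from the previous layer only, which is what allows the error signal to be sent back through the same layers in reverse order.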

Calculation of BP algorithm
Assume a three-layer network with a $d$-dimensional input, an $l$-dimensional output, and a $q$-dimensional hidden layer, as shown in Fig 2. In the network, the threshold of the $j$-th neuron in the output layer is $\theta_j$, the threshold of the $h$-th neuron in the hidden layer is $\gamma_h$, the weight between the $i$-th neuron in the input layer and the $h$-th neuron in the hidden layer is $v_{ih}$, and the weight between the $h$-th neuron in the hidden layer and the $j$-th neuron in the output layer is $w_{hj}$.
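The input of the $h$-th neuron in the hidden layer is:
$$\alpha_h = \sum_{i=1}^{d} v_{ih} x_i \tag{1}$$
The input of the $j$-th neuron in the output layer is:
$$\beta_j = \sum_{h=1}^{q} w_{hj} b_h \tag{2}$$
where $b_h = f(\alpha_h - \gamma_h)$ is the output of the $h$-th neuron in the hidden layer. For a training example $(x_k, y_k)$, the output of the network is $\tilde{y}_k = (\tilde{y}^k_1, \tilde{y}^k_2, \ldots, \tilde{y}^k_l)$, with
$$\tilde{y}^k_j = f(\beta_j - \theta_j) \tag{3}$$
where $f(\cdot)$ is an activation function. The mean square error is:
$$E_k = \frac{1}{2} \sum_{j=1}^{l} \left(\tilde{y}^k_j - y^k_j\right)^2 \tag{4}$$
where $y^k_j$ is the actual value. The BP algorithm is an iterative algorithm, and the updating formula of any parameter $v$ is:
$$v \leftarrow v + \Delta v \tag{5}$$
The update of the weight $w_{hj}$ between the hidden layer and the output layer is:
$$\Delta w_{hj} = -\eta \frac{\partial E_k}{\partial w_{hj}} \tag{6}$$
where $\eta$ is the learning rate. According to the chain rule:
$$\frac{\partial E_k}{\partial w_{hj}} = \frac{\partial E_k}{\partial \tilde{y}^k_j} \cdot \frac{\partial \tilde{y}^k_j}{\partial \beta_j} \cdot \frac{\partial \beta_j}{\partial w_{hj}} \tag{7}$$
According to the definition of $\beta_j$:
$$\frac{\partial \beta_j}{\partial w_{hj}} = b_h \tag{8}$$
According to formulas (3) and (4):
$$g_j = -\frac{\partial E_k}{\partial \tilde{y}^k_j} \cdot \frac{\partial \tilde{y}^k_j}{\partial \beta_j} = \left(y^k_j - \tilde{y}^k_j\right) f'(\beta_j - \theta_j) \tag{9}$$
The updating formulas of the weights and thresholds are:
$$\Delta w_{hj} = \eta\, g_j\, b_h \tag{10}$$
$$\Delta \theta_j = -\eta\, g_j \tag{11}$$
$$\Delta v_{ih} = \eta\, e_h\, x_i \tag{12}$$
$$\Delta \gamma_h = -\eta\, e_h \tag{13}$$
In formulas (12) and (13):
$$e_h = -\frac{\partial E_k}{\partial b_h} \cdot \frac{\partial b_h}{\partial \alpha_h} = f'(\alpha_h - \gamma_h) \sum_{j=1}^{l} w_{hj}\, g_j \tag{14}$$
By continuously iterating the weights $w_{hj}$ and $v_{ih}$, as well as the thresholds $\theta_j$ and $\gamma_h$, the accuracy of the network continues to improve. The performance of the trained network is evaluated by the mean absolute error (MAE), the root mean square error (RMSE), and the correlation coefficient (CC).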
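The update of the weights and thresholds for one training example can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the activation is assumed to be the sigmoid (so that f'(z) = f(z)(1 - f(z))), and the layer sizes, learning rate, and toy data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bp_step(x, y, v, gamma, w, theta, eta=0.5):
    """One BP update on a single example (x, y); parameters are modified in place."""
    b = sigmoid(x @ v - gamma)              # hidden outputs b_h
    y_hat = sigmoid(b @ w - theta)          # network outputs
    g = y_hat * (1 - y_hat) * (y - y_hat)   # output-layer term g_j (sigmoid assumed)
    e = b * (1 - b) * (w @ g)               # hidden-layer term e_h, using w before its update
    w += eta * np.outer(b, g)               # delta w_hj     =  eta * g_j * b_h
    theta -= eta * g                        # delta theta_j  = -eta * g_j
    v += eta * np.outer(x, e)               # delta v_ih     =  eta * e_h * x_i
    gamma -= eta * e                        # delta gamma_h  = -eta * e_h
    return 0.5 * np.sum((y_hat - y) ** 2)   # error E_k measured before the update

rng = np.random.default_rng(0)
d, q, l = 2, 4, 1                           # hypothetical layer sizes
v, gamma = rng.normal(size=(d, q)), np.zeros(q)
w, theta = rng.normal(size=(q, l)), np.zeros(l)

# Toy data (hypothetical): targets scaled into (0, 1) to suit a sigmoid output.
X = rng.uniform(size=(20, d))
Y = 0.1 + 0.8 * X[:, :1]

errors = []
for epoch in range(500):
    errors.append(sum(bp_step(x, y, v, gamma, w, theta) for x, y in zip(X, Y)))
print(errors[0], errors[-1])
```

In practice the loop would stop once the summed error falls below a chosen tolerance; a fixed number of epochs is used here only to keep the sketch short.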

Data description
Data were obtained from Japan's National Institute for Materials Science (NIMS) at http://supercon.nims.go.jp/index_en.html. After data processing, 203 sets of data were collected. To strengthen data relevance, the data contain only four representative types of Fe-based superconductors, namely 11, 111, 122, and 1111. The spatial distribution of the data, relating the transition temperature to the lattice parameters, is displayed in Fig 3. The data fall roughly into four groups, corresponding to the four types of Fe-based superconductors. Each group shows a certain degree of discreteness and a non-linear relationship, which meets the modeling requirements.
The transition temperature data are visualized in Fig 4; they are discrete and show no aggregation. The data are distributed over 0-60 K, which is consistent with the transition temperature range of Fe-based superconductors. A statistical analysis of the transition temperature (maximum, minimum, mean, variance, standard deviation (std), range, median, coefficient of variation, and skewness) is presented in Table 1. The coefficient of variation is 70.03%, indicating that the transition temperature is well dispersed. The skewness is greater than zero, indicating that the data above the mean are more scattered than the data below the mean.
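The statistics listed above can be computed as in the following sketch. The input values are made up for illustration and are not taken from the data set; the use of the sample standard deviation and of the moment-based skewness are assumptions, since the text does not say which estimators were used.

```python
import numpy as np

def describe(tc):
    """Descriptive statistics of a list of transition temperatures (in K)."""
    tc = np.asarray(tc, dtype=float)
    mean = tc.mean()
    std = tc.std(ddof=1)                                  # sample std (assumption)
    skew = np.mean((tc - mean) ** 3) / tc.std() ** 3      # moment-based skewness (assumption)
    return {
        "max": tc.max(),
        "min": tc.min(),
        "mean": mean,
        "variance": std ** 2,
        "std": std,
        "range": tc.max() - tc.min(),
        "median": float(np.median(tc)),
        "cv_percent": 100.0 * std / mean,                 # coefficient of variation
        "skewness": skew,
    }

# Hypothetical values, NOT the 203 samples used in the paper.
tc_demo = [4.0, 8.0, 15.0, 18.0, 26.0, 38.0, 55.0]
stats = describe(tc_demo)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

A positive skewness, as in this toy sample, means the upper tail is longer than the lower one.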

Model accuracy
The 203 sets of data are divided at random into 2/3 training data and 1/3 testing data, and the model is trained on the training data. The regression between the actual and estimated transition temperatures obtained while training the model is presented in Fig 5, with an accuracy of 91.181%, which indicates reasonable accuracy and good generalization. The performance of the model is shown in Table 2. The MAE and CC are 0.47265 and 85.44%, respectively, indicating that the predictions closely match the actual values.
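A random 2/3 to 1/3 split and the MAE, RMSE, and CC measures can be sketched as below. The function names, the seed, and the rounding of the training-set size are assumptions made for the example; the text does not state how the split was rounded, and it does not define the accuracy figure, so that measure is not reproduced here.

```python
import numpy as np

def split_two_thirds(n_samples, seed=0):
    """Randomly split sample indices into 2/3 training and 1/3 testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = round(2 * n_samples / 3)   # rounding rule is an assumption
    return idx[:n_train], idx[n_train:]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def cc(y_true, y_pred):
    """Pearson correlation coefficient between actual and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

train_idx, test_idx = split_two_thirds(203)
print(len(train_idx), len(test_idx))
```

The three measures would be evaluated on the held-out indices, for example `mae(y[test_idx], model_prediction[test_idx])`, where `y` and `model_prediction` are hypothetical arrays of actual and predicted transition temperatures.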

Model stability
To further assess the prediction stability, the model performance measures over 5 prediction runs are listed in Table 3. All predictions generally maintain the high accuracy obtained on the training sample. The std values of the MAE, RMSE, and accuracy are 0.0245, 0.05162, and 0.7057%, respectively, meaning that the prediction errors are within a controllable range and that the model has good prediction stability.
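The spread of each measure over repeated runs can be computed as in this short sketch. The numbers are placeholders for illustration only and are not the entries of Table 3; the use of the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np

def metric_spread(runs):
    """Sample standard deviation of each metric across repeated runs."""
    return {name: float(np.std([run[name] for run in runs], ddof=1))
            for name in runs[0]}

# Placeholder values, NOT the results reported in the paper.
runs = [
    {"MAE": 0.45, "RMSE": 0.60, "accuracy": 91.0},
    {"MAE": 0.47, "RMSE": 0.66, "accuracy": 90.2},
    {"MAE": 0.50, "RMSE": 0.71, "accuracy": 89.6},
    {"MAE": 0.46, "RMSE": 0.62, "accuracy": 90.9},
    {"MAE": 0.48, "RMSE": 0.68, "accuracy": 90.1},
]
spread = metric_spread(runs)
print(spread)
```

A small spread relative to the mean of each measure is what the text refers to as good prediction stability.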

Comparisons with previous studies
In Table 4, the performance of our BP model is compared with that of two models from previous studies, RF (Random Forest) [43] and MLR (Multi-variable Linear Regression) [44]. Our BP model has the best performance in terms of CC and accuracy. In addition, our BP model is more straightforward than the others from the perspective of computation and implementation.

Fe-based superconductors prediction
To test the feasibility and validity of the new model, 10 Fe-based superconductors that are not included in the data used to train the model were selected from the literature [45][46][47][48][49][50]. They cover the four types of Fe-based superconductors, and their transition temperatures lie in the range 4.1-53.5 K. We input the lattice parameters of each Fe-based superconductor into the model and obtain the corresponding predicted transition temperature. The results are presented in Table 5 and visualized in Fig 6. The predicted values are close to the actual values, and the transition temperature of Fe-based superconductors can therefore be estimated from the lattice parameters.
Table 1. Statistical analysis of Tc in Fig 4.

Conclusion
In this paper, we used a machine learning method to predict the transition temperature of Fe-based superconductors from the lattice parameters. By training the BP algorithm on the available data, the model reached an acceptable accuracy of 91.181%. We measured the performance over repeated predictions to assess the stability of the model, and the model errors were within a controllable range. We then used the trained model for prediction, and the predicted values were close to the actual values. These results suggest that the model can estimate the transition temperature of Fe-based superconductors with reasonable accuracy, and it is therefore recommended for predicting the transition temperature of Fe-based superconductors.