## Abstract

Modern automation systems largely rely on closed loop control, wherein a controller interacts with a controlled process via actions, based on observations. These systems are increasingly complex, yet most deployed controllers are linear Proportional-Integral-Derivative (PID) controllers. PID controllers perform well on linear and near-linear systems but their simplicity is at odds with the robustness required to reliably control complex processes. Modern machine learning techniques offer a way to extend PID controllers beyond their linear control capabilities by using neural networks. However, such an extension comes at the cost of losing stability guarantees and controller interpretability. In this paper, we examine the utility of extending PID controllers with recurrent neural networks, namely General Dynamic Neural Networks (GDNNs); we show that GDNN (neural) PID controllers perform well on a range of complex control systems and highlight how they can be a scalable and interpretable option for modern control systems. To do so, we provide an extensive study using four benchmark systems that represent the most common challenges in control engineering. All control environments are evaluated with and without noise as well as with and without disturbances. The neural PID controller performs better than standard PID control in 15 of 16 tasks and better than model-based control in 13 of 16 tasks. As a second contribution, we address the lack of interpretability that prevents neural networks from being used in real-world control processes. We use bounded-input bounded-output stability analysis to evaluate the parameters suggested by the neural network, making them understandable for engineers. This combination of rigorous evaluation paired with better interpretability is an important step towards the acceptance of neural-network-based control approaches for real-world systems. 
It is furthermore an important step towards interpretable and safely applied artificial intelligence.

**Citation:** Günther J, Reichensdörfer E, Pilarski PM, Diepold K (2020) Interpretable PID parameter tuning for control engineering using general dynamic neural networks: An extensive comparison. PLoS ONE 15(12): e0243320. https://doi.org/10.1371/journal.pone.0243320

**Editor:** Yanzheng Zhu, National Huaqiao University, CHINA

**Received:** June 30, 2020; **Accepted:** October 26, 2020; **Published:** December 10, 2020

**Copyright:** © 2020 Günther et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability:** All relevant data are within the manuscript.

**Funding:** The author(s) received no specific funding for this work.

**Competing interests:** The authors have declared that no competing interests exist.

## 1 Introduction

Modern production engineering involves increasingly complex physical processes [1]. The physical processes underlying cutting-edge production engineering cannot be appropriately expressed with simple models [2]. New, more complex classical control methods are being developed, but this very complexity makes them challenging for control engineers to design and apply [3]. Limited by both the number of skilled control engineers and their cost, the production engineering industry has widely chosen to continue to use simple, understandable linear controllers. While these controllers are easy to set up and adjust, they are not suitable for the complex non-linear behaviour of the processes they are expected to control. Using these simple controllers comes at a cost: it means processes will require close monitoring and human assistance whenever the system changes in an unforeseen way. In the face of increasingly complex systems, both control-engineer-designed methods and closely-monitored simple controllers fail to scale.

One way to bridge this gap between simple controllers and complex control systems is by applying modern machine learning techniques [4]. Extending the capabilities of well-accepted and widely used controllers with machine learning yields a potential solution to the lack of scalability and adaptability in existing control approaches. In this paper, we will investigate the use of neural networks to adapt the parameters of a Proportional-Integral-Derivative (PID) controller not only before deployment but online during ongoing control. This continuous adaptation to the process would allow linear controllers to perform well for any control task, as the controller constantly linearizes its behaviour around the state the system is currently in. As our investigation shows, this results in superior control performance and disturbance rejection.

The PID controller is one of the most widely used controllers [5]. A PID controller calculates its control output *u* based on the current error *e*, the error derivative d*e*/d*t*, and the error integrated over time ∫*e* d*t*. Each error measure is multiplied by its corresponding constant—*K*_{P} with the error, *K*_{I} with the integrated error, and *K*_{D} with the error derivative—and then summed: *u* = *K*_{P} *e* + *K*_{I} ∫*e* d*t* + *K*_{D} d*e*/d*t*. A PID controller is tuned with respect to the system to be controlled by adjusting these three constants. While this controller is simple and well understood, its advantages come at the price of limited capabilities. PID controllers perform well only on linear systems or systems that are linearized. As PID controllers are usually adjusted before deployment, they neither handle disturbances nor varying system dynamics well enough to meet the needs of modern production engineering.
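The control law just described can be sketched in a minimal discrete-time form; the plant, gains, and time step below are illustrative, not taken from the paper:

```python
class PID:
    """Discrete PID controller: u = Kp*e + Ki*integral(e) + Kd*de/dt."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        self.integral += error * self.dt                  # rectangular integration
        derivative = (error - self.prev_error) / self.dt  # backward difference
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Usage: regulate a first-order plant dv/dt = -v + u to the setpoint 1.0
pid, v, dt = PID(kp=2.0, ki=1.0, kd=0.1, dt=0.01), 0.0, 0.01
for _ in range(2000):
    u = pid.step(1.0 - v)
    v += dt * (-v + u)   # Euler step of the illustrative plant
```

The integral term drives the steady-state error to zero, which is why a pure P controller would settle below the setpoint on this plant while the PI(D) controller does not.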

Several prior studies have shown that neural networks can improve performance when used to tune the parameters of traditional PID controllers online. Simple feed-forward networks can adjust the PID parameters in multiple settings [6, 7]. More sophisticated neural networks like strictly recurrent, diagonal [8] and quasi-diagonal [9] recurrent networks have also been investigated. However, these specific design choices for the neural network’s architecture result in very specific behaviour and learning [10] and thus are unlikely to generalize well over different control environments. To evaluate whether the use of neural networks with all possible connections is applicable to control engineering tasks in general, a broad range of different control environments is required. In this paper, we therefore implement and evaluate general dynamic neural networks, thus extending the current state of the art.

Previous works have demonstrated the applicability of neural-network-based parameter tuning, but only for one well-defined control task at a time, e.g. pendulums [7], two-tank systems [11] or magnetic systems [6]. As no single approach has been applied successfully to multiple representative control engineering problems in the literature, the effectiveness of the suggested approaches is hard to estimate. For this reason, we investigate our approach on four different systems that represent the most common challenges in control engineering, resulting in an extensive comparison. By comparing the control performance on all four different tasks, we extend the current literature which focused on one problem at a time. Additionally, our simulations are closer to real-world conditions than those used in previous work, as they include disturbances and noise.

Despite their potential to improve control performance, neural networks are not widely used for parameter tuning in real-world control systems. One major barrier prevents their adoption: due to the lack of interpretability, the suggested PID parameters cannot be evaluated for their appropriateness. As a result, the control loop itself loses one of its most important properties—its stability guarantees. Previous papers that utilized neural networks for PID parameter tuning did not address the effect on input-output stability, thus ignoring a key concern in control engineering. The domain of control engineering demands interpretability to ensure system safety [12]. As neural networks introduce a black box to the system [13], their use for PID parameter tuning is not yet widely accepted. Real-world use requires the ability to check and reason about the parameter choices a neural network outputs when granted access to a PID controller’s parameters. We suggest that the use of neural networks will in fact enable designers to achieve better performance without significantly increasing the design complexity or diminishing design interpretability when deploying a controller—if combined with traditional control engineering tools like stability analysis.

To summarize our contributions: as a first contribution of this paper, we advance the use of specifically tailored neural network architectures by investigating the use of General Dynamic Neural Networks (GDNNs) for online parameter tuning in PID controllers. These neural PID controllers are evaluated on four different closed loop control engineering tasks. Each task represents a different, common challenge in control engineering, namely non-linear behaviour, unstable equilibrium, dead time, and chaotic behaviour. These applications have furthermore been used for evaluation of control at several machine learning venues, e.g. [14–16]. We compare the performance of our neural approach with a standard PID controller, which acts as a baseline, and with a system-appropriate model-based controller, which should provide the best possible performance. To evaluate the robustness of our approach, each comparison is performed with and without significant sensor noise and with and without disturbance. All controllers are evaluated quantitatively for all scenarios, making this study unique in its comprehensiveness when compared to prior work.

As a second contribution, we demonstrate how input-output stability analysis—a classic analysis that is well known in control engineering—can be used in a novel, online way to explain the effects of parameter tuning. This analysis is done for the control system, exhibiting chaotic behaviour as an example of a challenging control task. This assessment provides an explanation of when the system is stable with respect to the PID parameters, making the neural network outputs understandable to control engineers. While input-output stability is an important aspect of every closed-loop control approach, it is widely ignored in the existing literature that suggests the use of neural networks for PID parameter tuning. This paper therefore addresses key attributes of safe and interpretable artificial intelligence.

## 2 General dynamic neural network for PID parameter tuning

The aim of this work is to extend the classic PID controller framework, composed of the PID controller and the plant, with as few changes to the physical setup of the control system as possible. Maintaining the core structure of the classic PID control framework allows for easy adoption to existing industrial applications. This structure is preserved by restricting the neural network’s inputs to signals which are already available in the closed loop control setting. The closed loop control setting is depicted in Fig 1(a). The neural network’s inputs are the control system’s output *v* and the control error *e*, while the outputs are the three PID parameters to tune, i.e. *K*_{P}, *K*_{I}, and *K*_{D}. At each time step, the neural network computes the PID parameters based on the observations and passes them into the PID controller. The PID controller then uses these parameters to compute the control output *u*.

A neural network is integrated into standard closed loop control (a). The neural network receives the system output and the error as input and outputs the three PID parameters *K*_{P}, *K*_{I}, and *K*_{D}. The double-lined arrows indicate that the associated variable could be a vector, while the single-lined arrows indicate scalar variables. An example neural network (b) shows one possible set of connections. All networks have 9 neurons in three layers. The output of each input neuron is fed back into the input layer with a delay, denoted as *g*^{−1}, of one time step.

To ensure fast online computation with limited hardware, the neural networks implemented in this investigation are restricted to one hidden layer with four neurons. Furthermore, for control systems that can be described by simple differential equations, increasing the number of neurons would lead to overfitting. The network architectures differ only in additional recurrent or feedback connections. An example can be seen in Fig 1(b). The figure shows a neural network where the output of each neuron in the first layer is fed back as part of the first layer's input with a delay, denoted as *g*^{−1}, of one time step. For all neurons, the activation function tanh is chosen, following the rationale of [17].
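As a concrete illustration of such an architecture, the sketch below implements a small network in the spirit of Fig 1(b): two inputs (system output *v* and error *e*), an input layer whose own outputs are fed back with a one-step delay, four hidden tanh neurons, and three outputs for the PID parameters. The layer sizes follow the paper; the exact feedback wiring and weight shapes are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

class GDNNSketch:
    """Minimal sketch of a dynamic network in the spirit of Fig 1(b):
    the input layer's outputs are fed back, delayed one step (g^-1)."""
    def __init__(self, n_in=2, n_hidden=4, n_out=3):
        # Weights drawn with mean 0 and std 1, as in the paper
        self.W_in = rng.normal(0, 1, (n_in, n_in + n_in))  # fresh inputs + delayed feedback
        self.W_h = rng.normal(0, 1, (n_hidden, n_in))
        self.W_out = rng.normal(0, 1, (n_out, n_hidden))
        self.delayed = np.zeros(n_in)                      # g^-1 memory

    def step(self, v, e):
        x = np.concatenate(([v, e], self.delayed))
        a_in = np.tanh(self.W_in @ x)   # input layer with recurrent feedback
        self.delayed = a_in             # stored for the next time step
        a_h = np.tanh(self.W_h @ a_in)
        return self.W_out @ a_h         # K_P, K_I, K_D

net = GDNNSketch()
kp, ki, kd = net.step(v=0.2, e=0.8)
```

Because of the delayed feedback, the network is stateful: feeding it the same observation twice generally produces different parameter suggestions, which is exactly what lets it react to the system's trajectory rather than to a single snapshot.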

The standard approach for training neural networks is backpropagation [18]. Most deep learning approaches adjust the neural network's weights by end-to-end optimization [19]. This optimization involves formulating a loss function that describes the difference between the neural network's outputs and the ideal outputs. From this loss, a gradient with respect to the weights is computed and propagated through the network. At each neuron, the weights are then adjusted to minimize the loss. However, in the present framework, Fig 1(a), backpropagation cannot be naively applied. In this framework, the neural network's outputs are the PID parameters *K*_{P}, *K*_{I}, and *K*_{D}. Using standard backpropagation would therefore require knowing the ideal PID parameters at any given time.

A way to train the neural network without the ideal outputs is to numerically approximate backpropagation. In this work, we chose a numerical Levenberg-Marquardt algorithm [20] to minimize the squared control error. At each time step, the Jacobian matrix, shown in Eq 1, of the neural network weights is numerically approximated using finite differences with respect to small changes in each weight. Each weight is then adapted to decrease the control error.
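The Levenberg-Marquardt update itself takes the familiar damped Gauss-Newton form. A generic sketch follows; the residual and Jacobian callbacks, the damping value, and the curve-fitting example are illustrative, not the paper's implementation:

```python
import numpy as np

def levenberg_marquardt_step(w, residual_fn, jacobian_fn, lam=1e-2):
    """One Levenberg-Marquardt update: w <- w - (J^T J + lam*I)^-1 J^T r."""
    r = residual_fn(w)            # residuals, shape (q,)
    J = jacobian_fn(w)            # Jacobian dr/dw, shape (q, k)
    A = J.T @ J + lam * np.eye(len(w))   # damped normal equations
    return w - np.linalg.solve(A, J.T @ r)

# Usage: fit y = a*x + b by driving the residuals to zero
x = np.linspace(0.0, 1.0, 20)
y = 3.0 * x + 1.0
residuals = lambda w: w[0] * x + w[1] - y
jacobian = lambda w: np.stack([x, np.ones_like(x)], axis=1)
w = np.zeros(2)
for _ in range(50):
    w = levenberg_marquardt_step(w, residuals, jacobian)
```

The damping term `lam` interpolates between Gauss-Newton (small `lam`, fast near a minimum) and gradient descent (large `lam`, robust far from it).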

The Jacobian matrix of a general dynamic neural network with *p* input neurons, *q* dynamical system (plant) outputs and *k* weighted connections was calculated as

$$\mathbf{J}(\mathbf{x};\mathbf{w}) = \begin{bmatrix} \dfrac{\partial v_1}{\partial w_1} & \cdots & \dfrac{\partial v_1}{\partial w_k} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial v_q}{\partial w_1} & \cdots & \dfrac{\partial v_q}{\partial w_k} \end{bmatrix} \qquad (1)$$

where $\mathbf{x} \in \mathbb{R}^p$ is the neural network input vector, $\mathbf{w} \in \mathbb{R}^k$ is the vector of weights which describes the network topology and $v_1, \ldots, v_q$ are the outputs of the dynamical system (plant). According to [21], it is sufficient to calculate the partial derivatives of the system's output instead of the error function.

As the analytic calculation would result in extensive computations, the Jacobian is numerically approximated using a difference equation, rather than a differential one, as the solution of

$$\mathbf{j}_i = \frac{\mathbf{v}(\mathbf{x};\mathbf{w}) - \mathbf{v}\!\left(\mathbf{x};\mathbf{w} - \eta\,\epsilon(w_i)\,\mathbf{e}_i\right)}{\eta\,\epsilon(w_i)} \qquad (2)$$

where $\hat{\mathbf{J}}$ is the approximated Jacobian matrix, **j**_{i} is the *i*-th column of $\hat{\mathbf{J}}$, $\mathbf{e}_i$ is the *i*-th unit vector, *η* is the step size and *ϵ* is the finite-difference step. *ϵ* has to be calculated for each pass, using the equation

$$\epsilon(w_i) = \sqrt{\epsilon_{\min}}\,\max\!\left(1, |w_i|\right) \qquad (3)$$

where *w*_{i} is the *i*-th element of the weight vector **w**, and *ϵ*_{min} is the machine precision of the implementation data type, double precision in this implementation. The complete calculation of the Jacobian matrix can be found in Algorithm 1.

In the industrial control setting we consider in this paper, it is important that finding an appropriate network architecture, i.e. the connections and delays between neurons, requires neither sophisticated engineering nor significant time. Any approach that makes the setup too complicated would defeat the purpose of extending the existing PID framework. We therefore chose to create ten neural network architectures for each control challenge by randomly adding feedback connections and then chose the neural network that performed best. This process of finding an architecture can be made more efficient by using search algorithms [22]. For each tested neural network, the weights were initialized randomly with a mean of zero and a standard deviation of one.

Training data was collected from the differential equations that describe each benchmark system, without any additional noise or disturbances. We followed an approach from classical control engineering by exciting the system with input signals of different lengths and amplitudes—called Amplitude Modulated Probabilistic Random Binary Signals (APRBS) [23]—to collect data samples that sufficiently describe the dynamics of the system [24]. This approach is an extension of using a Dirac impulse for system identification [25]. We collected 35,000 data samples for each benchmark system, and split them into training, validation, and test sets with a ratio of 15%, 30%, and 55%, respectively.
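APRBS-style excitation can be sketched as a piecewise-constant signal with random hold times and random amplitudes; the hold and amplitude ranges below are assumptions, as the paper does not specify its exact values:

```python
import numpy as np

def aprbs(n_samples, dt, hold_range=(0.5, 2.0), amp_range=(-1.0, 1.0), seed=0):
    """Amplitude-modulated PRBS: piecewise-constant excitation with random
    hold times (in seconds) and random amplitudes, for system identification."""
    rng = np.random.default_rng(seed)
    u = np.empty(n_samples)
    i = 0
    while i < n_samples:
        hold = int(rng.uniform(*hold_range) / dt)   # random segment length in samples
        u[i:i + hold] = rng.uniform(*amp_range)     # random amplitude for this segment
        i += hold
    return u

# 35,000 samples at the paper's 0.01 s sample time
u = aprbs(35_000, dt=0.01)
```

Varying both hold time and amplitude excites the system across frequencies and operating points, which is what lets the collected data capture nonlinear dynamics that a single step or impulse would miss.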

**Algorithm 1 Numerical approximation of the Jacobian matrix**.

1: **Input**: Dynamical system (plant) output **v**(**x**; **w**), neural network inputs **x**, weights **w**

2: **Output**: Estimate $\hat{\mathbf{J}}$ of the Jacobian matrix

3: **for each** weight *w*_{i} **do**

4:  Calculate *ϵ*(*w*_{i}) according to Eq 3

5:  **v**_{tmp,1} ← **v**(**x**; **w**), **v**_{tmp,2} ← **v**(**x**; **w** − *ηϵ*(*w*_{i})**e**_{i}) // Compute difference of **v**

6:  **for** *o* = 1 to *q* **do**

7:   $\hat{J}_{o,i}$ ← (*v*_{tmp,1,*o*} − *v*_{tmp,2,*o*})/(*ηϵ*(*w*_{i})) // Backward difference

8: **Return**: $\hat{\mathbf{J}}$
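In Python, Algorithm 1 might be sketched as follows; the per-weight step size follows the assumed form of Eq 3, and the example plant is illustrative:

```python
import numpy as np

def numerical_jacobian(v, x, w, eta=1.0):
    """Backward-difference Jacobian of the plant output v(x, w) with respect
    to the weights w, following the structure of Algorithm 1."""
    eps_min = np.finfo(float).eps                     # machine precision (double)
    q, k = len(v(x, w)), len(w)
    J = np.empty((q, k))
    v_tmp1 = v(x, w)                                  # unperturbed output
    for i in range(k):
        eps = np.sqrt(eps_min) * max(1.0, abs(w[i]))  # per-weight step (assumed form)
        w_pert = w.copy()
        w_pert[i] -= eta * eps                        # backward perturbation of weight i
        v_tmp2 = v(x, w_pert)
        J[:, i] = (v_tmp1 - v_tmp2) / (eta * eps)     # backward difference, all q outputs
    return J

# Usage: a toy "plant" with outputs [w0*x, w1*x^2] has Jacobian [[x, 0], [0, x^2]]
v_fn = lambda x, w: np.array([w[0] * x, w[1] * x ** 2])
J = numerical_jacobian(v_fn, x=2.0, w=np.array([1.0, 2.0]))
```

Each column requires one extra plant evaluation, so the cost grows linearly with the number of weighted connections, which is why the networks are kept small.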

## 3 Experiments

To evaluate the performance of the neural PID controller, we use four typical control problems. Each system offers a different control challenge. Each individual system is controlled with and without noise. The noise is Gaussian white noise with a signal-to-noise ratio (SNR) of 20dB; it corresponds to noisy measurements from sensors and is added to the system output.
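Adding measurement noise at a prescribed SNR amounts to scaling the noise power relative to the signal power; a minimal sketch:

```python
import numpy as np

def add_noise(signal, snr_db=20.0, seed=0):
    """Add Gaussian white noise to a signal at a given SNR (in dB)."""
    rng = np.random.default_rng(seed)
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / 10 ** (snr_db / 10)   # 20 dB -> noise power is 1% of signal power
    return signal + rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)

# Usage: corrupt a clean sensor trace with 20 dB noise
t = np.linspace(0.0, 10.0, 1000)
noisy = add_noise(np.sin(t))
```

At 20 dB the noise power is one hundredth of the signal power, which is substantial for a derivative-based controller, since the D term amplifies high-frequency noise.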

To further evaluate the robustness of the controllers, we subject each system to a suitable disturbance. The term disturbance refers to an unwanted and unexpected system input that results in an increase of the system error. Disturbances can occur due to external influences or due to failures in the system, and they are individual to each kind of system. For each system, the disturbance is increased in size from zero until only one control approach is still able to stabilize the system. Disturbances of this magnitude are then used in the experiments. These disturbances are not included in the training data for the neural PID controller. All experiments are repeated over 50 independent runs to ensure statistical significance. The system differential equations are solved using the Dormand-Prince solver. To simulate the real-world conditions of a discrete sample time, the controller output can be adapted every 0.01s. This intervention time is chosen to represent the limitations of real-world actuators, which cannot adjust their values on an arbitrarily small timescale. The solver is run iteratively for 0.01s, using the result of the former step as initial conditions. During each 0.01s intervention time window, the controller output *u* is kept constant.
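This zero-order-hold scheme, with the controller output recomputed every 0.01 s and held constant in between, can be sketched as below. Fixed-step RK4 replaces the Dormand-Prince solver purely for self-containedness, and the plant and proportional controller are illustrative:

```python
import numpy as np

def rk4_step(f, x, u, dt):
    """One classical Runge-Kutta step of dx/dt = f(x, u) with u held constant."""
    k1 = f(x, u)
    k2 = f(x + dt / 2 * k1, u)
    k3 = f(x + dt / 2 * k2, u)
    k4 = f(x + dt * k3, u)
    return x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate(f, controller, x0, t_end, dt_ctrl=0.01, substeps=10):
    """Zero-order hold: u is recomputed once per dt_ctrl window and held
    constant while the plant is integrated in between."""
    x, dt = np.asarray(x0, float), dt_ctrl / substeps
    trajectory = [x]
    for _ in range(round(t_end / dt_ctrl)):
        u = controller(x)                 # controller acts once per window
        for _ in range(substeps):
            x = rk4_step(f, x, u, dt)     # u held constant within the window
        trajectory.append(x)
    return np.array(trajectory)

# Usage: proportional control of dv/dt = -v + u toward setpoint 1
traj = simulate(lambda x, u: -x + u, lambda x: 5.0 * (1.0 - x[0]), [0.0], t_end=5.0)
```

Holding *u* constant between updates is exactly what makes some continuous-time stability proofs inapplicable in practice, a point the chaotic-loop results below illustrate.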

The training is done on an Intel Core i5-4570 with a 3.2 GHz clock rate, 6 MB of shared L3 cache, 32 GB DDR3 RAM. Once learned, the neural networks run on a Raspberry Pi 3 Model A+.

### Two-tank system

The first system is a nonlinear two-tank system, shown in Fig 2 and described by the differential equations in [26]. The controller has access to a pump, regulating the input, while the measured output is the water level in the second tank. This system is a standard benchmark system in control theory. It corresponds to various industrial processes, e.g. bio-reactors, filtration, and nuclear power plants. There exist a number of control approaches for this system, including direct control via neural networks [27], adaptive output feedback [28], and backstepping [26], which will be used as comparison.

This nonlinear system is widely used to study nonlinear behaviour in control engineering systems. The controller can adjust the amount of water being put into the first tank. The goal is to keep the water level in the lower tank (*v*(*t*) = *x*_{2}(*t*)) at the setpoint.

To evaluate the robustness of the compared control approaches, the two-tank system is disturbed continuously between *t* = 20s and *t* = 40s. As a disturbance, the controller output is set to zero, which would correspond to a drain of the water supply. The water levels in the tank are therefore independent of the control inputs for 20s. The voltage for the pump, *u*, was limited to the range [−500V, 500V] to simulate a pump appropriate to the tank dimensions.

### Inverted pendulum on a cart

The second system is a nonlinear inverted pendulum on a cart, as shown in Fig 3. The system is described by the differential equations in [29]. The control task is to stabilize the inverted pendulum at its unstable equilibrium by applying a force on the cart. The cart’s movement is restricted to 0.5m in either direction. This system is a widespread benchmark system in control theory due to its nonlinearity and unstable equilibrium [29, 30]. Practical applications for inverted pendulums include rocket control during initial stages of flight or keeping a walking robot in an upright position. For comparison, a linear–quadratic regulator (LQR) [31, 32] and a double PID controller [32] are used.

This system is a nonlinear system with an unstable equilibrium. The control task is to move the cart to a predefined position, while keeping the pole up.

The system is disturbed by a force of 8.5N to the pendulum at time *t* = 10s. This disturbance can be interpreted as a strong and unexpected wind condition during the launch of a rocket. The controller output *u* was bounded within [−50N, 50N], which corresponds to a typical actuator of that size.

### System with non-negligible time delay

The third system is a first order linear time invariant (LTI) system with a non-negligible time delay. Time delay is a problem in control theory that is often overlooked when designing controllers [33]. Time delays can result in decreased performance and system instability. The benchmarks for this system are a PID controller [34] and a Smith predictor [35]. Fig 4 demonstrates the delayed system response for an input.

The figure shows the system's response (including the time delay *T*_{D}) to an input.

This system is disturbed by a (dimensionless) disturbance of −5 continuously between *t* = 50s and *t* = 75s. Such a disturbance can be thought of as a temporary blockage in a fluid transport system. The exact system specifications can be found in [34]. The controller output *u* was bounded to the range [−10, 10].

### Chaotic thermal convection loop

The fourth system is a chaotic thermal convection loop, as shown in Fig 5. Its dynamics are described by the equations

$$\begin{aligned}\dot{x}_1 &= p\,(x_2 - x_1)\\ \dot{x}_2 &= -x_2 - x_1 x_3\\ \dot{x}_3 &= -x_3 + x_1 x_2 - \beta + u\end{aligned} \qquad (4)$$

with *p* = 10 and *β* = 6 as appropriate constants [36]. *x*_{1} is a measure of how far the current flow velocity differs from the steady point of the system—if *x*_{1} is zero, the system is at its steady point. *x*_{2} and *x*_{3} are measures of the difference in temperature between the points A and B, as well as C and D in Fig 5, respectively. *u* is the power applied to the heater and serves as the control variable.

This system is an example of chaotic behaviour. The control task is to maintain a constant flow in the inner torus—the flow is measured at the points A and B. Half of the torus that contains the fluid is surrounded by a heating element, the other half is surrounded by a cooling jacket. The control variable is the heating power, applied to the lower half of the torus.

Chaotic behavior may lead to vibrations, oscillations and failure in systems and is therefore an important aspect of control theory. As chaotic behavior is unpredictable, mathematical models are only sufficient to a certain point, hence closed loop control is a desirable approach [37]. Common control approaches for the chaotic thermal convection loop are nonlinear feedback controllers [38] and backstepping [39, 40].

To evaluate the robustness of the applied control approaches, the system is disturbed with 100W continuously between *t* = 5s and *t* = 5.5s. This perturbation can be interpreted as a temporary change in the cooling water temperature. To simulate real actuators with a limited capacity, the controller output *u* is limited to [−100W, 100W], corresponding to an appropriately sized heating element.
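For intuition, the convection-loop dynamics can be integrated numerically. The sketch below assumes a Lorenz-type form with *p* = 10 and *β* = 6; this form is an assumption for illustration, and the exact model is given by Eq 4 and [36]:

```python
import numpy as np

def loop_dynamics(x, u, p=10.0, beta=6.0):
    """Lorenz-type convection-loop dynamics (an assumed form for
    illustration; see Eq 4 for the model used in the paper)."""
    x1, x2, x3 = x
    return np.array([
        p * (x2 - x1),              # flow-velocity deviation
        -x2 - x1 * x3,              # temperature difference A-B
        -x3 + x1 * x2 - beta + u,   # temperature difference C-D, heater input u
    ])

# Uncontrolled (u = 0) trajectory from x = (5, 5, 5): the state stays
# bounded but does not settle at the origin, motivating closed-loop control.
x, dt = np.array([5.0, 5.0, 5.0]), 1e-3
for _ in range(5000):
    x = x + dt * loop_dynamics(x, u=0.0)   # simple Euler integration
```

Running this shows why the paper initializes the system at (5, 5, 5): without control the trajectory drifts away from the desired steady state at the origin rather than converging to it.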

## 4 Results and discussion

The results for all experiments can be found in Table 1. For each system, the mean and variance from 50 independent runs are shown for all controllers in all tested scenarios. The best control approach is highlighted in bold. For all values, a two-sample t-test was performed and a control approach is only considered to be superior for *p* < 0.05. From Table 1, it can be seen that the neural PID controller performs best in 12 scenarios, pairs with the standard PID controller in one and performs second best in two scenarios. To compare control approaches, there are common measures used in control engineering, e.g. rise time, overshoot, settling time [5]. However, these measures all address the error between the setpoint and the actual system's output with different emphases. We therefore used the root-mean-squared-error (RMSE) between the setpoint and the system output to summarize these error measures in a single number without losing information.

### Two-tank system

For the two-tank system, a backstepping controller is chosen for comparison, as this approach takes the nonlinear behavior of the system into account and has been demonstrated to be well suited for this system [26]. The PID controller is parameterized with the constants *K*_{P} = 3.65, *K*_{D} = −2 and *K*_{I} = 0.4. All controllers are able to control the system, while the neural PID controller exhibited the smallest error for all scenarios. However, the advantage the neural PID controller yields is relatively small, as can be seen in Table 1. As this system is the easiest to control, it can be expected that the standard PID controller and backstepping perform at a similar level.

### Inverted pendulum

The inverted pendulum on a cart is controlled by a standard PID controller stack [29] and an LQR [32] for benchmarking. The PID controller responsible for the position uses the values *K*_{P} = −2.4, *K*_{D} = −0.75 and *K*_{I} = −1, and the controller for the angle is set to *K*_{P} = 25, *K*_{D} = 3 and *K*_{I} = 15 [29]. All three approaches are able to initially stabilize the system. For the scenario without disturbance, the neural PID controller performs equally to the standard PID controller when there is no noise present. For the scenario with only noise, the neural PID controller is the best control approach. Furthermore, only the neural PID controller is able to stabilize the disturbed system, resulting in a substantially lower error as can be seen in Table 1.

### System with non-negligible time delay

For the system with non-negligible time delay, the standard PID controller is set to *K*_{P} = 1.5, *K*_{D} = −0.1 and *K*_{I} = 0.7. The neural PID controller is superior in all scenarios when compared to the standard PID controller. When compared to the Smith predictor [41], the neural PID controller performs better only for the scenarios without noise. However, as the Smith predictor has knowledge about the exact time delay, it has a significant advantage over the neural PID controller.

### Chaotic thermal convection loop

The PID parameters yielding the lowest error for the chaotic thermal convection loop are *K*_{P} = 25.3, *K*_{D} = 8.9 and *K*_{I} = 0—the controller is therefore a PD controller. The system is initialized outside of its inherently stable region (region of attraction) with the initial conditions *x*_{1} = *x*_{2} = *x*_{3} = 5. Without control, the system will therefore not converge to the desired steady state *x*_{1} = *x*_{2} = *x*_{3} = 0.

All controllers are capable of stabilizing the system, as can be seen in Fig 6(a). The backstepping approach has the least overshoot but takes a long time to reach the steady state. The standard PID controller is more aggressive, resulting in a higher overshoot but still a smaller error. The neural PID controller performs best as it finds a good balance between settling time and overshoot.

Subfigure (a) shows the setpoint (*x*_{1} = 0, which corresponds to a steady flow) and the system output for all controllers. Subfigure (b) shows the controller outputs for all three controllers. Subfigure (c) shows the PID parameters applied by the neural network.

The neural PID controller demonstrates superior control performance in three of the four scenarios. Only in the scenario with noise but without disturbance does backstepping perform slightly better (0.89 vs 0.9). This can be explained by backstepping being designed using the differential equations. It therefore knows the underlying system dynamics and is less influenced by the sensor noise.

Between the time *t* = 5s and *t* = 5.5s, the control output is set to *u* = −100W to simulate the disturbance described earlier. The standard PD controller becomes metastable and its controller output alternates between the maximum value of 100W and the minimum value of −100W. Although backstepping is proven to be globally asymptotically stable in the Lyapunov sense [42], it also becomes metastable. This can be explained by the real-world conditions. As the controller can change its control output only every 0.01s, the backstepping approach fails, resulting in inputs switching between the maximum value and the minimum value, as seen in Fig 6(b). Both controllers (PID and backstepping) use excessive amounts of energy without being able to stabilize the system.

The neural PID controller is able to stabilize the system after the disturbance. Fig 6(c) shows how the neural network changes the PID parameters in response to the system output. When *x*_{1} is far from the setpoint, the *K*_{P} parameter has a high absolute value to force the system towards its steady state. To further increase the controller output at *t* = 7.9s, where the system reaches its furthest distance from the setpoint, *K*_{I} is increased. After the system reaches its steady state again, all PID parameters are adjusted back to their stationary values to ensure asymptotic performance. The neural PID controller furthermore uses significantly less energy to control the system.

### Investigating the solution's stability

Despite the experimental evidence that suggests the enormous benefit of using neural networks to adapt PID parameters online, this approach is not yet used on real-world systems. This is due to the black box character of neural networks and the stringent safety requirements for control processes. One of the most important safety requirements of a closed loop control approach is input-output stability. It describes whether the system output is bounded for all bounded inputs. A system can be evaluated for stability by analysing the closed loop transfer function, i.e. the relation between the system output and its input.

For a bounded-input bounded-output stability analysis, the closed loop transfer function is computed as

$$\frac{V(s)}{V^*(s)} = \frac{G(s)\,H(s)}{1 + G(s)\,H(s)} \qquad (5)$$

where *V*(*s*) is the system output, *V**(*s*) is the setpoint, *G*(*s*) is the system transfer function and *H*(*s*) is the controller transfer function in the Laplace domain. The system is stable if all poles of the closed loop transfer function lie in the left half plane, i.e. their real parts are smaller than zero. In the example of the chaotic thermal convection loop, the system's transfer function *G*(*s*), linearized around the steady state *x*_{1} = *x*_{2} = *x*_{3} = 0, does not change over time and only has to be computed once. The controller transfer function *H*(*s*) depends on the PID parameters and therefore changes every time the neural network adjusts these parameters. To make sense of these changes and interpret them from a stability perspective, the controller transfer function needs to be recomputed at every time step. This differs from traditional stability analysis, which is performed once under the assumption of non-changing PID parameters. Together, these transfer functions express whether the closed-loop solution is stable or unstable.
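Checking the pole locations online reduces to finding the roots of the closed-loop characteristic polynomial; the sketch below does this for a PID controller and a hypothetical second-order plant (the plant and gains are illustrative):

```python
import numpy as np

def closed_loop_poles(num_g, den_g, kp, ki, kd):
    """Poles of the closed loop G(s)H(s)/(1 + G(s)H(s)) for a PID controller
    H(s) = (Kd*s^2 + Kp*s + Ki)/s and a plant G(s) = num_g/den_g."""
    num_h = [kd, kp, ki]
    den_h = [1, 0]                                  # the 1/s of the integrator
    # Poles solve den_g(s)*den_h(s) + num_g(s)*num_h(s) = 0
    char_poly = np.polyadd(np.polymul(den_g, den_h),
                           np.polymul(num_g, num_h))
    return np.roots(char_poly)

# Hypothetical second-order plant G(s) = 1/(s^2 + 2s + 1)
poles = closed_loop_poles([1.0], [1.0, 2.0, 1.0], kp=2.0, ki=1.0, kd=0.5)
stable = np.all(poles.real < 0)   # BIBO-stable iff all poles lie in the left half plane
```

Because the polynomial coefficients are linear in *K*_{P}, *K*_{I}, and *K*_{D}, this check is cheap enough to rerun every time the network updates the parameters, which is exactly the online analysis shown in Fig 7.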

As an important contribution, we therefore perform an online analysis of the input-output stability of the controller. This analysis can be seen in Fig 7. The figure shows the system's output, its stability with respect to the linearized steady state over the course of the experiment, and the real parts of all four poles of the closed loop transfer function. The closed loop transfer function is not stable in the beginning, during settling, and after the disturbance. This is to be expected, as the system is far away from the steady state for which stability is evaluated. However, as the system's output gets close to the setpoint, i.e. the steady state, the closed loop transfer function becomes stable. Knowing the relationship between the chosen PID parameters and stability makes it possible to include this knowledge in the training. One way to do so would be to include the poles as a regularization term during training in order to force the system towards input-output stable behaviour. Furthermore, the input-output stability evaluation is an important insight for control engineers and makes neural PID controllers understandable for humans, thus emphasising their applicability for safety-critical systems.
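The per-time-step nature of this analysis can be sketched as follows (the plant and the gain schedule are illustrative stand-ins for the linearized convection loop and the network outputs, not values from the paper):

```python
import numpy as np

def closed_loop_poles(num_g, den_g, kp, ki, kd):
    # Roots of den_g*s + num_g*(kd*s^2 + kp*s + ki), the characteristic
    # polynomial of G(s)H(s)/(1 + G(s)H(s)) with an ideal PID controller
    num_h = np.array([kd, kp, ki])
    den_h = np.array([1.0, 0.0])
    return np.roots(np.polyadd(np.polymul(den_g, den_h),
                               np.polymul(num_g, num_h)))

# Illustrative linearized plant: G(s) = 1 / (s^2 + 2s + 1)
num_g, den_g = [1.0], [1.0, 2.0, 1.0]

# Hypothetical gains as the network might emit them at successive time steps
gain_schedule = [(0.5, 10.0, 0.0),   # overly aggressive integral action
                 (5.0, 2.0, 0.5)]    # settled gains

# One stability verdict per set of gains, as in the grey/white regions of Fig 7
stability_trace = [bool(np.all(
    closed_loop_poles(num_g, den_g, kp, ki, kd).real < 0))
    for kp, ki, kd in gain_schedule]
```

Here the first gain set destabilizes the loop (the integral gain violates the Routh-Hurwitz condition for this plant) while the second does not, so `stability_trace` reads `[False, True]`.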

The dashed line shows the system's output when controlled by the neural PID controller. The closed loop transfer function is not guaranteed to be stable within the grey areas, despite the algorithm stabilizing the system. As the system approaches its steady state, it becomes input-output stable with the chosen PID parameters. The second subplot shows the real parts of all four poles. The system is only stable (white background colour) if all poles have real parts smaller than zero.

## 5 Future work

This paper presents a first step towards accommodating the needs of control engineers when integrating machine learning algorithms into existing control architectures. While we identified a way to relate the parameters applied by the neural network back to input-output stability, this new information was not yet leveraged during the training procedure. It is a natural extension of this paper to use the newly found information about stability as a regularizer when training the neural network to ensure only input-output stable PID parameters. Beyond extensions of the exact framework used in this paper, we have provided an effective demonstration of the more general idea of leveraging machine learning algorithms to enhance existing control methods.
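Such a regularizer might, for instance, penalize closed-loop poles that cross into the right half plane. This is a conceptual sketch only, assuming the same rational-plant and ideal-PID setup as the stability analysis; `np.roots` is not differentiable, so a practical training loss would need a smooth surrogate:

```python
import numpy as np

def pole_penalty(num_g, den_g, kp, ki, kd, margin=0.0):
    """Hinge-style penalty on the closed-loop poles: zero when every pole
    satisfies Re(p) < -margin, growing linearly as poles drift rightwards."""
    num_h = np.array([kd, kp, ki])          # ideal PID numerator
    den_h = np.array([1.0, 0.0])            # PID denominator: s
    poles = np.roots(np.polyadd(np.polymul(den_g, den_h),
                                np.polymul(num_g, num_h)))
    return float(np.sum(np.maximum(poles.real + margin, 0.0)))
```

Adding such a term to the training objective would push the network towards gains for which the penalty vanishes, i.e. input-output stable parameter choices.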

While other examples exist in the literature (e.g., the combination of neural networks [43] or fuzzy logic systems [44] with backstepping), many other combinations of machine learning approaches with control algorithms have the potential to provide good results. For example, methods from reinforcement learning might be applied to linear quadratic regulators while maintaining interpretability.

## 6 Conclusion

In this paper, we conduct an extensive and rigorous investigation into the use of general dynamic neural networks for online PID parameter adaptation. We perform experiments on four different systems, with and without sensor noise as well as with and without disturbance, resulting in 16 experiments in total. These scenarios cover the most important challenges in control engineering. This study is therefore unique in its extensiveness, as previous papers only used one type of benchmark system. The neural-network-based approach outperforms a standard PID controller in 15 of 16 scenarios and outperforms a model-based controller in 13 of 16 scenarios.

These results showcase the potential of extending existing systems with machine learning in general and neural networks in particular. Furthermore, we keep the neural network design and integration simple to allow for easy adoption of our technique. With an appropriate implementation as a library, our technique could be used without extensive knowledge of either control engineering or neural networks. To the best of our knowledge, this is the first investigation that uses general dynamic neural networks for this purpose, extending the state of the art in using neural networks to tune PID parameters. We perform a detailed analysis of one representative scenario, highlighting the superior control performance of our approach over both the traditional PID controller and model-based backstepping. It is worth noting that while the training data was gathered from simple differential equations, the results indicate significantly increased resilience towards noise and unforeseen disturbances.

Although the significant potential of neural networks for PID parameter tuning is known [7, 8], this technique has not been used in real-world applications to date. Because the functioning of a neural network in this setting is not well understood, control engineers refrain from using such controllers. As a first step towards solving this problem, we perform an input-output stability analysis to interpret how neural networks function within the suggested framework. Tying the neural network outputs back to stability makes this neural-network-based approach understandable to humans. We thereby address a key issue in applying machine learning algorithms to control problems: the interpretability of, and subsequently the trust in, the machine learning solution. This work is thus an important step towards increasing the acceptance of machine-learning-based approaches for real-world systems and towards safe and interpretable applied artificial intelligence.

## Acknowledgments

The authors also thank Omid Namaki, Matthew Schlegel, Ahmed Shehata, Brian Tanner, and Nadia M. Ady for suggestions and helpful discussions.

## References

- 1. ElMaraghy W, ElMaraghy H, Tomiyama T, Monostori L. Complexity in engineering design and manufacturing. CIRP annals. 2012;61(2):793–814.
- 2. Åström KJ. Limitations on control system performance. European Journal of Control. 2000;6(1):2–20.
- 3. Klatt KU, Marquardt W. Perspectives for process systems engineering—Personal views from academia and industry. Computers & Chemical Engineering. 2009;33(3):536–550.
- 4. Wuest T, Weimer D, Irgens C, Thoben KD. Machine learning in manufacturing: advantages, challenges, and applications. Production & Manufacturing Research. 2016;4(1):23–45.
- 5. Ang KH, Chong G, Li Y. PID control system analysis, design, and technology. IEEE Transactions on Control Systems Technology. 2005;13(4):559–576.
- 6. de Jesús Rubio J, Zhang L, Lughofer E, Cruz P, Alsaedi A, Hayat T. Modeling and control with neural networks for a magnetic levitation system. Neurocomputing. 2017;227:113–121.
- 7. de Jesús Rubio J. Discrete time control based in neural networks for pendulums. Applied Soft Computing. 2018;68:821–832.
- 8. Zhang MG, Wang XG, Li WH. The Self-Tuning PID Decoupling Control Based on the Diagonal Recurrent Neural Network. In: Proceedings of the International Conference on Machine Learning and Cybernetics; 2006. p. 3016–3020.
- 9. Zhang K, An X. Design of Multivariable Self-Tuning PID Controllers via Quasi-diagonal Recurrent Wavelet Neural Network. In: Proceedings of the International Conference on Intelligent Human-Machine Systems and Cybernetics. vol. 2; 2010. p. 95–99.
- 10. Bergstra J, Yamins D, Cox DD. Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. JMLR. 2013.
- 11. Pirabakaran K, Becerra VM. PID autotuning using neural networks and model reference adaptive control. IFAC Proceedings Volumes. 2002;35(1):451–456.
- 12. Leveson N, Daouk M, Dulac N, Marais K. A systems theoretic approach to safety engineering. Dept of Aeronautics and Astronautics, Massachusetts Inst of Technology, Cambridge. 2003; p. 16–17.
- 13. Yang Z, Zhang A, Sudjianto A. Enhancing explainability of neural networks through architecture constraints. arXiv preprint arXiv:190103838. 2019.
- 14. Yu AW, Ma W, Yu Y, Carbonell J, Sra S. Efficient structured matrix rank minimization. In: Advances in Neural Information Processing Systems; 2014. p. 1350–1358.
- 15. Berkenkamp F, Turchetta M, Schoellig A, Krause A. Safe model-based reinforcement learning with stability guarantees. In: Advances in Neural Information Processing Systems; 2017. p. 908–918.
- 16. Parmas P, Rasmussen CE, Peters J, Doya K. PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos. In: International Conference on Machine Learning; 2018. p. 4062–4071.
- 17. Kalman BL, Kwasny SC. Why tanh: choosing a sigmoidal function. In: Proceedings of the International Joint Conference on Neural Networks. vol. 4. IEEE; 1992. p. 578–581.
- 18. Rumelhart DE, Durbin R, Golden R, Chauvin Y. Backpropagation: The basic theory. In: Backpropagation: Theory, architectures and applications; 1995. p. 1–34.
- 19. Glasmachers T. Limits of End-to-End Learning. In: Proceedings of the Asian Conference on Machine Learning; 2017. p. 17–32.
- 20. Marquardt DW. An algorithm for least-squares estimation of nonlinear parameters. Journal of the society for Industrial and Applied Mathematics. 1963;11(2):431–441.
- 21. De Jesus O, Hagan MT. Backpropagation algorithms for a broad class of dynamic networks. IEEE Transactions on Neural Networks. 2007;18(1):14–27. pmid:17278458
- 22. Sandner C, Günther J, Diepold K. Automated optimization of dynamic neural network structure using genetic algorithms. Technical Report, 2017.
- 23. Deflorian M, Zaglauer S. Design of experiments for nonlinear dynamic system identification. IFAC Proceedings Volumes. 2011;44(1):13179–13184.
- 24. Günther J. Machine intelligence for adaptable closed loop and open loop production engineering systems [Dissertation]. Technische Universität München, München; 2018.
- 25. Åström KJ, Eykhoff P. System identification—a survey. Automatica. 1971;7(2):123–162.
- 26. Pan H, Wong H, Kapila V, de Queiroz MS. Experimental validation of a nonlinear backstepping liquid level controller for a state coupled two tank system. Control Engineering Practice. 2005;13(1):27–40.
- 27. Majstorovic M, Nikolic I, Radovic J, Kvascev G. Neural network control approach for a two-tank system. In: Proceedings of the International Conference on Neural Network Applications in Electrical Engineering. IEEE; 2008. p. 215–218.
- 28. Takagi T, Mizumoto I, Tsunematsu J. Performance-driven adaptive output feedback control and its application to two-tank system. In: Proceedings of the International Conference on Advanced Mechatronic Systems. IEEE; 2014. p. 254–259.
- 29. Wang JJ. Simulation studies of inverted pendulum based on PID controllers. Simulation Modelling Practice and Theory. 2011;19(1):440–449.
- 30. Kim S, Kwon S. Nonlinear Optimal Control Design for Underactuated Two-Wheeled Inverted Pendulum Mobile Platform. IEEE/ASME Transactions on Mechatronics. 2017;22(6):2803–2808.
- 31. Jaleel JA, Francis RM. Simulated annealing based control of an Inverted Pendulum System. In: Proceedings of the International Conference on Control Communication and Computing. IEEE; 2013. p. 204–209.
- 32. Li W, Ding H, Cheng K. An investigation on the design and performance assessment of double-PID and LQR controllers for the inverted pendulum. In: Proceedings of the International Conference on Control; 2012. p. 190–196.
- 33. Wang C, Fu W, Shi Y. Fractional order proportional integral differential controller design for first order plus time delay system. In: Proceedings of the Control and Decision Conference. IEEE; 2013. p. 3259–3262.
- 34. Tavakoli S, Tavakoli M. Optimal Tuning of PID Controllers for First Order Plus Time Delay Models Using Dimensional Analysis. In: Proceedings of the International Conference on Control and Automation; 2003. p. 942–946.
- 35. Stojic MR, Matijevic MS, Draganovic LS. A robust Smith predictor modified by internal models for integrating process with dead time. IEEE Transactions on Automatic Control. 2001;46(8):1293–1298.
- 36. Creveling H, De Paz J, Baladi J, Schoenhals R. Stability characteristics of a single-phase free convection loop. Journal of Fluid Mechanics. 1975;67(1):65–84.
- 37. Wang Y, Singer J, Bau HH. Controlling chaos in a thermal convection loop. Journal of Fluid Mechanics. 1992;237:479–498.
- 38. Boskovic DM, Krstic M. Global stabilization of a thermal convection loop. In: Proceedings of the International Conference on Decision and Control. vol. 5. IEEE; 2000. p. 4777–4782.
- 39. Vazquez R, Krstic M. Explicit integral operator feedback for local stabilization of nonlinear thermal convection loop PDEs. Systems & Control Letters. 2006;55(8):624–632.
- 40. Vazquez R, Krstic M. Boundary Observer for Output-Feedback Stabilization of Thermal-Fluid Convection Loop. IEEE Transactions on Control Systems Technology. 2010;18(4):789–797.
- 41. Smith OJM. Closed control of loops with dead time. Chemical Engineering Progress. 1957;53(5):217–219.
- 42. Mascolo S, Grassi G. Controlling chaotic dynamics using backstepping design with application to the Lorenz system and Chua’s circuit. International Journal of Bifurcation and Chaos. 1999;9(07):1425–1434.
- 43. Liu L, Gao T, Liu YJ, Tong S. Time-varying asymmetrical BLFs based adaptive finite-time neural control of nonlinear systems with full state constraints. IEEE/CAA Journal of Automatica Sinica. 2020;7(5):1335–1343.
- 44. Liang H, Guo X, Pan Y, Huang T. Event-Triggered Fuzzy Bipartite Tracking Control for Network Systems Based on Distributed Reduced-Order Observers. IEEE Transactions on Fuzzy Systems. 2020.