Computational Effective Fault Detection by Means of Signature Functions

The paper presents a computationally effective method for fault detection. A system’s responses are measured under healthy and ill conditions. These signals are used to calculate so-called signature functions that create a signal space. The current system’s response is projected into this space. The signal location in this space easily allows to determine the fault. No classifier such as a neural network, hidden Markov models, etc. is required. The advantage of this proposed method is its efficiency, as computing projections amount to calculating dot products. Therefore, this method is suitable for real-time embedded systems due to its simplicity and undemanding processing capabilities which permit the use of low-cost hardware and allow rapid implementation. The approach performs well for systems that can be considered linear and stationary. The communication presents an application, whereby an industrial process of moulding is supervised. The machine is composed of forms (dies) whose alignment must be precisely set and maintained during the work. Typically, the process is stopped periodically to manually control the alignment. The applied algorithm allows on-line monitoring of the device by analysing the acceleration signal from a sensor mounted on a die. This enables to detect failures at an early stage thus prolonging the machine’s life.


Introduction
There are many applications that require on-line fault diagnosis, e.g. to monitor the quality of manufactured products or to improve the reliability and safety of a system.
The literature provides many sophisticated solutions that are, in most cases, computationally demanding, hence problematic to implement in embedded devices where processing power is restricted. Moreover, many scenarios require a strictly limited processing time.
Computationally effective solutions can be implemented using simple and low-cost hardware which also cuts development time and overall cost.
The article presents a method for detecting faults by analysing the time responses. The solution was applied in monitoring a moulding process which is commonly used in industry to manufacture objects from pliable material. A moulding device is typically composed of two forms whose inner shape determine the outline of a manufactured object. At the beginning of the process, the dies are shifted towards and pressed against each other with huge force to assure no inbetween gaps. The raw material is injected into the forms which are then detached to eject the solidified object. The proper alignment of the dies is crucial to the process. When the contact planes of the dies are skewed, the impact causes uneven wear-off of the forms and their untimely deterioration. Therefore, the production process must be stopped every now and again to measure the forms' alignment and recalibrate if need be. A reliable and computationally inexpensive method for automatic detection of the dies' adjustment is desirable in industry.
The suggested approach makes use of an acceleration signal from the sensor to detect a faulty moulding device. In the application, the response signal can be considered as stationary and can also be analysed solely in the time domain by using the dot (inner, scalar) product whose computational cost is proportional to the number of samples. There is no need to employ transforms such as wavelet or Fourier, as in other approaches. This greatly reduces computation complexity and allows to use simpler and lower-priced hardware. This method does not require classifiers such as a neural network, support vector machine (SVM), etc.
The subject of controlling the dies of a moulding device is not widely described in the literature. However, similar challenges arise in different applications. The excellent paper [1] shows an interesting application whereby a stamping process is monitored for a missing part in the production line. The tonnage signal is analysed with help of the recurring plot (RP) technique [2] which is used to capture a fine deviation of a non-stationary signal. The computation of the presented two dimensional RP matrix is fairly burdensome. The classification is determined on the base of the difference between the calculated RP matrix and a reference one. The overall computation cost is proportional to the squared number of analysed samples (being around several hundreds in the considered application).
The comprehensive work [3] describes an application of detecting several faults in a stamping process. The sampled strain signal is transformed using wavelets to acquire a vector of coefficients. The approach necessitates a classifier to make a decision about the fault. The authors compare the effectiveness of hidden Markov models (HMM), artificial neural network (ANN), support vector machine (SVM), support vector regression (SVR) and a proprietary classifier. The optimal training of a classifier is a problem of its own. The approach is suitable to analyse non-stationary signals with a poor signal-to-noise ratio (SNR), however it is too computationally demanding to be implemented in a microcontroller. A similar work is also reported in [4] where a number p of autoregressive coefficients need to be computed. Thus the computation cost is p (being at least 4) times the cost of calculating the dot product. Moreover, the classifier based on HMM needs to be trained.
The approach presented in [5] analyses the tonnage signal from a stamping process with the help of wavelet transform and a fairly complex technique called statistical process control (SPC).
Similar solutions are employed to prevent the damage of mechanical systems or warn of a fault, e.g. in engines [6], pumps [7], gear-boxes [8], wind turbines [9], etc.
The typical approach to fault diagnosis is composed of the following three stages: 1. signal transformation-the output response (or responses) of a device or system is often decomposed by Fourier, wavelet [10][11][12], (short time Fourier transform) STFT, Wiegner-Ville [13], EEMD (empirical mode decomposition) [7] transforms to acquire a vector of parameters; 2. reduction of the vector size-the vector contains many irrelevant features that can be neglected to simplify the classification problem. The following methods are commonly used in fault diagnosis: principal component analysis (PCA) [14], kernel entropy analysis (KECA) [11], kernel principal component analysis [11,15], uncorrelated multilinear PCA, uncorrelated multilinear PCA [16] and others; 3. classification-the reduced vector is an input to a neural network to perform fault classification. As a neural network can produce not deterministic results [17,18] due to overlearning and local extrema, often the support vector machine (SVM) is used, as in [10,19]. The work [7] uses Bayesian network and shows its superiority over neural network or SVM in an application of monitoring a gear pump. Some works also use hidden Markov models, as in [3,4].
The above-presented methodology is general, flexible and allows to analyse signals produced by non-stationary and non-linear systems. This comprehensiveness, however, incurs computational complexity that in most cases precludes implementation in embedded devices. At the same time, there are systems that can be considered stationary and linear and whose output responses for different faults are distinguishable in the time domain. Therefore, there is no need to apply signal transform which imposes a considerable computational burden. The approach suggested in this paper does not require a complex classifier, such as a neural network, HMM, SVM, etc. The system's response is projected into the space created by the socalled signature functions, which are computed from the training responses. The location of the signal in that space determines the fault type. The properties of the presented method make the approach suitable for embedded industrial solutions where hardware cost, reliability and simplicity are important factors.

Derivation
In this section, we will develop a method for determining the system's state basing on its time response. Thus, the responses pertaining to different conditions should be distinguishable in the time domain. This implies that the analysed system should behave as stationary and linear.
First, the responses corresponding to different conditions of the system need to be acquired. This constitutes training data for the method. Let a i (t), where i ranges from 1 to A, denote the system's responses measured in its in healthy condition. Let b i (t), where i ranges from 1 to B, represent the system's responses under the first type of malfunctioning. Let c i (t), where i ranges from 1 to C, represent the system's response under the second type of malfunctioning. By analogy, we can define other functions representing the system's responses under other types of malfunctioning. The total number of states (healthy and malfunctioning) is denoted by ξ. The total number of time responses is For the sake of compactness, we arrange the training signals in the following vector of functions On the base of the training signals, we want to calculate so-called signature functions. Let aðtÞ denote the signature of the healthy state. We demand, thatâðtÞ be similar to signals a i (t) (i = 1. . .A) and at the same time unrelated (orthogonal) to signals b i (t), c i (t), . . .. Consequently, signature functionbðtÞ should be similar to signals b i (t) and orthogonal to signals a i (t), c i (t), . . .This simplifies greatly the classification problem. The current response h(t) of the system is registered and compared to signature functionsâðtÞ,bðtÞ, . . .Then, for instance, in the healthy state, h(t) bears a resemblance toâðtÞ and is orthogonal to other signature functions.
We assume that the signatures can be expressed as linear combinations of the training signals, hence can be written asâ where unknown vectors The challenge is to compute signature functions, which is equivalent to calculating vectors x a , x b , . . .so that the signatures are robust against noise and provide minimum classification error for unseen data. We will come back to this problem later.
Another issue is to measure the resemblance between two signals. For that purpose, various techniques are employed, e.g. the sum of squared differences, the sum of absolute differences, dynamic time warping.
For our purpose, we employed the dot (inner, scalar) product, which is a linear operation. Let hg(t), h(t)i denote the dot product of two functions, defined here by where (T 0 , T 1 ) denotes the time range of the relevant system's response.
Let h(t) denote the system's response. If the system is healthy then c a ¼âðtÞ; hðtÞ h i% 1; ð7Þ In other words, a signal space is spanned on the signature functions. The location of the system's response h(t) determines its state.
Signal h(t) can also be considered as a point whose coordinates in the signal plane read Given the system's response h(t) and calculated coordinates P(h(t)) in the signal space, the problem of state classification is then easy. Checking the location of P(h(t)) with reference to the decision boundaries is straightforward. An example criterion can be based on the maximum value of c a , c b , c c , . . ..
As indicated earlier, the challenge is to determine the signature functionsâðtÞ,bðtÞ, . . .that is to calculate vectors x a , x b , . . .of unknown coefficients-see Eqs (3)(4)(5). The resultant signatures should be resilient against noise and provide a minimum misclassification rate for unseen signals.
The task is similar to regression analysis, whereby a curve equation is calculated on the base of the training data. When the underlying physical model is unknown, the problem is difficult. The curve should provide a minimum error for unseen data. Overfitting the curve to the training data results in a poor fit to the unseen data-this is illustrated in Fig 2. This problem is discussed in detail in [20] (especially Examples 4.3 and 4.4) and in [21]. In the case of neural networks, this phenomenon is described as overlearning. Calculating the signature functions in the way presented below leads to overfitting to the training data. However, this is an instructive step, as it suggests a solution. Demanding forâðtÞ and using the linearity of the hÁ, Ái operator, leads to the following matrix equation where y a ¼ ½1; . . . ; 1 |fflfflffl ffl{zfflfflffl ffl} Then andâ The signature functionâðtÞ calculated in this way is perfectly fitted to the training signals. To mitigate this problem, we need to constrain the signatures to take non-zero values, solely where the signals representing different states are different.
We demand the subsequent orthogonality conditions EâðtÞ; a i ðtÞ EâðtÞ; c i ðtÞ EbðtÞ; a i ðtÞ EbðtÞ; c i ðtÞ where E{ Á } denotes expectation (average value). This constraint prevents the signature functions fit the training data perfectly. Instead, the average residual error equals 0. This mitigates where v b = [0, 1, 0, . . .] T . Vectors x a , x b , etc. cannot simply be calculated from Eqs (33 and 34) which pose an under-determined set of equations (E is a non-square matrix). We also impose the following minimum energy constraintŝ aðtÞ;âðtÞ bðtÞ;bðtÞ which have the beneficial effect of smoothing the signature functions. This phenomenon is described as "the blessing of smoothness" and improves the prediction capabilities for unseen data, as explained in [21]. This constraint plays an important role-the signatures will assume non-zero values, only where the training data corresponding to different states bear differences.
@ @x bb ðtÞ;bðtÞ Hence @ @x a Lðx a ; l a Þ ¼ 2Dx a þ 2El a ¼ 0 ð44Þ Thus, to calculate x a , the set of two matrix Eqs (33 and 44) needs to be solved. For the sake of readability, we rewrite these equations where E T is a non-square matrix of size Θ × ξ; D is a square matrix of size Θ × Θ. x a cannot be determined directly from Eq (46) by inverting the non-square matrix E T . Instead, solving for x a in Eq (47) yields Substituting this to Eq (46) and calculating λ a leads to Plugging λ a back to Eq (47) finally results in The functionâðtÞ can now be calculated from Eq (3) or from By forming an analogous Lagrangian we can calculate x b and consequentlybðtÞ and so forth.

Variance
Estimating whether the system is healthy or malfunctioning on the base of its response h(t) by the following scalar products c a ¼âðtÞ; hðtÞ h i ð 52Þ is encumbered with an error. The sources of this error are measurement noise, stochasticity (non-repeatability) of the monitored process and differences in the training responses a i (t), b i (t), etc. Thus c a , c b and so on can be treated as stochastic variables, for which we will calculate the variance. On the base of conditions (23-25), (26-28) we can calculate the unbiased estimator of the said variance where, for the sake of simplicity, the result of hâðtÞ; f ðtÞi is understood as the following vector Parameters s 2 a , s 2 b , . . . will serve as measures for the goodness of fit to unseen data.

Application
The proposed method was applied to monitor a bottle moulding process. A simplified diagram of the device is presented in Fig 5. The system is composed of a static and moveable form (die) where the latter is shifted by an actuator, till both dies meet. The forms are pressed against each other with strong force. Then, a preformed plastic bottle is blown and expanded into the inner shape of the forms.
The fastening screws provide proper alignment of the forms. Over time and due to high pressure, the fastening can become loose which results in an early degradation of the formsan exaggerated situation is shown in Fig 6. The second type of fault that occurred, albeit less important, was the loose fastening of the actuator to the body which caused additional vibration due to resonance.

Measurements
An analog piezoelectric accelerometer was placed on the moveable form and the acceleration signal along the axis of movement (horizontally) was registered. The parameters of the accelerometer were as follows: • 3 dB bandwidth: 1 Hz to 10 kHz, • acceleration range: ±100 g, • sensitivity: 50 mV/g, where g denotes the gravity of Earth, i.e. g % 9.81 m/s 2 . The analog signal from the accelerometer was amplified, filtered and sampled at 65536 Hz by an acquisition card. Due to external noise, namely other machines in the factory producing vibrations, the accuracy of the measurements was degraded. The resultant noise was measured during the machine idle time and standard deviation equalled σ n = 1.04 m/s 2 . The noise histogram is presented in Fig 7.

Signature functions
We measured 60 system's responses-20 for each state (healthy, misaligned forms, loose fastening of the actuator). We divided this set into two parts: the training set  Fig 3a, 3b and 3c show training signals measured when the system was calibrated, the forms misaligned and the actuator's fastening screws became loose, respectively. The signature functionsâðtÞ,bðtÞ andĉðtÞ of the healthy system and under two types of malfunctioning respectively, were calculated on the base of the training set f(t) and are presented in Fig 8. The verification set f v (t) is used to assess the results. Table 2 shows the results of the standard deviations of c a , c b and c c (see Eqs (54 and 55)) calculated for the verification set of functions f v (t).
For each function from the verification set f v (t), the following projections (see Eq (17) Points P(f v, i (t)) for the functions from the verification set f v (t) are visualized in Fig 9. The device's state can simply be determined by analysing the location of the P(h(t)) point whereby h(t) is the current response. Perpendicular planes can be drawn to denote the boundaries for different conditions, which in the considered case, are easily distinguishable.
The algorithm using 32-bit fixed-point arithmetic was implemented on a simple ARM7 microcontroller (AT91SAM7x [23]) clocked at 50 MHz. The dot product of the signature func-tionsâðtÞ,bðtÞ andĉðtÞ with the device response h(t) required ca 0.83 ms of the microcontroller's time. There were only around 2750 fixed point multiplications and additions.
Overfitting. The problem of overfitting was mentioned earlier during the justification of the applied assumptions. As indicated before, the overfitted signature functions are calculated (on the base of the training set f(t)) from the following equations Solving for the unknown coefficients x oa , x ob , x oc we obtain x ob ¼ D À1 y b ð64Þ  Table 3 shows the results of the standard deviations of c a , c b and c c (see Eqs (54 and 55)) calculated for the verification set of functions. The results are worse than in the previous case (wherebyâðtÞ,bðtÞ,bðtÞ signatures were used)-cf. Table 2. Table 2. Standard deviations of c a , c b and c c (for the verification set f v (t)) are calculated as σ a , σ b and σ c -see Eqs (54 and 55). By analogy, we define the projections (see Eq (17) Points P o (f v, i (t)) for the functions from the verification set f v (t) are shown in Fig 10. The P o (a v, i (t)) points are well fitted for the healthy state, however in the malfunctioning states the fit is poor and characterized by large variances-compare Tables 2 and 3. In this case, the decision boundaries cannot be represented by simple perpendicular planes. The points are more dispersed. Overfitting resulted in poor prediction for the verification responses.

Discussion
A diagnosing method for an industrial moulding process of was presented, whereby the device was under three conditions: healthy, first and second type of fault. The solution proved to be effective in the considered application, where the device could be modelled as a stationary and linear system.
The solution can be adapted for different applications and expanded to diagnose multiple faults by creating corresponding signature functions, providing the stationary and linearity requirements are met.
The strong point of this approach is its simplicity for determining the system condition. The system response is projected into a signal space to determine its state. Thus, there is no need for a classifier such a neural network, hidden Markov models (HMM), support vector machine (SVM) for which the training phase can be a complex process. Table 3. Standard deviations of c a , c b and c c (for the functions from the verification set f v (t) and overfitted signatures) are calculated as σ oa , σ ob and σ oc -see Eqs (54 and 55). Moreover, the underlying dot product operation is comparatively undemanding, as opposed to wavelet or Fourier transforms employed by other approaches. Therefore, a simple and lowcost processing unit can be applied to monitor an industrial process.
The presented solution can perform poorly when the responses are periodic with a greatly varying period or phase and also when the analysed system cannot be treated as stationary. In this case, the presented method can use results from the short time Fourier (STFT), wavelet transform, etc. to create multidimensional signature functions and benefit from a simplified classification. For example, employing STFT, 3 dimensional signature functions (amplitude, frequency and time) can be calculated. Such signatures can be robust against the above mentioned obstacles.
Another limitation occurs when one set of training functions is a scaled version of another function set. If this is the case, then an inconsistent system of equations is obtained, e.g. the following equations cannot be simultaneously satisfied:â where k is some constant different from 0. To detect these situations, the linear dependence of the training functions needs to be examined. The dimension of the signal space (i.e. number of signature functions) must be less than the total number of the system's states. The problem needs to be slightly reformulated, however can be handled by the presented method. A non-linear relationship between the faults and the output responses can compromise the solution, as the coexistence of faults does not correspond to the superposition of the responses under pertaining faulty conditions. The challenge of multiple fault diagnosis is difficult in general [24], however arises considerably less often than detecting a single failure.
Supporting Information S1 Dataset. Measurement data. MATLAB files. Sampling rate: fs = 65536 Hz; a.mat-6 training functions for healthy state; b.mat-6 training functions for misaligned dies; c.mat-6 training functions for resonance; av.mat-14 training functions for healthy state; bv.mat-14 training functions for misaligned dies; cv.mat-14 training functions for resonance. (ZIP)