Extended-Kalman-filter-based dynamic mode decomposition for simultaneous system identification and denoising

A new dynamic mode decomposition (DMD) method is introduced for simultaneous system identification and denoising, based on an extended Kalman filter. The present paper explains the extended-Kalman-filter-based DMD (EKFDMD) algorithm, which is an online algorithm for datasets with a small number of degrees of freedom (DoFs). It also illustrates that EKFDMD requires significant computational resources for many-degree-of-freedom (many-DoF) problems and that combining it with truncated proper orthogonal decomposition (trPOD) makes the EKFDMD algorithm applicable to many-DoF problems, although this combination prevents the algorithm from being fully online. Numerical experiments on a noisy dataset with a small number of DoFs illustrate that EKFDMD estimates eigenvalues as well as or better than the existing algorithms, while also denoising the original dataset online. In particular, EKFDMD performs better than the existing algorithms when system noise is present. EKFDMD with trPOD, which is not fully online, can be successfully applied to many-DoF problems, including a fluid-problem example, and the results reveal its superior performance in system identification and denoising.


Introduction
Recently, modal decomposition [1] for fluid dynamics has attracted attention from the viewpoints of data reduction, data analysis, and reduced-order modeling of complex datasets. It is one of the methods of data-driven science in fluid dynamics. The most conventional method of modal decomposition is proper orthogonal decomposition (POD), [2,3] which is also called principal component analysis (PCA) or the Karhunen-Loève expansion. The standard POD can be computed by singular value decomposition (SVD), which explains why the obtained modes are mutually orthogonal. POD modes can be computed from snapshots of fluid data and can therefore be used in both numerical and experimental approaches. Based on POD modes, a reduced-order model can be constructed, for instance with the Galerkin projection method, although reduced-order modeling of this kind is available only for numerical approaches. Another conventional method is global linear stability analysis (GLSA), [4][5][6] which yields the eigenmodes of the governing equations (i.e., the Navier-Stokes equations for most fluid problems) linearized around the steady state of the nonlinear dynamics. GLSA identifies the most unstable eigenmodes and determines whether the steady-state solution is stable. The modes obtained by GLSA are solutions of the linearized equations, although this method always requires complex numerical procedures and cannot be applied to experimental data. Unlike POD modes, the modes obtained by GLSA are not orthogonal unless the system is described by a Hermitian operator.
In recent decades, a new method, dynamic mode decomposition (DMD), [7] has been proposed and developed as a data-driven method and has been applied to numerous fluid problems. [8][9][10][11] DMD has characteristics of both POD and GLSA, yet it can be computed from only a time series of snapshots of numerical or experimental data. This method processes snapshots of sequential unsteady nonlinear flow fields and yields eigenvalues and corresponding eigenmodes under the assumption that the dataset is explained by a linear system $x_{k+1} = A x_k$, where $x_k$ is the kth snapshot of sequential data and A is a system matrix. These dynamic modes are generally nonorthogonal, and each mode possesses a single-frequency response with amplification or damping as a natural characteristic of a linear system, which leads to a more intrinsic understanding of the role of each mode. Thus far, several methods have been proposed for computing the dynamic modes: standard DMD, [7] exact DMD, noise-cancelling DMD (ncDMD), [12] forward-backward DMD (fbDMD), [12] total least-squares DMD (tlsDMD), [12,13] online DMD, [14] and Kalman-filter-based DMD (KFDMD), [15] where ncDMD, fbDMD, tlsDMD, and KFDMD focus on noisy datasets. The standard DMD and the exact DMD adopt SVD and a Moore-Penrose pseudoinverse matrix, respectively, for low-rank approximation of the matrix A. This implies that these algorithms compute dynamic modes as a kind of least-squares problem. A robust method for noisy datasets, tlsDMD, adopts a truncated POD for pair data and successfully increases the accuracy of the obtained dynamic modes. The more recent KFDMD is written in the form of system identification using the Kalman filter algorithm [16] and can be optimized based on prior knowledge of the noise superimposed on the data. This differs from the usage of the Kalman filter in References [17,18], in which the Kalman filter is used for data reconstruction and prediction.
However, the application of DMD to noisy data and to denoising is still limited. For example, tlsDMD has been developed for accurately estimating the dynamic modes and corresponding eigenvalues, but a method for reconstructing the data has rarely been shown, except for the conventionally adopted reconstruction using the first snapshot. [19] If this conventional simple estimation of initial amplitudes is adopted to reconstruct the data, then the reconstructed data are greatly affected by the noise in the initial data, as expected. One of the few advanced data reconstruction methods is to apply a Kalman filter to the linear system corresponding to the Koopman operator after the linear system has been estimated. [17,18] Optimized DMD (optDMD) [20] and the combination of tlsDMD [12,21] and sparsity-promoting DMD (spDMD) [22] could also be used for denoising noisy data. Here, optDMD gives the dynamic modes, eigenvalues, and corresponding initial values that best fit the noisy time series data under the assumption of no system noise. On the other hand, spDMD [22] selects a finite number of modes for the reconstruction of flow fields by considering the L0 or L1 norm of regularization terms, as is often done in sparse modeling and compressed sensing. These methods are very useful for reconstructing flow fields, but the reconstructed data are governed by the initial amplitude of each mode and possibly cannot follow changes in the phase of dynamic modes in long-time data caused by system noise, which includes modeling error, nonlinear processes, and unexpected events in experiments. Furthermore, optDMD requires fitting all of the data at once, and the combination of tlsDMD and spDMD requires a two-step computation. At present, an online method for simultaneous system identification and denoising within the DMD framework has not yet been proposed.
In the present paper, a new method for simultaneous system identification and denoising within the DMD framework is proposed using the extended Kalman filter. In addition to the system identification of the previously proposed KFDMD, [15] the observed data are simultaneously filtered online for datasets with a small number of degrees of freedom (DoFs). The present paper first explains the algorithm of the proposed extended-Kalman-filter-based DMD (EKFDMD). The drawback of the computational cost of EKFDMD is then addressed, and a combination with a truncated POD (trPOD) is proposed to reduce the computational cost, although this combination prevents the algorithm from being fully online. Finally, the proposed method is applied to various problems, and its performance is illustrated.

Problem settings
Here, the previous algorithms compared in the present study are briefly explained. For the extension in the next subsection, the following linear system model is assumed for the time series dataset: $x_{k+1} = A x_k + v_k$ (1) and $y_k = x_k + w_k$ (2). Here, A, x, y, v, w, and the subscript k are the system matrix, the state variable vector, the observation vector, the system noise, the observation noise, and the time step, respectively. The dimension of both the state and observed variables is set to be n. Moreover, $x_k$ is assumed to be the true value. In the present paper, only y is accessible, whereas x has been used to denote the observation vector in previous DMD studies. Therefore, the reader should take care when considering the notation used herein. First, three methods, DMD, tlsDMD, and KFDMD, are briefly explained in Subsections 2.2, 2.3, and 2.4, respectively, and a conventional data reconstruction method for these algorithms is introduced in Subsection 2.5. Finally, optDMD, which is a state-of-the-art offline algorithm for both estimating the dynamic modes and reconstructing data, is explained in Subsection 2.6.

DMD
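The m-sample observation data matrix including observation noise is defined as $Y_{1:m} = [y_1\ y_2\ \cdots\ y_m]$, where $y_k = x_k$ if the observation noise is absent. The original DMD is performed with the SVD of $Y_{1:m-1}$, $Y_{1:m-1} = U_{1:m-1} S_{1:m-1} V_{1:m-1}^T$, and its r-rank truncation $\tilde{U}_{1:m-1} \tilde{S}_{1:m-1} \tilde{V}_{1:m-1}^T$ is then obtained. In this case, the r × r matrix $\tilde{A}$, which is the projection of the matrix A onto the low-dimensional space, can be obtained as follows: $\tilde{A} = \tilde{U}_{1:m-1}^T Y_{2:m} \tilde{V}_{1:m-1} \tilde{S}_{1:m-1}^{-1}$. Then, the eigendecomposition $\tilde{A} W_{DMD} = W_{DMD} \Lambda_{DMD}$ is carried out. Here, the columns of $W_{DMD}$ are the eigenvectors, and $\Lambda_{DMD}$ is the diagonal matrix of the eigenvalues. Using $W_{DMD}$, the dynamic mode matrix in the original space is recovered as $F = \tilde{U}_{1:m-1} W_{DMD}$, where the columns of F are the dynamic mode vectors.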
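As an illustration of the procedure above, the following is a minimal NumPy sketch of SVD-based DMD. The function name `dmd`, its arguments, and the rotation test data are assumptions introduced here for illustration and are not taken from the original code.

```python
import numpy as np

def dmd(Y, r):
    """Projected DMD of a snapshot matrix Y (n x m) with rank-r truncation."""
    Y1, Y2 = Y[:, :-1], Y[:, 1:]                  # Y_{1:m-1} and Y_{2:m}
    U, s, Vh = np.linalg.svd(Y1, full_matrices=False)
    Ur, sr, Vr = U[:, :r], s[:r], Vh[:r, :].conj().T
    # Projected r x r matrix: U^T Y2 V S^{-1} (division scales each column by 1/s_j)
    A_tilde = Ur.conj().T @ Y2 @ Vr / sr
    eigvals, W = np.linalg.eig(A_tilde)
    modes = Ur @ W                                # dynamic modes in the original space
    return eigvals, modes

# Hypothetical noise-free test data: a 2-DoF rotation embedded in 5 dimensions.
rng = np.random.default_rng(0)
angle = 0.1
A_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))  # orthonormal embedding
f = np.empty((2, 50))
f[:, 0] = [1.0, 0.0]
for k in range(49):
    f[:, k + 1] = A_true @ f[:, k]
Y = Q @ f
eigvals, modes = dmd(Y, r=2)
```

For these noise-free data, the two eigenvalues lie on the unit circle at exp(±0.1i) up to rounding error; with observation noise added to `Y`, the estimates are expected to be biased toward damping, which is the behavior the later sections discuss.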

tlsDMD
For total least-squares DMD, pairs of snapshots are considered. In this case, trPOD data or raw data can be used. [12,13] In the present study, raw data are used directly, as in the original code. [23] The procedure for the time series data is as follows. First, a pair data matrix is defined by stacking $Y_{1:m-1}$ on top of $Y_{2:m}$, and POD is applied to this pair data matrix. Then, an r-rank truncated pair POD is obtained, from which the POD projections $\hat{Y}_{1:m-1}$ and $\hat{Y}_{2:m}$ of $Y_{1:m-1}$ and $Y_{2:m}$ are computed. Using these matrices, $\tilde{A}$ is computed through the SVD of $\hat{Y}_{1:m-1}$. The dynamic modes and eigenvalues are then estimated in exactly the same way as for DMD in Eqs 7 to 9.
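The steps above can be sketched as follows. This is a minimal illustration that assumes the commonly used tlsDMD formulation, in which the snapshot pair is projected onto the leading r right singular vectors of the stacked pair matrix; the function name `tls_dmd` and the test data are hypothetical, and this is not the original code [23].

```python
import numpy as np

def tls_dmd(Y, r):
    """Total least-squares DMD sketch for a snapshot matrix Y (n x m)."""
    Y1, Y2 = Y[:, :-1], Y[:, 1:]
    Z = np.vstack([Y1, Y2])                       # pair data matrix (2n x (m-1))
    _, _, Vh = np.linalg.svd(Z, full_matrices=False)
    Vr = Vh[:r, :].conj().T                       # leading r right singular vectors
    P = Vr @ Vr.conj().T                          # projector onto their span
    Y1h, Y2h = Y1 @ P, Y2 @ P                     # projected snapshot pair
    # Standard DMD on the projected pair
    U, s, Wh = np.linalg.svd(Y1h, full_matrices=False)
    Ur, sr, Wr = U[:, :r], s[:r], Wh[:r, :].conj().T
    A_tilde = Ur.conj().T @ Y2h @ Wr / sr
    eigvals, W = np.linalg.eig(A_tilde)
    return eigvals, Ur @ W

# Hypothetical noise-free test data: a 2-DoF rotation embedded in 5 dimensions.
rng = np.random.default_rng(1)
angle = 0.1
A_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))
f = np.empty((2, 50))
f[:, 0] = [1.0, 0.0]
for k in range(49):
    f[:, k + 1] = A_true @ f[:, k]
eigvals, modes = tls_dmd(Q @ f, r=2)
```

The projection treats noise in both members of the snapshot pair symmetrically, which is the reason tlsDMD is less biased than standard DMD for noisy observations.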

KFDMD
The components of the matrix A are considered to be the state variables of the Kalman filter. The state variable vector $\theta^{KF}$ is written as follows: Using the state variable vector described above, the system and observation equations can be written as follows: where $H^{KF}_k$ is the observation matrix, which has n rows and $n^2$ columns and is defined as follows: (20) Note that we have the following relationship: where $A_k$ represents the estimate of A at the kth time step. Here, $v_k$ and $w_k$ are the system and observation noises, respectively. Using the state equation given above, the linear Kalman filter is constructed with the fast algorithm shown in Reference [15]. After the matrix A is obtained, the dynamic modes and corresponding eigenvalues are obtained through the eigendecomposition of the matrix A.
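To make the idea concrete, the sketch below runs a plain linear Kalman filter whose state is the vectorized matrix A. It is a naive dense implementation under stated assumptions, not the fast algorithm of Reference [15]: the elements of A are assumed to be stored row by row, the observation matrix is assumed to be the Kronecker product of the identity with the previous snapshot, and the name `kfdmd` and its default parameters are hypothetical.

```python
import numpy as np

def kfdmd(Y, q=0.0, r_var=1e-4, p0=1e3):
    """Estimate A in y_{k+1} = A y_k with a linear Kalman filter on vec(A)."""
    n, m = Y.shape
    theta = np.zeros(n * n)                  # row-major elements of A (assumed ordering)
    P = p0 * np.eye(n * n)                   # initial covariance
    Qm = q * np.eye(n * n)                   # system-noise covariance for the coefficients
    R = r_var * np.eye(n)                    # observation-noise covariance
    I = np.eye(n * n)
    for k in range(m - 1):
        # n x n^2 observation matrix, so that H @ theta equals A @ Y[:, k]
        H = np.kron(np.eye(n), Y[:, k][None, :])
        P = P + Qm                           # prediction: coefficients are unchanged
        S = H @ P @ H.T + R
        K = np.linalg.solve(S, H @ P).T      # Kalman gain P H^T S^{-1}
        theta = theta + K @ (Y[:, k + 1] - H @ theta)
        P = (I - K @ H) @ P
    return theta.reshape(n, n)

# Hypothetical noise-free test data: a planar rotation.
angle = 0.1
A_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
Y = np.empty((2, 50))
Y[:, 0] = [1.0, 0.0]
for k in range(49):
    Y[:, k + 1] = A_true @ Y[:, k]
A_est = kfdmd(Y)
eigvals = np.linalg.eigvals(A_est)
```

The dense covariance has $n^2 \times n^2$ entries, which illustrates why a fast algorithm is needed once n grows.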

Data reconstruction using DMD, tlsDMD, and KFDMD
The DMD, tlsDMD, and KFDMD methods only estimate the matrix A and do not themselves provide time series data reconstructed from the dynamic modes. In a conventional method [19] of reconstruction, we assume that the data can be reconstructed as $X_{reconst} = F B_0 V$. Here, $X_{reconst}$ is the reconstructed data matrix, $B_0 = \mathrm{diag}(b_0)$ is a diagonal matrix of the initial amplitudes (Eq 23), and V is a Vandermonde matrix whose ith row is $[1, \lambda_i, \lambda_i^2, \ldots, \lambda_i^{m-1}]$ and which represents the temporal behavior of the dynamic modes under the assumption that system noise is absent (Eq 24). The initial value vector $b_0$, which collects the initial amplitudes of the modes, can be obtained using the pseudoinverse of F as $b_0 = F^{+} y_0$, where the superscript plus symbol denotes the Moore-Penrose pseudoinverse matrix. As discussed later herein, $y_0$ includes the observation noise superimposed on the initial snapshot, and this reconstruction does not work well due to this noise, even if the eigenvalues are well estimated.
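A minimal sketch of this conventional reconstruction is given below. The function name `reconstruct` and the rotation example are assumptions for illustration; the mode matrix is taken directly from an eigendecomposition so that the example stays self-contained.

```python
import numpy as np

def reconstruct(modes, eigvals, y0, m):
    """Reconstruct m snapshots from modes, eigenvalues, and the initial snapshot y0."""
    b0 = np.linalg.pinv(modes) @ y0                  # initial amplitudes
    V = np.vander(eigvals, N=m, increasing=True)     # rows [1, lam, ..., lam^(m-1)]
    return modes @ np.diag(b0) @ V

# Hypothetical example: planar rotation with exactly known modes and eigenvalues.
angle = 0.1
A_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
lam, Phi = np.linalg.eig(A_true)
y0 = np.array([1.0, 0.0])
X = reconstruct(Phi, lam, y0, m=20)
```

Because every column of the result is fixed by `y0`, any noise in the initial snapshot propagates to the whole reconstructed series, which is the weakness noted above.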

optDMD
In the optimized DMD, [20] a nonlinear least-squares problem is solved for the dynamic modes, eigenvalues, and initial amplitudes that best fit the observed time series. Although there are several ways to solve this nonlinear problem, the variable projection method is adopted in the present study. In this case, the best-fit reconstructed data matrix is obtained under the assumption that system noise is absent. In the case of spDMD, F and Λ are fixed using another DMD method, and the optimum sparse $b_0$ is found by adding an L1 or L0 regularization term on $b_0$. The original code [24] is employed in the present study.

Algorithm
As introduced in the section above, we consider the system expressed by Eqs 1 and 2. For simplicity, we introduce the Einstein summation convention for Eqs 1 and 2, as follows: $x_{i,k+1} = a_{ij} x_{j,k} + v_{i,k}$ and $y_{i,k} = x_{i,k} + w_{i,k}$. Then, the Kalman filter algorithm is considered. In this problem, we would like to simultaneously conduct online system identification and denoising of the observed variables when the number of DoFs is small. Therefore, the observed variables and the elements of the matrix A are chosen as the state variables of the considered system. The state variable vector θ, of dimension $n + n^2$, consists of the n variables $x_i$ followed by the $n^2$ elements $a_{ij}$ of the matrix A.
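Using these state variables, the system transition can be written as $\theta_{k+1} = f(\theta_k) + v_k$ and $y_k = H \theta_k + w_k$, where $v_k$ and $w_k$ are the system and observation noise, respectively. The nonlinear function f maps the upper n components to $a_{ij} x_j$ and leaves the lower $n^2$ components $a_{ij}$ unchanged, and the observation matrix $H = [I_n\ 0]$ extracts the upper n components.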
The upper half of the system is written as the product of the state variables $x_j$ and $a_{ij}$, and, as such, the system is nonlinear. The lower half of the system corresponds to the constant or slowly varying system coefficients to be identified and does not change explicitly. Linearization is required to construct the extended Kalman filter. The Jacobian matrix F of the nonlinear function f with respect to the state variables θ is an $(n + n^2) \times (n + n^2)$ matrix (Eq 35): its upper-left n × n block is A, its upper-right $n \times n^2$ block contains the derivatives of $a_{ij} x_j$ with respect to the elements $a_{ij}$ (that is, the current values $x_j$), its lower-left block is zero, and its lower-right block is the identity matrix. Using the matrices $F_k$ and H, the extended Kalman filter can be constructed for the nonlinear system. Note that $F_k$ is a time-varying matrix.
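Written out as a block matrix, the Jacobian discussed above can take the following form. This is a sketch that assumes the elements $a_{ij}$ are stored row by row in θ (an ordering assumed here, not confirmed by the text), so that the derivative block can be expressed with a Kronecker product.

```latex
F_k = \left.\frac{\partial f}{\partial \theta}\right|_{\theta = \theta_{k-1|k-1}}
    = \begin{bmatrix}
        A_{k-1|k-1} & I_n \otimes x_{k-1|k-1}^{T} \\
        0           & I_{n^2}
      \end{bmatrix},
\qquad
\theta = \begin{bmatrix} x_1 & \cdots & x_n & a_{11} & a_{12} & \cdots & a_{nn} \end{bmatrix}^{T}.
```

The upper-right block has n rows and $n^2$ columns; row i contains $x^{T}$ in the columns that correspond to $a_{i1}, \ldots, a_{in}$ and zeros elsewhere.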
Following the theory of the Kalman filter, the a priori prediction of the state variable vector $\theta_{k|k-1}$ and the covariance matrix $P_{k|k-1}$ is obtained from the state variable vector $\theta_{k-1|k-1}$ and the covariance matrix $P_{k-1|k-1}$ of the previous time step as $\theta_{k|k-1} = f(\theta_{k-1|k-1})$ and $P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q$, where the system matrix $F_k$ is expressed by Eq 35, and Q is the covariance matrix of the system noise.
When a new observation is available, the state variables and the covariance matrix are updated using the Kalman gain, which is computed as $K_k = P_{k|k-1} H^T S_k^{-1}$, where $S_k$ is the covariance matrix of the innovation and is expressed as $S_k = H P_{k|k-1} H^T + R_k$. Here, $R_k$ is the covariance matrix of the observation noise $w_k$. A modification vector for the state variables θ is computed as $K_k (y_k - H \theta_{k|k-1})$. Finally, the state variable vector and the covariance matrix after the observation are updated as $\theta_{k|k} = \theta_{k|k-1} + K_k (y_k - H \theta_{k|k-1})$ and $P_{k|k} = (I - K_k H) P_{k|k-1}$. This extended Kalman filter requires multiplications of large matrices of dimension $(n^2 + n) \times (n^2 + n)$, as discussed in Section 5. This is a clear drawback of the formulation for many-degree-of-freedom (many-DoF) problems, and using this algorithm together with trPOD is recommended, as explained in Section 3.2. The same drawback applies to KFDMD, which is designed for system identification only, although it is somewhat relaxed there by the fast algorithm proposed in the previous study, [15] in which the large matrix is decomposed into several identical block matrices. Although we attempted to apply a similar concept, [15] we could not find a corresponding method for EKFDMD at present. Therefore, the computational cost of EKFDMD is more severe than that of KFDMD, and the use of the present algorithm together with trPOD is strongly recommended for many-DoF problems.
It should be noted that, in an early implementation of EKFDMD, we used the first several time steps only for the estimation of A without filtering x, but this was found to merely degrade the results. In the present implementation, the simultaneous estimation is started abruptly from the first step.
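The prediction and update steps of this section can be sketched as a naive dense extended Kalman filter. Several choices below are assumptions made for illustration and are not confirmed by the text: the elements of A are stored row by row, A is initialized to zero, the state x is initialized with the first observation, and the function name `ekfdmd` and its parameters are hypothetical.

```python
import numpy as np

def ekfdmd(Y, q_x=0.0, q_a=0.0, r_var=1e-4, p0=1e3):
    """Jointly filter x and estimate A from observations Y (n x m)."""
    n, m = Y.shape
    N = n + n * n
    theta = np.zeros(N)                       # [x; row-major elements of A]
    theta[:n] = Y[:, 0]                       # assumed initialization of x
    P = p0 * np.eye(N)
    Q = np.diag(np.r_[q_x * np.ones(n), q_a * np.ones(n * n)])
    R = r_var * np.eye(n)
    H = np.hstack([np.eye(n), np.zeros((n, n * n))])   # observe x only
    I = np.eye(N)
    X = np.empty((n, m))
    X[:, 0] = theta[:n]
    for k in range(1, m):
        x, A = theta[:n], theta[n:].reshape(n, n)
        # Jacobian of f at the current estimate: [[A, I (x) x^T], [0, I]]
        F = np.eye(N)
        F[:n, :n] = A
        F[:n, n:] = np.kron(np.eye(n), x[None, :])
        # Prediction step: x <- A x, coefficients unchanged
        theta = np.r_[A @ x, theta[n:]]
        P = F @ P @ F.T + Q
        # Update step with the new observation Y[:, k]
        S = H @ P @ H.T + R
        K = np.linalg.solve(S, H @ P).T       # Kalman gain P H^T S^{-1}
        theta = theta + K @ (Y[:, k] - H @ theta)
        P = (I - K @ H) @ P
        X[:, k] = theta[:n]
    return X, theta[n:].reshape(n, n)

# Hypothetical noise-free test data: a planar rotation.
angle = 0.1
A_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
X_true = np.empty((2, 50))
X_true[:, 0] = [1.0, 0.0]
for k in range(49):
    X_true[:, k + 1] = A_true @ X_true[:, k]
X_filt, A_est = ekfdmd(X_true)
```

Each step multiplies matrices of size $(n + n^2) \times (n + n^2)$, so the cost per step grows roughly as $n^6$ in this dense form, which is the drawback discussed above.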

Combination with a truncated POD
As discussed in the previous section, the computational cost of the present algorithm is high, and, therefore, a truncated POD (truncated SVD) should be used to reduce the number of DoFs of the observed dataset. Similar to the previous study on KFDMD for system identification only, the data are processed as follows: 1. the batch POD is applied, 2. the proposed Kalman filter is then applied to the amplitude of each POD mode, and 3. the mode shapes of the fluid system are finally recovered by multiplying by the spatial POD modes.
As the first step (step 1), POD is applied to the observed data matrix, which is expressed in SVD form as $Y = U S V^T$. Here, U and V are matrices consisting of the spatial and temporal POD modes, respectively. The r-rank approximation of the observed data matrix is calculated as $Y \approx \tilde{U} \tilde{S} \tilde{V}^T$, where quantities with tildes indicate r-rank approximations. Here, the r × r matrix $\tilde{S}$ consists of the r largest singular values in S. In addition, the column vectors of $\tilde{U}$ and $\tilde{V}$ are the same as the corresponding first r column vectors of U and V. Using these matrices, the reduced-order data matrix $\tilde{Y}$, which represents the mode strengths, is constructed as $\tilde{Y} = \tilde{U}^T Y = \tilde{S} \tilde{V}^T$. In the second step (step 2), $\tilde{Y}$ and $\tilde{y}_k$ are treated in the same manner as Y and $y_k$ in the proposed EKFDMD procedure, and $x_k$ and A are simultaneously estimated in a single pass. In addition, for an online implementation, $\tilde{y}_k = \tilde{U}^T y_k$ can be used, where the left singular vectors are assumed to be fixed using sample data. After this process, the eigenvalues and eigenmodes are computed by solving the eigenvalue problem of A. Finally, in the third step (step 3), the eigenmodes in the original dimension are obtained by multiplying the right eigenvectors of the reduced system obtained by EKFDMD by the matrix $\tilde{U}$.
Again, note that the same formulation in Eqs 46 through 48 can be used in an online situation in which the left singular vectors (spatial modes) $\tilde{U}$ are known in advance. This is similar to the KFDMD [15] proposed previously. In this case, a fully online algorithm is obtained. However, if the POD modes are not known in advance and must be estimated, then an online POD method or another method is required. If the spatial POD modes change with time, as in the case of online POD, then the projected coefficients are not consistent in time. Furthermore, POD modes are sometimes activated or deactivated in the online POD algorithm. Thus, it appears to be difficult to straightforwardly extend EKFDMD to a method combined with online POD, and this is left for a future study.
In the present paper, Eq 46 is adopted for the truncated POD. This procedure is used for many-DoF problems (n > 30) and is not used unless otherwise mentioned. In the case of a noisy dataset, it should be noted that an accurate estimate of the mode coefficients does not necessarily mean an accurate representation of the full state, because the spatial POD modes contain noise, as shown later. However, despite the imperfect estimation of the POD modes, the eigenvalues and reconstructed data obtained by EKFDMD are sufficiently accurate, which is also shown later.
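The three steps of the trPOD combination can be sketched as follows. The function names `truncated_pod` and `lift_modes` and the test data are hypothetical; the reduced matrix would be passed to whichever filter is used in step 2, which is omitted here.

```python
import numpy as np

def truncated_pod(Y, r):
    """Step 1: batch POD of Y (n x m); returns spatial modes and mode amplitudes."""
    U, s, Vh = np.linalg.svd(Y, full_matrices=False)
    Ur = U[:, :r]                       # first r spatial POD modes (columns of U)
    Y_red = Ur.T @ Y                    # r x m amplitudes, equal to diag(s[:r]) @ Vh[:r]
    return Ur, Y_red

def lift_modes(Ur, W):
    """Step 3: map eigenvectors W of the reduced system back to the original space."""
    return Ur @ W

# Hypothetical data: rank-2 dynamics embedded in 5 dimensions.
rng = np.random.default_rng(2)
angle = 0.1
A_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))
f = np.empty((2, 50))
f[:, 0] = [1.0, 0.0]
for k in range(49):
    f[:, k + 1] = A_true @ f[:, k]
Y = Q @ f
Ur, Y_red = truncated_pod(Y, r=2)
# For the online variant, a new snapshot y would be reduced as Ur.T @ y with Ur fixed.
y_new_red = Ur.T @ Y[:, -1]
```

Fixing `Ur` in advance is what allows new snapshots to be reduced one at a time; estimating `Ur` from the full batch is the part that is not online.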

Implementation of the EKFDMD algorithm
Here, the EKFDMD algorithm is briefly summarized. After initialization, the prediction (a priori estimation) and update steps are alternately performed.

Initialization
1. If the number of DoFs is large, trPOD is applied to the data.

Prediction step
1. $x_{k|k-1}$ are predicted by Eqs 36 and 34, while $a_{ij,k|k-1}$ are predicted to be the same as $a_{ij,k-1|k-1}$.

Update step
1. The Kalman gain K is computed by Eqs 39 and 38.
2. θ k|k is updated by Eqs 40 and 42, and matrix A is obtained using θ k|k−1 .
3. P k|k is updated by Eq 43.

Numerical experiments and discussion
The EKFDMD algorithm described in Section 3.1 is adopted in the numerical experiments below.

Problem with a small number of DoFs without system noise
First, the performance of EKFDMD is investigated for a standard problem in comparison with the standard DMD, KFDMD, tlsDMD, and optDMD. The problem is approximately the same as that considered in the previous study. [13] The problem is modified slightly so that system noise can be included in discretized form in the next subsection, although only observation noise is considered in this subsection. The discretized eigenvalues are assumed to be positioned at $\lambda_1 = \exp(\pm 2\pi i \Delta t)$, $\lambda_2 = \exp(\pm 5\pi i \Delta t)$, and $\lambda_3 = \exp[(-0.3 \pm 11\pi i)\Delta t]$, where Δt = 0.01. The corresponding continuous eigenvalues are $\omega_1 = \pm 2\pi i$, $\omega_2 = \pm 5\pi i$, and $\omega_3 = -0.3 \pm 11\pi i$. The number of DoFs of this system is d = 6. The original data f were computed in the previous study as follows: However, the above formulation cannot treat system noise. Therefore, the system is integrated over each time step, and discretized system noise is added as follows: where $v'$ is the system noise for the original system. Eq 51 exactly corresponds to the solution of Eq 49 for the condition in which $v_k$ is absent. In this subsection, no system noise is considered, that is, $v_k = 0$. This system of d = 6 DoFs is expanded to snapshot data of n = 16 DoFs by applying the matrix $Q_{QR}$ of the QR decomposition of a random matrix. Note that this problem was originally extended to n = 400 DoFs, but the number of DoFs is limited in the present study because of the computational cost of EKFDMD, as mentioned above. In this process, a random matrix T of dimension n × d, in which each component is a random number drawn from $\mathcal{N}(0, 1)$, is decomposed as $T = Q_{QR} R_{QR}$ by QR decomposition, and the original data $f_k$ of dimension d are extended to $x_k$ of dimension n by multiplication by the matrix $Q_{QR}$ as $x_k = Q_{QR} f_k$. Then, the y data matrices are created by adding white observation noise to the original x data matrix, where the noise $w_k$ follows $\mathcal{N}(0, \sigma_w^2)$.
Here, the variance $\sigma_w^2$ is varied as 0.0001, 0.001, 0.01, and 0.1. A total of 500 snapshots are given, and the eigenvalues of the matrix A at the final step are analyzed.
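A dataset of this kind can be generated with the sketch below. Because the exact form of the original system is not reproduced here, the discrete system is assumed to be block diagonal with one scaled rotation block per conjugate eigenvalue pair, and the initial condition (all ones) is an arbitrary assumption; the name `make_dataset` and its arguments are hypothetical.

```python
import numpy as np

def make_dataset(m=500, n=16, dt=0.01, sigma_w2=0.01, sigma_v2=0.0, seed=0):
    """Return true data X, noisy observations Y (both n x m), and the d x d system."""
    rng = np.random.default_rng(seed)
    omegas = [2j * np.pi, 5j * np.pi, -0.3 + 11j * np.pi]   # continuous eigenvalues
    d = 2 * len(omegas)
    A_d = np.zeros((d, d))
    for i, om in enumerate(omegas):
        g, w = np.exp(om.real * dt), om.imag * dt
        # 2 x 2 block with eigenvalues exp((sigma +/- i*omega) * dt)
        A_d[2 * i:2 * i + 2, 2 * i:2 * i + 2] = g * np.array(
            [[np.cos(w), -np.sin(w)], [np.sin(w), np.cos(w)]])
    # n x d matrix with orthonormal columns from the QR decomposition of a random matrix
    Q_qr, _ = np.linalg.qr(rng.standard_normal((n, d)))
    f = np.empty((d, m))
    f[:, 0] = 1.0                                            # assumed initial condition
    for k in range(m - 1):
        v = np.sqrt(n * sigma_v2 / d) * rng.standard_normal(d)   # system noise
        f[:, k + 1] = A_d @ f[:, k] + v
    X = Q_qr @ f                                             # expand d DoFs to n DoFs
    Y = X + np.sqrt(sigma_w2) * rng.standard_normal((n, m))  # add observation noise
    return X, Y, A_d

X, Y, A_d = make_dataset()
lam = np.linalg.eigvals(A_d)
```

Setting `sigma_v2` to a positive value adds system noise in the d-dimensional space with variance $n\sigma_v^2/d$, matching the scaling used in the subsection on system noise.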
For the initial adjustable parameters of the Kalman filter, the diagonal elements of the covariance matrix are set to be $10^3$. The diagonal elements of Q and R are set to be 0 and $\sigma_w^2$, respectively, and the off-diagonal elements of Q and R are set to be 0 in this subsection. The assumption of Q = 0 corresponds to providing the information that system noise is absent and the system is temporally constant.
The results for the noisy data at different noise levels are discussed. Figs 1 and 2 show the eigenvalues estimated in a representative case and in all of the 100 cases examined by changing the random number seed, respectively. The estimated eigenvalues in Figs 1 and 2 show that DMD and KFDMD do not accurately estimate the eigenvalues of the system when the noise level is high. On the other hand, tlsDMD works better than DMD and KFDMD. Furthermore, optDMD and EKFDMD appear to work the best for estimation of the eigenvalues. This might be because optDMD and EKFDMD denoise the data, and more accurate eigenvalues of the system can be obtained from the denoised data. The system identification performance of EKFDMD appears to be better than that of tlsDMD.
The above characteristics are now discussed with quantitative data. Fig 3 shows the error of the eigenvalues. The error for each true eigenvalue is defined as the norm of the difference between that true eigenvalue and the closest computed eigenvalue. Outliers were not removed in this process. The error in the eigenvalues decreases with decreasing noise strength for all methods. This plot quantitatively shows that the error basically decreases in the order of DMD and KFDMD, tlsDMD, EKFDMD, and optDMD. System noise is not considered in the present problem setting, and therefore optDMD can give the best-fit curve for all of the data points, owing to its offline procedure. On the other hand, EKFDMD incrementally updates the information and cannot use all of the data at once. Therefore, it is reasonable that optDMD works slightly better than EKFDMD.
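The eigenvalue error used in this comparison can be computed with a short helper such as the one below. The function name `eigenvalue_errors` and the sample values are hypothetical, and the error is interpreted here as the distance in the complex plane from each true eigenvalue to its nearest computed eigenvalue.

```python
import numpy as np

def eigenvalue_errors(computed, true):
    """Distance from each true eigenvalue to the closest computed eigenvalue."""
    computed = np.asarray(computed, dtype=complex)
    true = np.asarray(true, dtype=complex)
    # Pairwise distances: rows index true eigenvalues, columns index computed ones
    dist = np.abs(true[:, None] - computed[None, :])
    return dist.min(axis=1)

# Hypothetical values for illustration only.
errors = eigenvalue_errors(computed=[1.0 + 0.1j, 0.5], true=[1.0, 0.4])
```

Because no outliers are removed, a single badly placed computed eigenvalue only matters if no other computed eigenvalue is closer to the true one.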
Data reconstruction is then considered. As noted previously, EKFDMD is expected to be able to denoise the data. Fig 4 illustrates the time series of the true data, the observed (noisy) data, and the data reconstructed by DMD, tlsDMD, KFDMD, optDMD, and EKFDMD. This plot reveals that DMD and KFDMD cannot predict the oscillation because they estimate a damped oscillation due to the noise included in the observation data. Moreover, tlsDMD can predict the oscillation at the weaker noise levels. Although tlsDMD can predict a neutral oscillation at a stronger noise level, as shown in Fig 4, the phase of the oscillation of the reconstructed data is very different from the true value. On the other hand, optDMD and EKFDMD can successfully denoise the data, even though the noise level is very high.
The error level of the reconstructed data is quantitatively discussed in terms of Fig 5, which shows the normalized error of the reconstructed data. Fig 5 shows that the error decreases in the order of DMD, KFDMD, tlsDMD, EKFDMD, and optDMD, similar to the errors in the eigenvalues. This trend also shows that EKFDMD works reasonably well for simultaneous system identification and denoising of the data in a single run of the algorithm. The better performance of optDMD compared with EKFDMD originates from the difference between offline and online processing. Although we are interested in the performance for the case in which system noise is present, we hereinafter discuss the effects of parameters on this problem without system noise, before discussing the problem with system noise in Subsection 4.2.

Effect of the number of snapshots m.
Here, the parameter effects for the problem without system noise are considered. First, the effect of the number of snapshots m is investigated. Similar to the previous discussion, the errors in the eigenvalues and reconstructed data for DMD, tlsDMD, KFDMD, optDMD, and EKFDMD are calculated for various values of m for data with $\sigma_w^2 = 0.1$. These errors are evaluated over 100 runs and averaged for each algorithm. The error in the eigenvalues in Fig 6 shows that the errors of tlsDMD, EKFDMD, and optDMD basically decrease with increasing m (except for some bumps), while those of DMD and KFDMD do not. Interestingly, the error of EKFDMD decreases more rapidly; it is larger than that of tlsDMD for m ≤ 200 but smaller for m ≥ 300. This is because EKFDMD is a one-pass algorithm whose accuracy is not sufficiently high in the early stage but increases rapidly as more successive data are obtained. Note that both tlsDMD and optDMD are offline algorithms.
Then, the errors in the reconstructed data shown in Fig 7 are discussed. The errors of DMD, KFDMD, and tlsDMD do not change with m. The errors of DMD and KFDMD do not decrease because these methods do not predict the eigenvalues better as m increases, and the errors of tlsDMD do not decrease, despite the decrease in the error in the eigenvalues, because the data reconstructed with tlsDMD have a different phase due to the very strong observation noise in the initial snapshot, as discussed previously. On the other hand, the errors of EKFDMD and optDMD decrease because both algorithms find the best-fit data for reconstruction, and the accuracy of these data increases with the information from an increased number of snapshots.

Effect of mismatched error level for R.
Next, the effect of mismatched R settings is discussed for the case in which system noise is absent and Q is set to be 0. In the present study, we investigate the mismatched cases of $R = 10\sigma_w^2 I$ and $R = 0.1\sigma_w^2 I$, as well as the matched case of $R = \sigma_w^2 I$, the results of which are presented in the previous sections. The number of snapshots m is set to be 500. The errors are evaluated over 100 runs and averaged for each case, similar to the previous cases. The errors of EKFDMD in the eigenvalues and reconstructed data for mismatched R are shown in Figs 8 and 9, respectively. These figures show that the mismatched R does not affect the results, except for the strong-observation-noise case.

Problem with a small number of DoFs with system noise
Next, we consider a problem with system noise. In this problem, $v'$ is assumed to follow $\mathcal{N}(0, n\sigma_v^2/6)$, resulting in v following $\mathcal{N}(0, \sigma_v^2)$, and we vary $\sigma_v^2 = \sigma_w^2$ as 0.0001, 0.001, 0.01, and 0.1. The hyperparameter Q is set according to the system noise variance $\sigma_v^2$, and R is set to be $\sigma_w^2 I$. The number of snapshots m is set to be 500, and a total of 100 runs are conducted for each case. The plots of the estimated eigenvalues show that DMD and KFDMD do not accurately estimate the eigenvalues of the system when the noise level is high, although their accuracy is somewhat improved compared with the case without system noise. On the other hand, tlsDMD, optDMD, and EKFDMD appear to work better than DMD or KFDMD. This might be because the denoising built into the eigenvalue estimation of tlsDMD, optDMD, and EKFDMD works well for these data, so that more accurate eigenvalues of the system can be obtained. The system identification performance of EKFDMD appears to be as good as that of tlsDMD and optDMD in these plots. Finally, the errors of eigenvalue estimation are shown in Fig 12, which shows that tlsDMD, optDMD, and EKFDMD work better than DMD and KFDMD. Among tlsDMD, optDMD, and EKFDMD, tlsDMD works slightly better for $\lambda_1$ and $\lambda_2$, whereas the performance of EKFDMD is similar to that of tlsDMD for $\lambda_3$. This result illustrates that the system identification performances of tlsDMD, optDMD, and EKFDMD are approximately the same when system noise is present.
Then, the reconstruction using these algorithms, shown in Fig 13, is discussed. Similar to the cases without system noise, the data reconstructed by DMD and KFDMD are damped in the early stage. This is again because these algorithms predict damped modes. The data reconstructed by tlsDMD have a good oscillation amplitude, but their phases do not match those of the original data well. Although the data reconstructed by optDMD have good amplitude and phase, the data around peaks are sometimes not reconstructed. These errors around peaks are caused by system noise in the data, which optDMD cannot handle. Unlike the algorithms described above, the data reconstructed by EKFDMD show excellent agreement with the original data. This is because EKFDMD can handle data with system noise. This characteristic can be used for simultaneous system identification and denoising of data containing system noise. The error in the reconstructed data shown in Fig 14 clearly shows this characteristic.

Effects of the balance of system and observation noises.
In this subsubsection, the effects of the balance of system and observation noises in the observation data are discussed. The system noise variance $\sigma_v^2$ is set to be $10\sigma_w^2$ and $0.1\sigma_w^2$. Here, Q and R are correctly given in this problem. In both cases, test cases with $\sigma_w^2$ of 0.0001, 0.001, 0.01, and 0.1 are conducted, and the results of 100 runs with different random number seeds are averaged to obtain the error characteristics.
First, the case with strong system noise, $\sigma_v^2 = 10\sigma_w^2$, is discussed. The errors in the estimated eigenvalues shown in Fig 15 indicate that the errors of all of the algorithms are almost the same and do not decrease with decreasing noise level. This figure shows that advanced DMD methods do not significantly improve the estimation of eigenvalues for data with strong system noise. The error in the reconstructed data is shown in Fig 16. This figure shows that the error of EKFDMD is much smaller than the errors of the other algorithms. This indicates that EKFDMD can be used for noise reduction when the system noise is stronger than the observation noise.
Then, the case with the weaker system noise s 2 v ¼ 0:1s 2 w is discussed. Again, Q and R are correctly given in this problem. The error plots in Fig 17 show that the errors of tlsDMD, optDMD, and EKFDMD are approximately the same and are lower than those of DMD and KFDMD. This figure illustrates that advanced DMD methods improve the estimation ability of eigenvalues. The error in the reconstructed data is shown in Fig 18. This plot indicates that the errors decrease in the order of DMD and KFDMD (same as that of DMD), tlsDMD, optDMD, and EKFDMD. The figure also shows that EKFDMD performs better than optDMD, even if weaker system noise is present. This fact indicates that EKFDMD can be used for noise reduction in the range we investigated for the case in which system noise is present, regardless of its strength.
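The difference between the two noise sources can be made concrete with a minimal sketch. It assumes a linear model in which system noise enters the dynamics and observation noise is added to the measurement, x_{k+1} = A x_k + v_k and y_k = x_k + w_k. The 2-DoF rotation matrix `A`, the function name `simulate`, and the chosen value of `var_w` are illustrative assumptions, not the test problem used in this paper.

```python
import numpy as np

def simulate(A, x0, m, var_v, var_w, rng):
    """Return (states, observations) for x_{k+1} = A x_k + v_k, y_k = x_k + w_k."""
    n = x0.size
    x = np.empty((m, n))
    x[0] = x0
    for k in range(m - 1):
        # System noise perturbs the dynamics, so it accumulates in the state.
        v = rng.normal(0.0, np.sqrt(var_v), size=n)
        x[k + 1] = A @ x[k] + v
    # Observation noise only contaminates the measurement of each sample.
    w = rng.normal(0.0, np.sqrt(var_w), size=(m, n))
    return x, x + w

# Hypothetical stand-in system: a plane rotation (eigenvalues on the unit circle).
theta = 0.1
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x0 = np.array([1.0, 0.0])
var_w = 0.01
rng = np.random.default_rng(0)

# The two balances discussed in the text: var_v = 10 var_w and var_v = 0.1 var_w.
states_strong, obs_strong = simulate(A, x0, 500, 10.0 * var_w, var_w, rng)
states_weak, obs_weak = simulate(A, x0, 500, 0.1 * var_w, var_w, rng)
```

In this sketch, `obs_strong - states_strong` contains only observation noise, whereas the system noise is embedded in `states_strong` itself, which is why an algorithm that models only observation noise cannot account for it.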

Effects of mismatched error level for Q and R.
In this subsubsection, the effects of mismatched selection of Q and R are discussed. The system noise variance $\sigma_v^2$ is set to be the same as $\sigma_w^2$. First, the effect of mismatched Q is discussed. Fig 19 shows the results for mismatched Q. In this case, if Q is assumed to be zero, which corresponds to the assumption of no system noise, then the error becomes noticeably larger. On the other hand, if Q is set to be 10 times or 0.1 times the appropriate value, then the results are not significantly degraded. This indicates that the setting of Q does not significantly affect the results if the system noise is considered and Q is appropriately set to be on the order of $\sigma_v^2$. Then, the effect of mismatched R is discussed. The error in estimated eigenvalues shown in Fig 21 illustrates that the mismatched R does not significantly change the error, although errors for smaller R or R = 0 become slightly larger. Fig 22 shows the errors in reconstructed data with mismatched R. In this case, mismatched R does not significantly affect the results. This result shows that the setting of R does not significantly affect the results, similar to the mismatched Q cases.
Finally, the effects of mismatched Q and R, but with the condition $Q_{1,1} = R$, are investigated. The errors in the estimated eigenvalues and reconstructed data for these cases are shown in Figs 23 and 24, respectively. These figures show that the results do not change as long as the ratio of $Q_{1,1}$ to R does not change. As noted earlier, the ratio of Q to R should be carefully chosen in order to achieve accurate estimation.
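Why only the ratio of Q to R matters can be illustrated with a much simpler filter than EKFDMD. The sketch below uses a scalar linear Kalman filter, which is an assumption made for illustration only; the function name `kalman_gains` and all parameter values are hypothetical. It shows that the converged Kalman gain is unchanged when Q and R are scaled together and changes when their ratio changes.

```python
def kalman_gains(a, q, r, p0, steps):
    """Gain sequence of a scalar Kalman filter for x_{k+1} = a x_k + v_k, y_k = x_k + w_k."""
    p = p0
    gains = []
    for _ in range(steps):
        p_pred = a * p * a + q      # predicted error variance
        k = p_pred / (p_pred + r)   # Kalman gain
        p = (1.0 - k) * p_pred      # updated error variance
        gains.append(k)
    return gains

# Large initial variance, mirroring the 10^3 initial covariance used in the text.
base = kalman_gains(a=1.0, q=0.01, r=0.01, p0=1e3, steps=200)
# Q and R both multiplied by 100: the ratio Q/R is unchanged.
scaled = kalman_gains(a=1.0, q=1.0, r=1.0, p0=1e3, steps=200)
# Only Q multiplied by 10: the ratio Q/R changes.
ratio_changed = kalman_gains(a=1.0, q=0.1, r=0.01, p0=1e3, steps=200)

print(base[-1], scaled[-1], ratio_changed[-1])
```

For this scalar case with a = 1 and Q = R, the gain converges to about 0.618 for both `base` and `scaled`, while `ratio_changed` converges to a different value. The gain determines how strongly each observation corrects the estimate, so a simultaneous mismatch of Q and R that preserves their ratio leaves the filter behavior essentially unchanged.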

Problem with a moderate number of DoFs without system noise
Next, a similar problem, but with the number of DoFs extended to 200 by the same procedure, is adopted with the same noise levels. In this case, the computational cost is very high, and we conducted trPOD as a preconditioner. In this problem, first, the number of DoFs is reduced from 200 to 10 by trPOD, and the reduced data are processed by EKFDMD. On the other hand, for the purpose of comparison, DMD and tlsDMD are applied directly to the data for 200 DoFs, with the number of DoFs reduced to 10 internally, because these algorithms can treat a data matrix of this size within a reasonable computational time by inherently involving truncated SVD (same as trPOD). In this problem, 500 samples were given. Similar to the previous example, the diagonal elements of the covariance matrix were set to be $10^3$ in the initial condition. The diagonal elements of Q and R are set to be 0 and $\sigma_w^2$, respectively, and their nondiagonal elements are set to be 0.
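The trPOD preconditioning step can be sketched as a truncated SVD followed by projection onto the leading spatial modes. This is a minimal illustration under stated assumptions: the snapshot matrix is arranged as DoFs by samples, the data are a synthetic rank-10 stand-in rather than the test problem of this paper, and the names `truncated_pod`, `mixing`, and `latent` are hypothetical.

```python
import numpy as np

def truncated_pod(Y, r):
    """Project an (n_dof, m) snapshot matrix onto its r leading POD spatial modes.

    Returns the modes U_r with shape (n_dof, r) and the reduced data with shape (r, m).
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    U_r = U[:, :r]           # leading left singular vectors = POD spatial modes
    return U_r, U_r.T @ Y    # reduced coordinates passed on to the filter

rng = np.random.default_rng(0)
# Synthetic stand-in: 200 DoFs, 500 samples, underlying rank 10, small added noise.
latent = rng.standard_normal((10, 500))
mixing = rng.standard_normal((200, 10))
Y = mixing @ latent + 0.01 * rng.standard_normal((200, 500))

modes, Z = truncated_pod(Y, r=10)   # Z has 10 rows instead of 200
Y_back = modes @ Z                  # lift reduced (or filtered) data back to 200 DoFs
```

Because the SVD needs the full snapshot matrix before any sample can be projected, this step is what prevents the combined procedure from being fully online.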
The results of trPOD are shown in Fig 25, where the first POD spatial mode obtained by data without noise and that obtained by data with noise are plotted together. Note that the mode of the node distribution in snapshots is referred to as the POD spatial mode, which is analogous to fluid analysis. This plot indicates that the noise level is very high and that the estimation of the POD spatial mode is not accurate. However, the contaminated POD modes obtained by data with noise are used for EKFDMD.
The eigenvalues and their errors for this problem are shown in Figs 26, 27, and 28. Except for the condition with strong noise ($\sigma_w^2 = 0.1$), trPOD+EKFDMD works better than DMD, KFDMD, and tlsDMD, while optDMD works best. This characteristic does not change from the small-degree-of-freedom problem, as shown earlier. The degradation in performance of trPOD+EKFDMD for the very noisy condition might occur because the important signal is filtered out in the POD procedure. This characteristic is relaxed by increasing the number of POD modes, as shown later herein, but the number of POD modes is in a trade-off relationship with the computational cost. The reconstructed data are then shown in Fig 29. Even if we apply POD, the reconstructed data of trPOD+EKFDMD and optDMD agree well with the original data under all conditions, whereas DMD, KFDMD, and tlsDMD fail to capture the behavior of the original data in the severe noise cases. The error in the reconstructed data is shown in Fig 30. As shown earlier, the error of trPOD+EKFDMD is smaller than that of tlsDMD and is larger than that of optDMD. Thus, trPOD+EKFDMD works reasonably well in reconstructing the data even with the imperfect POD modes shown in Fig 25.

Effect of POD truncation.
For POD truncation, the rank number should be manually specified. Therefore, the effect of the rank number chosen by the user is investigated. Here, r = 6 and r = 20 are investigated, where the previous standard cases were computed with r = 10, as noted earlier. The errors in the estimated eigenvalues and reconstructed data of the r = 6 and r = 20 conditions are shown in Figs 31, 32, 33 and 34, respectively. For the case in which system noise is absent, the estimation of eigenvalues by trPOD+EKFDMD does not work well with r = 6 for $\sigma_w^2 \geq 0.01$, and the resulting error in reconstructed data is slightly worse than that for tlsDMD for all cases with different noise levels. This might be because trPOD filters out the important signal and trPOD+EKFDMD cannot recover the original signal for strong-noise cases. On the other hand, the errors in the estimated eigenvalues of trPOD+EKFDMD with the r = 20 setting are lower than or approximately the same as (and sometimes slightly higher than) those of tlsDMD, and the error in the reconstructed data of trPOD+EKFDMD with r = 20 is smaller than that of tlsDMD. Therefore, obtaining better performance from trPOD+EKFDMD requires a larger rank. This is a clear trade-off between the estimation accuracy and the computational cost.
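The size of this trade-off can be estimated with a short worked calculation. It assumes that the per-step cost C(r) of EKFDMD on the reduced data scales with the sixth power of the retained rank, which is the leading-order complexity estimate given in this paper; constant factors and lower-order terms are ignored, so the numbers are indicative only.

```latex
\[
\frac{C(r=20)}{C(r=10)} \approx \left(\frac{20}{10}\right)^{6} = 64,
\qquad
\frac{C(r=6)}{C(r=10)} \approx \left(\frac{6}{10}\right)^{6} \approx 0.047 .
\]
```

Under this assumption, doubling the rank from 10 to 20 costs roughly 64 times more per step, whereas reducing it to 6 saves about a factor of 20 at the price of the accuracy loss described above.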

Problem with a moderate number of DoFs with system noise
Next, we consider a similar problem in which system noise is included. The system noise variance $\sigma_v^2$ is set to be $\sigma_w^2$, similar to the small-DoF problem shown earlier. With regard to the EKFDMD procedure, trPOD is used as a preconditioner, similar to the previous subsection. Again, in this problem, the number of DoFs is reduced from 200 to 10 by trPOD, and the reduced data are processed by EKFDMD. On the other hand, DMD, tlsDMD, and optDMD are applied directly to the data for 200 DoFs, with the number of DoFs reduced to 10 internally. Moreover, in this problem, 500 samples were given. The diagonal elements of the covariance matrix are set to be $10^3$ in the initial condition. The diagonal elements of R and $Q_{1,1}$ are set to be $\sigma_w^2$ and $\sigma_v^2$, respectively, and the nondiagonal elements of R and $Q_{1,1}$ are set to be 0. [...] performance for trPOD+EKFDMD is not found in this case, together with the results later shown herein. Then, the reconstructed data are shown in Fig 38. Fig 38 illustrates that DMD, KFDMD, and tlsDMD fail to capture the behavior of the original data, while optDMD works reasonably well but sometimes fails to capture the behavior around peaks. Even if we apply the POD [...] These plots are similar to those with a truncated POD of r = 10, which indicates that the rank for the POD truncation does not significantly affect the results for the case in which system noise is present.

Application to a fluid problem
The simulation of a two-dimensional flow around a cylinder is conducted. The Mach number of the freestream velocity is set to be 0.3, and the Reynolds number based on the freestream velocity and the cylinder diameter is set to be 300. For the analysis, LANS3D, [25] which is an in-house compressible fluid solver, is adopted. A cylindrical computational mesh is used, with the numbers of the radial- and azimuthal-direction grid points being 250 and 111, respectively. A compact difference scheme [26] of the sixth order of accuracy is used for spatial derivatives, and a second-order backward differencing scheme converged by an alternating-direction-implicit symmetric-Gauss-Seidel method [27,28] is used for time integration. See Reference [29] for further details. The origin is set to be the center of the cylinder, and a resolved region (where the mesh density is finer) is set to be within 10d of the origin. Here, d is the diameter of the cylinder. For all DMD analyses, the quasi-steady flow data at x = [0, 10d], y = [−5d, 5d], which is in the wake region, are used. The data are mapped to an equally distributed 100 × 100 mesh. The DMD analyses processed 500 samples of five flow-through data with or without adding observation noise of $\mathcal{N}(0, \sigma_w^2)$, where the variance $\sigma_w^2$ is set to be 0.02. In the EKFDMD algorithm, the diagonal parts of the covariance matrix are initially set to be $10^3$, similar to previous problems. The diagonal elements of Q and R are set to be 0 and 0.02, respectively, while nondiagonal elements of Q and R are set to be 0.
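The data preparation described above can be sketched as follows: each field on the 100 × 100 mesh is flattened into one column of a snapshot matrix, and Gaussian observation noise with variance 0.02 is added. The travelling-wave field used here is a synthetic placeholder and not the simulated cylinder wake, and the variable names and the nondimensional time axis are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
nx = ny = 100      # equally distributed mesh
m = 500            # number of samples
var_w = 0.02       # observation noise variance

x = np.linspace(0.0, 10.0, nx)    # x/d in [0, 10]
y = np.linspace(-5.0, 5.0, ny)    # y/d in [-5, 5]
X, Yg = np.meshgrid(x, y, indexing="ij")
t = np.linspace(0.0, 5.0, m)      # placeholder time axis

# Placeholder field (NOT the computed flow): a wave convecting downstream.
fields = np.stack([np.sin(2.0 * np.pi * (X - tk)) * np.exp(-Yg**2) for tk in t])

# One flattened field per column: shape (10000, 500).
snapshots = fields.reshape(m, nx * ny).T

# Additive observation noise drawn from N(0, var_w).
noisy = snapshots + rng.normal(0.0, np.sqrt(var_w), size=snapshots.shape)
```

With 10000 DoFs per snapshot, applying the filter directly to `noisy` is impractical, which is why a rank reduction such as trPOD is applied first in this setting.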

Complexity and computational cost
In this section, the complexity and computational cost of EKFDMD are discussed. Here, multiplication for single elements is assumed to have a complexity of O(1), and the multiplication of matrices of size l × m and m × n is estimated to be O(lmn) under dense matrix computation. In the EKFDMD procedure, except when using trPOD as a preconditioner, the main computational cost comes from Eqs 36 and 37 for the prediction step and from Eqs 39, 38, 40 and 43 for the updating step. For each step, the computational complexity is summarized in Table 1. In total, the most significant complexity is considered to be $O(n^6)$ for one step. Therefore, if we have m samples, then the computational complexity for m time steps becomes $O(mn^6)$. The complexity and the required memory of EKFDMD are compared with those of the other algorithms in Table 2, where the estimates of the complexities of DMD and online DMD in the previous study [14] are adopted, and the complexity of KFDMD is estimated in the present study. In addition, Fig 50 shows the computational time for 500 samples with different DoF problems. MATLAB is used on an Intel Xeon E5620 2.4 GHz processor. The computational time is averaged over 20 runs for the small size of m < 50, whereas no averaging is performed for the larger sizes, although repeatability is confirmed. Both Table 2 and Fig 50 show that EKFDMD requires significant computational cost, and applying trPOD as a preconditioner is strongly recommended for the practical use of EKFDMD. In practical use, matrices F and H for EKFDMD are sparse, and the corresponding computational cost and memory of EKFDMD can be decreased by using the sparse-matrix routines of the software, as we did. However, as shown in Fig 50, the complexity of EKFDMD with the sparse-matrix routines is still higher than that of the other algorithms.
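The origin of the sixth power can be traced with a short worked estimate. It assumes that the filter state is augmented to hold the n state variables together with the n^2 entries of the matrix A, giving N entries in total, and that the dominant operation is a dense product of N × N matrices in the covariance prediction. The symbol N and this reading of the augmented state are assumptions made here for illustration; Table 1 gives the authoritative step-by-step breakdown.

```latex
\[
N = n + n^{2}, \qquad
P \in \mathbb{R}^{N \times N}, \qquad
\operatorname{cost}\!\left(F P F^{\mathsf{T}}\right) = O\!\left(N^{3}\right)
= O\!\left((n + n^{2})^{3}\right) = O\!\left(n^{6}\right),
\]
\[
n = 10:\; N = 110,\; N^{3} \approx 1.3 \times 10^{6};
\qquad
n = 200:\; N = 40200,\; N^{3} \approx 6.5 \times 10^{13}.
\]
```

Under these assumptions, reducing 200 DoFs to 10 before filtering lowers the per-step operation count by more than seven orders of magnitude, which is consistent with the recommendation to use trPOD as a preconditioner.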

Conclusions
A dynamic mode decomposition method based on the extended Kalman filter (EKFDMD) was proposed for simultaneous parameter estimation and denoising. The numerical experiments of the present study reveal that the proposed method can estimate the eigenstructure of the matrix A better than or as well as existing algorithms in its online procedure for a problem with a small number of DoFs, whereas EKFDMD simultaneously denoises the data. In particular, EKFDMD works better than existing algorithms for data reconstruction in the case in which system noise is present. However, this algorithm has the drawback of computational cost. This drawback is addressed by preconditioning with truncated POD (trPOD), though it prevents the algorithm from being fully online. Then, EKFDMD with trPOD is applied to a problem with a moderate number of DoFs and a fluid system. The performance of EKFDMD is slightly degraded by decreasing the rank number of trPOD in the case without system noise, while the performance does not change in the case with system noise regardless of the rank number. It should be noted that the overall performance of EKFDMD is preferable in the analysis of noisy data.