Improvement of Source Number Estimation Method for Single Channel Signal

This work investigates source number estimation methods for single channel signals and proposes an improvement for each method. First, the single channel data is converted to multi-channel form by a delay process. Then, algorithms from array signal processing, such as Gerschgorin's disk estimation (GDE) and minimum description length (MDL), are applied to estimate the source number of the received signal. Previous results have shown that MDL, which is based on information theoretic criteria (ITC), outperforms GDE at low SNR but cannot handle signals containing colored noise. In contrast, the GDE method can eliminate the influence of colored noise, but its performance at low SNR is unsatisfactory. To address these complementary weaknesses, a diagonal loading technique is employed to improve the MDL method, and a jackknife technique is used to optimize the data covariance matrix in order to improve the performance of the GDE method. Simulation results show that the performance of both original methods is substantially improved.


Introduction
The problem of single channel source number estimation (SCSNE) is widely investigated in many fields, e.g., image processing, fiber communication [1][2][3], nondestructive testing [4] and single channel blind signal separation (SCBSS). In recent years, SCBSS has been considered one of the most challenging research topics in these areas [5][6][7][8]. Numerous algorithms have been proposed to solve the SCBSS problem, such as multiple signal classification (MUSIC) [9], estimating signal parameters via rotational invariance techniques (ESPRIT) [10], and independent component analysis (ICA) [11]. However, the performance of these blind source separation (BSS) algorithms and signal spectrum estimation methods can deteriorate significantly when the source number is estimated inaccurately.
To process single channel signals, the data dimensions can be expanded so that the single channel data is converted to a multi-channel form. Many well-tested array signal processing algorithms can then be applied. Multiple sampling [12], sparse signal representation [13], delay processing [14] and many other methods can be used to extend the size of the data. Among them, the multiple sampling method requires the cooperation of the data sampling front end. The sparse decomposition method has a high computational complexity, and not all signals have sparse characteristics. The delay process is relatively simple and has low complexity; however, it needs relatively more snapshots.
Many algorithms for source number estimation can be found in the literature, such as methods based on hypothesis testing [15,16], information theoretic criteria (ITC) based approaches [17] and the Gerschgorin's disk estimation (GDE) method [18]. The hypothesis testing based methods require a manually set threshold, which is not easy to choose under some circumstances. Methods based on ITC rely on the differences between the eigenvalues of signal and noise, and are mainly represented by the Akaike information criterion (AIC) [17] and minimum description length (MDL) [19]. Although their computational complexity is relatively low, they cannot work with colored noise. The GDE method can deal with colored noise, but its performance degrades at low SNR.
In addition to the above methods, Keyong Han proposed a method based on the jackknife technique for array signal source number estimation [20]. The jackknife technique was introduced by Quenouille [21] to reduce estimator bias in a general context. It is one of the most frequently used resampling methods [22] and has been widely studied in many applications because of its simplicity [22][23][24][25]. A method has also been proposed which exploits eigenvectors instead of sample eigenvalues to detect the source number in array signal processing [26]. Nonetheless, it is not well suited to the single channel application. Jayme G. A. Barbedo presented a two-stage procedure to determine the number of sources present in a single-channel music signal [27]. Lei Huang proposed an MMSE-based MDL source number estimator that uses prior knowledge of a training sequence [28].
Although the AIC method achieves high detection performance at low SNR, it is not consistent. Owing to its simplicity and consistency, the MDL estimator has become the standard tool for estimating the number of sources [29]. This paper details the application of the MDL and GDE methods to estimating the source number of a single channel received signal. The single channel data is converted into a multi-channel form by a delay process. Simulation results indicate that the MDL method cannot estimate the source number effectively when the received signal contains colored noise. The GDE method underperforms at low SNR, though it can cope with colored noise. To improve the detection performance of these methods, a diagonal loading technique is introduced for the MDL method, and the jackknife technique is used to optimize the data covariance matrix for the GDE method. The simulation results show that the proposed methods can effectively estimate the source number of the single-channel received signal and that they clearly improve on the original methods.
The remainder of this paper is structured as follows: Section 2 introduces the model of the single channel received signal. Section 3 presents the single channel dimension expansion method. Section 4 describes the source number estimation algorithms and the improvements proposed in this paper. Section 5 reports the experimental results that verify the performance of the improved methods, followed by the conclusions in Section 6.

Signal Model
A multi-channel received signal model with p channels and q source signals is outlined as follows [30][31][32][33].
In the single channel condition, only one data sample can be observed at each time point. The signal model then becomes [34,35] where T is a source signal matrix. Noise and signal are independent. Parameter estimation for a single channel signal is an underdetermined problem, so it is difficult to process the single channel signal directly.
On the other hand, the estimation of the source number in array signal processing is relatively simple and convenient. Therefore, a reasonable approach is to convert the single channel problem into an array signal processing problem.
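For orientation, the two models referred to in this section are commonly written in the following form. This is a sketch of the standard linear mixing model, not a transcription of this paper's own equations; the mixing matrix $A$, the mixing coefficients $a_i$ and the noise terms $N(t)$ and $n(t)$ are notational assumptions.

```latex
% Multi-channel model (assumed standard form): X(t) is p x 1, A is p x q, S(t) is q x 1
X(t) = A\,S(t) + N(t)

% Single-channel model (assumed standard form): one scalar observation per time instant
y(t) = \sum_{i=1}^{q} a_i\, s_i(t) + n(t) = a^{T} S(t) + n(t),
\qquad S(t) = [s_1(t), \ldots, s_q(t)]^{T}
```

In this form, a single scalar observation per time instant has to account for q unknown source samples, which is why the single channel problem is underdetermined.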

Construction of Multi-Channel
The single channel received data is a one-dimensional vector. In order to make use of the multi-channel source number estimation algorithms, it is essential to expand the dimension of the data matrix. This paper uses a delay process to construct the multi-channel data matrix from the single channel received data. Assuming that the single-channel data are denoted as y(n), n = 1, 2, ..., L, the received signal can be expressed as follows with the delay process.
Therefore, an N-channel received data matrix is formed.
where d denotes the delay length of each channel and N is the total number of channels, which must not be less than the number of sources. The data of each constructed channel is passed through a filter so that it more realistically resembles the response of a received channel. The frequency response of each filter is set as follows, where $|H_0(e^{j\omega_0})|$ represents the amplitude-frequency response of a prototype FIR filter and $\varphi_i(\omega)$ denotes the phase-frequency response of each channel. Therefore, the final multi-channel data matrix can be expressed as follows.
where $h_i$, $i = 1, 2, \ldots, l$, are the impulse response functions of the filters for each channel. With the process described above, a single channel received signal with 1 × L received data is transformed into a multi-channel signal with N × (d + l − 1) dimensions.
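The delay construction can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `delay_embed` and the `shift` parameter are hypothetical, the default of non-overlapping blocks (`shift = d`) is only inferred from the 8 × 600 matrix built from L = 5000 samples in the experiments, and the per-channel FIR filtering step is omitted.

```python
import numpy as np

def delay_embed(y, n_channels, d, shift=None):
    """Stack delayed length-d segments of a 1-D signal into an (n_channels, d) matrix.

    Channel i holds y[i*shift : i*shift + d]. The default shift = d
    (non-overlapping blocks) is an assumption, not taken from the paper.
    """
    y = np.asarray(y)
    if shift is None:
        shift = d
    needed = (n_channels - 1) * shift + d
    if needed > y.size:
        raise ValueError(f"need at least {needed} samples, got {y.size}")
    return np.stack([y[i * shift : i * shift + d] for i in range(n_channels)])
```

For example, `delay_embed(y, 8, 600)` applied to 5000 samples uses the first 4800 of them and returns an 8 × 600 matrix.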

Source Number Estimation Algorithms
MDL method
Assume that the noise and the source signals are mutually independent and uncorrelated. Then the covariance matrix of the observed signal can be rewritten as follows.
where $R_s = E\{S(t)S^T(t)\}$ is the signal covariance matrix and $\sigma^2 I$ represents the covariance matrix of the noise. $R_X$ is then transformed by eigenvalue decomposition (EVD) to obtain its eigenvalues.
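In practice the covariance matrix is estimated from a finite number of snapshots. The following sketch shows one common way to do this; it assumes zero-mean data and the usual sample covariance normalised by the number of snapshots, and the function name `covariance_eigenvalues` is hypothetical.

```python
import numpy as np

def covariance_eigenvalues(X):
    """Return the sample covariance of X (channels x snapshots) and its
    eigenvalues sorted in descending order. Assumes zero-mean rows."""
    X = np.asarray(X)
    n_snapshots = X.shape[1]
    R = X @ X.conj().T / n_snapshots
    # eigvalsh is for Hermitian matrices and returns ascending eigenvalues
    eigvals = np.linalg.eigvalsh(R)[::-1]
    return R, eigvals
```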
However, due to the limited number of snapshots and the impact of multipath propagation, the eigenvalues of noise and signal are not clearly separated. Some criterion has to be introduced to estimate the number of sources precisely. MDL, proposed by Rissanen [19,36], selects the model according to the concept of the shortest code length. The estimation function of MDL can be written as follows.
where $\hat{\Theta}$ represents the maximum likelihood estimate of the parameters $\Theta$ and k is the number of degrees of freedom of the parameter vector $\Theta$. N is the number of samples and $f(X \mid \Theta)$ indicates the joint probability density of X. The first term on the right-hand side is the likelihood term for the estimated model parameters, and the second term is the penalty function. When MDL is applied as a source number estimation criterion, the model is expressed as follows [37]: $\mathrm{MDL}(k) = -N(M-k)\log f(k) + \ldots$ The number of sources is obtained by minimizing the MDL function.
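A minimal sketch of an eigenvalue-based MDL estimator follows. It assumes the widely used Wax-Kailath form, in which $f(k)$ is the ratio of the geometric mean to the arithmetic mean of the $M-k$ smallest eigenvalues and the penalty is $\tfrac{1}{2}k(2M-k)\log N$; this may differ in detail from the expression in [37]. The function name `mdl_source_number` is hypothetical.

```python
import numpy as np

def mdl_source_number(eigvals, n_snapshots):
    """Estimate the source number by minimising an MDL score over k = 0..M-1.

    Assumed form: MDL(k) = -N (M-k) log f(k) + 0.5 k (2M-k) log N, with f(k)
    the geometric/arithmetic mean ratio of the M-k smallest eigenvalues.
    All eigenvalues must be strictly positive.
    """
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    M = lam.size
    scores = np.empty(M)
    for k in range(M):
        noise = lam[k:]
        # log f(k) = mean(log lambda) - log(mean lambda) <= 0
        log_f = np.mean(np.log(noise)) - np.log(np.mean(noise))
        penalty = 0.5 * k * (2 * M - k) * np.log(n_snapshots)
        scores[k] = -n_snapshots * (M - k) * log_f + penalty
    return int(np.argmin(scores)), scores
```

When the noise eigenvalues are equal, $\log f(k)$ vanishes for every k at or above the true source number, so only the penalty separates those candidates and the smallest such k wins. Unequal noise eigenvalues, as produced by colored noise, break this property.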
The MDL method estimates the source number from the difference between the signal and noise eigenvalues. Its computational complexity is relatively low, but it cannot process signals with colored noise. This paper introduces the diagonal loading technique to overcome the influence of the colored noise [38][39][40].
where $\lambda_i$, $i = 1, 2, \ldots, p$, are the original eigenvalues of $R_X$, $\lambda_{DL}$ is the loading value, and $\beta_i$ are the final eigenvalues after diagonal loading. Diagonal loading has little effect on the larger eigenvalues, which correspond to the source signals, while the smaller eigenvalues, which correspond to the noise, converge toward the loading value $\lambda_{DL}$. Therefore, after diagonal loading the noise eigenvalues are approximately equal, and the effect is equivalent to whitening the colored noise. The choice of the loading value $\lambda_{DL}$ has a great influence on this method: a value that is too small brings little improvement to the estimation, whereas a value that is too large may destroy the difference between the noise and signal eigenvalues.
Therefore, the loading value $\lambda_{DL}$ should satisfy the following conditions.
where $\lambda_N$ denotes a noise eigenvalue and $\lambda_S$ denotes a signal eigenvalue. To meet this condition, a feasible selection of $\lambda_{DL}$ is described as [38] $\lambda_{DL} = \ldots$ Then the eigenvalues of $R_X$ are transformed as follows by diagonal loading.
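The whitening effect of diagonal loading can be illustrated numerically. The sketch below assumes the usual additive form, in which adding $\lambda_{DL} I$ to a Hermitian matrix shifts every eigenvalue by $\lambda_{DL}$; the function name, the example eigenvalues and the loading value of 10 are all illustrative assumptions and are not the selection rule of [38].

```python
import numpy as np

def diagonally_loaded_eigenvalues(R, loading):
    """Eigenvalues (descending) of R + loading * I for a Hermitian matrix R."""
    R = np.asarray(R)
    loaded = R + loading * np.eye(R.shape[0])
    return np.linalg.eigvalsh(loaded)[::-1]

# Illustrative covariance: 3 strong "signal" eigenvalues and 5 unequal
# "colored noise" eigenvalues (values chosen only for this example).
lam = np.array([100.0, 80.0, 60.0, 2.0, 1.5, 1.2, 0.8, 0.5])
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # random orthogonal basis
R = Q @ np.diag(lam) @ Q.T
R = (R + R.T) / 2                                  # enforce exact symmetry

beta = diagonally_loaded_eigenvalues(R, loading=10.0)

# Ratio of largest to smallest noise eigenvalue before and after loading
noise_spread_before = lam[3:].max() / lam[3:].min()
noise_spread_after = beta[3:].max() / beta[3:].min()
```

In this example the spread of the noise eigenvalues shrinks from a factor of 4 to roughly 1.14, while the signal eigenvalues remain far above the noise group.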

GDE method
For a p × p matrix R, $r_{ij}$ is the element in row i and column j. The sum of the absolute values of the elements in the ith row, excluding the ith (diagonal) element, is defined as $C_i = \sum_{j=1, j \neq i}^{p} |r_{ij}|$. The ith disk $O_i$ is the circle in the complex plane with center $r_{ii}$ and radius $C_i$.
Gerschgorin proved that the eigenvalues of matrix R are contained in the union of the disks $O_i$, $i = 1, \ldots, p$, so every eigenvalue $\lambda$ of R satisfies at least one of the inequalities $|\lambda - r_{ii}| \le C_i$.
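The disk theorem is easy to check numerically. The following sketch computes the disk centers and radii of a square matrix; the function name `gerschgorin_disks` is hypothetical.

```python
import numpy as np

def gerschgorin_disks(R):
    """Return (centers, radii) of the Gerschgorin row disks of a square matrix.

    centers[i] is the diagonal entry r_ii; radii[i] is the sum of the
    absolute values of the off-diagonal entries in row i.
    """
    R = np.asarray(R)
    centers = np.diag(R)
    radii = np.sum(np.abs(R), axis=1) - np.abs(centers)
    return centers, radii
```

Each eigenvalue `lam` of `R` then satisfies `np.any(np.abs(lam - centers) <= radii)`, up to rounding error.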
The signal covariance matrix R is expressed as follows by EVD: $R = U \Lambda U^H$.
where U is a p-order unitary matrix constructed from the eigenvectors of the covariance matrix R, $U = [u_1, u_2, \ldots, u_p]$, $UU^H = I$, and $\Lambda$ is a p-order diagonal matrix whose diagonal elements are the eigenvalues of R, $\Lambda = \mathrm{diag}[\lambda_1, \ldots, \lambda_p]$. R can be rewritten in the form of a block matrix.
Here, $R_1$ is the (p − 1)-order submatrix obtained by deleting the last row and column of R, and r is the column vector formed by the first p − 1 elements of the pth column of R, that is, $r = [r_{1p}, \ldots, r_{(p-1)p}]^T$. $R_1$ can be decomposed as follows by EVD.
where $U_1$ is a (p − 1)-order unitary matrix composed of the eigenvectors of $R_1$, $U_1 = [u'_1, \ldots, u'_{p-1}]$, and $\Lambda_1$ is a diagonal matrix whose diagonal elements are the eigenvalues of $R_1$, sorted in descending order. The relationship between the eigenvalues of $R_1$ and R can be written as follows.
Then a p-order matrix $U_d$ is constructed.
The covariance matrix R is transformed with $U_d$. From the above equation, the radii of the first p − 1 Gerschgorin disks of matrix S can be written as follows. The noise eigenvectors $u'_i$, $i = q+1, \ldots, p-1$, of matrix $R_1$ are orthogonal to the column vectors of $A_1$, so $C_i = 0$, $i = q+1, \ldots, p-1$. The signal eigenvectors, however, are not orthogonal to $A_1$, and $R_s$ is a full-rank matrix, so $C_i \neq 0$, $i = 1, \ldots, q$. Therefore, the covariance matrix S can be further simplified.
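For reference, the transformation described in this subsection is usually written as below in the GDE literature. This is a sketch of that standard form, assumed rather than confirmed to match the equations of this paper term by term; $r_{pp}$ denotes the last diagonal element of R, and a Hermitian R is assumed.

```latex
% Block partition of R, with R_1 the leading (p-1) x (p-1) submatrix
R = \begin{bmatrix} R_1 & r \\ r^{H} & r_{pp} \end{bmatrix},
\qquad R_1 = U_1 \Lambda_1 U_1^{H}

% Unitary transformation built from the eigenvectors of R_1
U_d = \begin{bmatrix} U_1 & 0 \\ 0^{T} & 1 \end{bmatrix},
\qquad
S = U_d^{H} R\, U_d
  = \begin{bmatrix} \Lambda_1 & U_1^{H} r \\ r^{H} U_1 & r_{pp} \end{bmatrix}

% Radii of the first p-1 Gerschgorin disks of S
C_i = \left| u_i'^{H} r \right|, \qquad i = 1, \ldots, p-1
```

Because the upper-left block of S is diagonal, the only off-diagonal entry in each of the first p − 1 rows is the corresponding element of $U_1^H r$, which is why each radius reduces to a single inner product.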
The Gerschgorin disks of S are split into two groups. The disks with nonzero radii and larger centers belong to the actual signals, and the others correspond to the noise. Therefore, the number of Gerschgorin disks of S with nonzero radius is equal to the estimated source number. In practical applications, due to the limited number of snapshots, there is some bias and the disk radii of the noise are not always zero. The GDE criterion is therefore used to determine the number of sources.
where D(N) is the adjustment factor, with a value between 0 and 1. Eq (27) is evaluated from k = 1 to k = p; when GDE(k) becomes negative for the first time, the calculation stops and the estimated number of sources is k − 1.
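The decision rule can be sketched as follows. The sketch assumes the commonly used criterion $\mathrm{GDE}(k) = C_k - \frac{D(N)}{p-1}\sum_{i=1}^{p-1} C_i$ with radii $C_i = |u_i'^H r|$, which may differ in detail from Eq (27); the function name, the default `adjust=0.5` and the fallback of returning p − 1 when no negative value occurs are assumptions. Only the p − 1 available radii are scanned.

```python
import numpy as np

def gde_source_number(R, adjust=0.5):
    """Estimate the source number from transformed Gerschgorin radii.

    R is a p x p Hermitian covariance matrix; adjust plays the role of the
    adjustment factor D(N) in (0, 1). Returns (estimate, radii).
    """
    R = np.asarray(R)
    p = R.shape[0]
    R1 = R[: p - 1, : p - 1]          # leading (p-1) x (p-1) block
    r = R[: p - 1, p - 1]             # first p-1 entries of the last column
    eigvals, U1 = np.linalg.eigh(R1)  # ascending eigenvalues
    U1 = U1[:, np.argsort(eigvals)[::-1]]  # reorder to descending
    radii = np.abs(U1.conj().T @ r)   # C_i = |u_i'^H r|
    threshold = adjust * radii.mean() # D(N)/(p-1) * sum of radii
    for k, radius in enumerate(radii, start=1):
        if radius - threshold < 0:    # first negative GDE(k)
            return k - 1, radii
    return p - 1, radii               # assumed fallback: no negative value
```

For an exact covariance matrix the noise radii are zero, so the scan stops at the first noise disk; with a finite number of snapshots the outcome depends on how the adjustment factor is chosen.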
The GDE method obtains the source number from the sizes of the Gerschgorin radii of the transformed covariance matrix. Compared with methods based on ITC, it has the advantage of remaining well-behaved in the presence of colored noise.
However, the GDE method is restricted by its poor performance at low SNR. This paper introduces the jackknife technique to optimize the data covariance matrix, so that the influence of noise is reduced and the performance of GDE at low SNR is enhanced.
The jackknife technique is an effective strategy used in statistics. It can take full advantage of the received data to achieve more accurate detection and estimation [20]. In the jackknife, the data structure is reconstructed by deleting a part of the data each time, which can reduce the bias of the estimator.
An N × d data matrix has been obtained from the 1 × L single-channel data by the delay process.
where Y is regarded as a set with N elements, $Y = [Y_1, Y_2, \ldots, Y_N]^T$. M elements are chosen randomly from Y to form a subset X.
Repeating the above process K times yields a series of covariance matrices $R_j$, $j = 1, 2, \ldots, K$. The average of these matrices is then calculated.
where $\bar{R}_a$ is the optimized covariance matrix, which is used to estimate the source number by GDE.
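The resampling and averaging steps can be sketched as below. Several details are assumptions, since the corresponding equations are not reproduced here: the elements $Y_i$ are taken to be the rows of Y, the M rows are drawn without replacement and kept in their original order, and each $R_j$ is the sample covariance normalised by the number of columns. The function name `jackknife_covariance` is hypothetical.

```python
import numpy as np

def jackknife_covariance(Y, m, n_repeats, rng=None):
    """Average the m x m sample covariances of random m-row subsets of Y.

    Y has shape (N, d). Row-wise subsampling without replacement, sorted
    row order and normalisation by d are assumptions of this sketch.
    """
    rng = np.random.default_rng(rng)
    Y = np.asarray(Y)
    n_rows, n_cols = Y.shape
    acc = np.zeros((m, m), dtype=np.result_type(Y.dtype, np.float64))
    for _ in range(n_repeats):
        rows = np.sort(rng.choice(n_rows, size=m, replace=False))
        X = Y[rows]                       # subset X with m of the N rows
        acc += X @ X.conj().T / n_cols    # covariance R_j of this subset
    return acc / n_repeats                # averaged covariance matrix
```

The result is Hermitian and positive semidefinite, since it is an average of matrices with those properties, so it can be passed directly to an eigenvalue-based estimator.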

Experimental Studies
This section gives some experimental results to evaluate the performance of each source number estimation method mentioned above.
The first experiment compares the estimation performance of MDL and GDE when the received signal contains white Gaussian noise. The single channel received signal is assumed to consist of 3 independent source signals, which are constructed in the Matlab environment. The observation data length is set to L = 5000, and the delay is d = 600. Through the delay process, the number of virtual channels is M = 8. Consequently, the single-channel data is expanded to an 8 × 600 data matrix. The SNR of the received signal is varied from -20 dB to 20 dB in 2 dB increments, and the experiment is repeated 500 times for each condition. The result of the first experiment is shown in Fig 1. It can be inferred from Fig 1 that both the MDL and GDE methods achieve good estimation performance with white Gaussian noise. However, because the Gerschgorin disk radii are relatively easily influenced by noise, the performance of the GDE method is slightly worse than that of MDL at low SNR.
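The SNR sweep of this experiment can be set up as in the following sketch, written in Python rather than the Matlab environment used in the paper. It only shows how white Gaussian noise can be added at a target SNR and how the SNR grid is formed; the function name `add_awgn` is hypothetical, and SNR is assumed to be the ratio of average signal power to noise variance.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Add white Gaussian noise so that the average SNR equals snr_db.

    SNR is taken as mean signal power over noise variance (an assumption).
    """
    rng = np.random.default_rng(rng)
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# SNR grid from -20 dB to 20 dB in 2 dB steps, as in the experiment
snr_grid_db = np.arange(-20, 21, 2)
n_trials = 500  # Monte Carlo repetitions per SNR value
```

The detection probability at each SNR is then the fraction of the 500 trials in which the estimated source number equals the true value.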
The second experiment compares the estimation performance of MDL and GDE with colored noise. The conditions of this experiment are the same as in the first one, except that the white noise is replaced with colored noise. The results of 500 Monte Carlo trials are provided in Fig 2. They show that the MDL method loses its estimation ability with colored noise, whereas the GDE method still performs well.
The third experiment compares the estimation performance of MDL, improved MDL, GDE and improved GDE with colored noise. The experimental conditions are the same as in the first and second experiments. Fig 3 shows the detection probability of these methods at different SNRs.
It can be concluded that the improvements proposed in this paper substantially enhance the performance of the MDL and GDE methods. The improved MDL method can eliminate the effect of colored noise, and the improved GDE method performs better at low SNR.

Conclusion
Enumerating the sources in single channel received data is a widespread and challenging problem in many fields. For example, in nondestructive testing the source number corresponds to the number of defects in the material under test, such as multiple fatigue cracks in a rail [41]. It should therefore be estimated accurately for further defect quantification [42]. This paper investigates the MDL and GDE methods to solve this problem. In order to make effective use of algorithms from array signal processing, the single channel received data is transformed into multi-dimensional form by a delay process and the data covariance matrix is constructed. The main contribution of this paper is the use of the diagonal loading technique to improve the MDL method so that it can eliminate the influence of colored noise. Another contribution is the use of the jackknife technique to improve the performance of the GDE method at low SNR. The experimental results show that the improved MDL method can eliminate the effect of colored noise and that the improved GDE method outperforms the original one.