Quantitative Assessment of the Log-Log-Step Method for Pattern Detection in Noise-Prone Environments

Staircase-like structures in the log-log correlation plot of a time series indicate patterns against a noisy background, even under conditions of strong jitter. We analyze the method for different jitter-noise combinations, using quantitative criteria to measure its performance. A phase diagram shows the remarkable potential of this method even under very unfavorable conditions of noise and jitter. Moreover, we provide a novel and compact analytical derivation of the upper and lower bounds on the number of steps observable in the ideal noiseless case, as a function of pattern length and embedding dimension. The quantitative measure developed, combined with the ideal bounds, can serve as a guideline for determining potential periodicity in noisy data.


Introduction
The detection of patterns against a noisy signal background is a particularly important task in engineering and neuroscience [1][2][3][4][5][6][7]. Traditional approaches either break down quickly under these conditions (Fourier analysis) or are far too ambiguous to be helpful from first principles (template-matching methods). Here, we assess the usefulness of an auxiliary tool. By providing information on the length and on metric aspects of putative patterns enclosed in a time series, the tool can guide the search for patterns. Although the power of this method has been demonstrated by example earlier [8,9], no quantitative overview of its efficacy has been available so far. In the present contribution, we introduce such a quantification.
Given a time series $\{a_1, a_2, \ldots\}$ embedded in $m$-dimensional space using the standard coordinate-delay construction $x^{(m)}_k = (a_k, a_{k+1}, \ldots, a_{k+m-1})$ [10][11][12][13][14], steps may emerge in the log-log plot of the correlation integral $C^{(m)}_N(\epsilon)$, instead of the straight line needed for the evaluation of the fractal dimension and correlation [15][16][17][18]. These steps emerge if the embedded points follow a simple generating pattern. Simple generating patterns lead to clusters of points in the embedding space that, in turn, lead to a sudden increase in the log-log plot of the point densities. This can be seen by choosing a random reference data point. Around this point, we enlarge the neighborhood radius $\epsilon$, counting the points that fall into this neighborhood. After reaching a cluster of points, the count $C(\epsilon)$ quickly increases with $\epsilon$, which leads to a step-like structure in the plot of $C(\epsilon)$. Given a time series generated from a noise-free pattern of length $n$ and using the maximum norm, these steps are sharp, and the number of steps visibly decreases with $m$. From the way these steps propagate through the different embedding dimensions, we are able to derive upper and lower bounds on the observable number of steps appearing under ideal conditions as follows. For odd $n$, the lower bound $t$ and the maximal number $s$ of steps are
$$t(n,m) = \frac{n-1}{2} \cdot \left\lceil \frac{n}{m} \right\rceil$$
and
$$s(n,m) = \frac{n-1}{2} \cdot \bigl(n-(m-1)\bigr).$$
For even $n$, the lower bound $t$ and the maximal number of steps $s$ have the form $t(n,m) = \ldots$ and $s(n,m) = \ldots$ These results extend and detail insights from previous approaches [9].
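The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' C++/Mathematica implementation: the function names, the `n_ref` and `seed` parameters, and the choice of a strict inequality for counting neighbors are all assumptions made for the example.

```python
import bisect
import random


def delay_embed(a, m):
    """Delay vectors x_k = (a_k, ..., a_{k+m-1}) of the series a."""
    return [tuple(a[k:k + m]) for k in range(len(a) - m + 1)]


def max_norm(x, y):
    """Maximum-norm distance between two embedded points."""
    return max(abs(xi - yi) for xi, yi in zip(x, y))


def correlation_integral(points, eps_values, n_ref=None, seed=0):
    """Fraction of (reference, other) pairs with distance < eps, per eps.

    n_ref=None uses every point as a reference; otherwise n_ref reference
    points are drawn at random (hypothetical parameter names).
    """
    rng = random.Random(seed)
    indices = range(len(points))
    refs = list(indices) if n_ref is None else rng.sample(indices, n_ref)
    dists = sorted(
        max_norm(points[i], points[j]) for i in refs for j in indices if j != i
    )
    return [bisect.bisect_left(dists, eps) / len(dists) for eps in eps_values]


# Noise-free period-three example: three steps at m = 1, one step at m = 3.
series = [3200, 7700, 1000] * 30
eps_grid = [1, 2201, 4501, 6701]  # just above 0 and each pattern distance
c1 = correlation_integral(delay_embed(series, 1), eps_grid)
c3 = correlation_integral(delay_embed(series, 3), eps_grid)
```

For the noise-free pattern, `c1` increases at each of the three distinct componentwise distances, whereas `c3` stays flat until the single maximum-norm distance is reached, which is the decrease of the step count with $m$ described in the text.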
By searching for steps, we can not only pin down data that are likely to contain patterns. With the help of the table presented in Fig. 1, we can also infer the length of putative patterns.

Method validation
To what extent is the method reliable? In realistic time series, especially in neuroscience, a regular signal will be contaminated by jitter and noise. Jitter is commonly defined as the addition of an amount of signed (or unsigned) noise to the signal. Under its influence, a period-three signal of interspike intervals (ISIs) $\{3200, 7700, 1000\}$ may change into a time series such as $\{3223, 7703, 907, 3203, 7782, 903, 3107, 7603, 1098, \ldots\}$. For this example, we added a jitter of 10 percent of the smallest ISI to the data, drawn from a uniform probability distribution. Alternatively, Gaussian or long-tailed distributions can be considered, which leads to only negligible differences in the range of interest. Noise is implemented by drawing a given percentage of the ISIs from some random probability distribution. This can be achieved in two ways, reflecting different ways in which the regularity-generating network is linked to the noise-generating part of the network: a) We can choose the next signal event with probability $p$ from the regular pattern and with probability $(1-p)$ from the random distribution. b) Alternatively, with probability $p'$ the whole regular pattern of length $n$ provides the next $n$ signals, whereas with probability $1-p'$ the signal event is drawn from the random distribution (for a fair comparison between the two paradigms, the probabilities must be rescaled as $p' = p/(n-(n-1)p)$). Motivated by neuroscience applications, we focus here on the second paradigm.
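A minimal sketch of paradigm (b) with uniform jitter is given below. The text does not specify the noise distribution, so the uniform draw between the smallest and largest pattern ISI is purely an assumption of this example, as are the function and parameter names.

```python
import random


def rescale_probability(p, n):
    """p' = p / (n - (n - 1) p), the rescaling quoted in the text."""
    return p / (n - (n - 1) * p)


def generate_isis(pattern, n_events, p, jitter_frac, seed=0):
    """Jittered, noise-contaminated ISI series following paradigm (b).

    p is the target fraction of pattern events; jitter_frac is the jitter
    amplitude as a fraction of the smallest ISI. Noise events are drawn
    uniformly between min(pattern) and max(pattern) (an assumption).
    """
    rng = random.Random(seed)
    p_prime = rescale_probability(p, len(pattern))
    jitter = jitter_frac * min(pattern)
    lo, hi = min(pattern), max(pattern)
    out = []
    while len(out) < n_events:
        if rng.random() < p_prime:
            # The whole pattern provides the next n events, each jittered.
            out.extend(a + rng.uniform(-jitter, jitter) for a in pattern)
        else:
            # A single event drawn from the noise distribution.
            out.append(rng.uniform(lo, hi))
    return out[:n_events]


isis = generate_isis([3200, 7700, 1000], n_events=9, p=1.0, jitter_frac=0.1)
```

The rescaling makes the long-run fraction of pattern events equal to $p$: each pattern draw contributes $n$ events and each noise draw one, so the fraction is $np'/(np' + 1 - p')$, which equals $p$ for the quoted $p'$.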
Upon the addition of jitter and noise, the steps gradually smear out and finally may no longer be visible. An example of a log-log plot displaying step-like behavior is shown in Fig. 2. The following analysis focuses on a pattern of length $n = 3$. The analysis has, however, also been performed for patterns of length 5 and partially for length 7, with comparable results. Longer patterns have received little attention in experimental time series [19], [20].
In the log-log plot, jitter predominantly smooths out the steps, whereas noise decreases the heights as well as the slopes of the stairs. We assess the ability of our method to highlight regular patterns of length $n$ in jitter- and noise-contaminated data with the help of three criteria: a) How well can the predicted decrease of the number of steps with the embedding dimension $m$ be evidenced? b) How well can exactly $n$ steps in the embedding dimension $m = 1$ be detected? c) How well is a plateau, the flat part of the graph prior to the step, expressed at embedding dimension $m = n$ compared to that observed at $m = n+1$ [9]?
For the first criterion, we verified whether the predicted decrease of the number of steps as a function of $m$ was observed. To this end, we tested whether a single vertical step was visible at $m = n$. For this we preset five height levels $h$ with corresponding quality weights $w$ (denoted $\{h, w\}$), $\{\{1.0, 2\}, \{1.2, 3\}, \{1.5, 3\}, \{2.0, 2\}, \{3.0, 1\}\}$, all of which the derivative of the log-log plot would exceed in the ideal case. Given a particular preset height level, we rewarded the detection of exactly one peak in the derivative (corresponding to a sharp step-like increase in the original plot) with the level's corresponding weight and used the resulting sum over the height levels as a 'quality' measure. Added noise, however, may trigger a reappearance of the theoretically vanishing steps at and beyond the embedding dimension at which only one step should emerge. To handle this problem, if two or three steps emerged in the data, we compared the time series with surrogate (i.e., randomly permuted) series, in which the repeated steps emerge most pronouncedly. To characterize the distance from the surrogate case, the quality of the time series data was subtracted from the surrogate quality. The final 'quality' measure was thus composed as the sum of a first measure for the visibility of exactly one step and a second term which, being nonzero only in the case of two or three observed steps, reflects the distance from the surrogate case.
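The first term of this quality measure might be sketched as follows. The text does not define what counts as a "peak", so the definition used here (a maximal run of consecutive derivative samples above the level $h$) is an assumption, and the surrogate-based second term is deliberately left out. All names are hypothetical.

```python
# Level-weight pairs {h, w} quoted in the text for criterion (a).
LEVELS_A = [(1.0, 2), (1.2, 3), (1.5, 3), (2.0, 2), (3.0, 1)]


def count_peaks(deriv, h):
    """Number of maximal runs of consecutive samples with deriv > h.

    This run-based peak definition is an assumption of the sketch.
    """
    peaks, inside = 0, False
    for d in deriv:
        if d > h and not inside:
            peaks += 1
        inside = d > h
    return peaks


def peak_quality(deriv, levels=LEVELS_A, target=1):
    """Sum of the weights of all levels at which exactly `target` peaks appear."""
    return sum(w for h, w in levels if count_peaks(deriv, h) == target)


one_sharp_step = [0.1, 0.2, 5.0, 0.1, 0.0]
two_steps = [0.1, 5.0, 0.1, 5.0, 0.1]
weak_step = [0.1, 1.3, 0.1]
```

With `target=3` and the level-weight vector of criterion (b), the same function could serve for counting exactly three peaks at $m = 1$.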
For the second criterion, in order to quantify the visibility of exactly three derivative peaks at $m = 1$, we used a peak-detection algorithm as in criterion (a), yet with a slightly different level-weight vector $\{\{0.75, 2\}, \{1.0, 3\}, \{1.5, 3\}, \{2.0, 2\}, \{3.0, 1\}\}$. For the third criterion, the plateau flatness at $m = n$ was compared to that at $m = n+1$. A plateau was counted if the derivative of the plot was below the values $\{\{0.2, 1\}, \{0.4, 2\}, \{0.7, 2\}, \{1.1, 1\}\}$ (again with corresponding weights $w$, denoted $\{h, w\}$). The weighted average counts obtained for $m+1$ were then subtracted from the weighted average counts obtained for $m$.
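One possible reading of the plateau criterion is sketched below. The text leaves open how the counts are formed and over which part of the curve, so this example simply counts, for each level, the derivative samples lying below it and averages those counts with the weights. That interpretation, and every name in the code, is an assumption.

```python
# Level-weight pairs {h, w} quoted in the text for criterion (c).
LEVELS_C = [(0.2, 1), (0.4, 2), (0.7, 2), (1.1, 1)]


def plateau_score(deriv, levels=LEVELS_C):
    """Weighted average, over the levels, of the number of samples below h."""
    total_weight = sum(w for _, w in levels)
    weighted = sum(w * sum(1 for d in deriv if d < h) for h, w in levels)
    return weighted / total_weight


def plateau_contrast(deriv_m, deriv_m_plus_1, levels=LEVELS_C):
    """Score at dimension m minus score at m + 1; large values mean the
    plateau is much flatter at m than at m + 1."""
    return plateau_score(deriv_m, levels) - plateau_score(deriv_m_plus_1, levels)


flat = [0.0] * 10    # idealized plateau
sloped = [0.5] * 10  # plateau region that has started to tilt
contrast = plateau_contrast(flat, sloped)
```

In practice the derivative samples passed in would be restricted to the region prior to the step; how that region is selected is not specified in the text and is left to the caller here.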
For all criteria assessments (a), (b) and (c), we approximated the derivative values as difference quotients between two consecutive data points, for which $\log \epsilon$ was increased in steps of 0.02. The algorithms described above are certainly not the only reasonable way to quantify the proposed criteria. We argue, however, that the algorithms, together with the carefully selected weight vectors, provide a measure that is in accordance with the human eye's perception of peaks and plateaus.
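The difference-quotient approximation can be written down directly. The sketch below assumes that the values of $\log C$ have already been evaluated on an equally spaced grid in $\log \epsilon$; the function names and the grid helper are hypothetical.

```python
def log_eps_grid(log_eps_min, log_eps_max, step=0.02):
    """Equally spaced grid in log(eps) with the step size quoted in the text."""
    n_steps = int(round((log_eps_max - log_eps_min) / step))
    return [log_eps_min + i * step for i in range(n_steps + 1)]


def loglog_derivative(log_c, step=0.02):
    """Difference quotients between consecutive points of the log-log curve."""
    return [(b - a) / step for a, b in zip(log_c, log_c[1:])]


grid = log_eps_grid(0.0, 1.0)
slope_one = loglog_derivative([0.0, 0.02, 0.04])       # straight line, slope 1
sharp_step = loglog_derivative([0.0, 0.0, 1.0, 1.0])   # jump between samples
```

A sharp step between two grid points shows up as a single large derivative value, which is why the height levels of the criteria act as thresholds on this quantity.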
The three measures were normalized by dividing by the maximal observed measure, and a contour plot with suitable contours was drawn. Fig. 3 shows the results obtained. We defined two or three regions of differing visibility for each of the criteria. Not surprisingly, the visibility of exactly $n$ peaks for $m = 1$ (criterion (b)) is best in the case of little noise and little jitter. Nevertheless, the visibility remains good for noise fractions of up to 50 or 60 percent. It is natural, however, that results would be worse in the case of longer patterns or of steps located more closely together. Clearly, the seven steps of a length-7 pattern are more difficult to distinguish, since with increasing jitter the peaks in the derivative may overlap.
Criteria (a) and (c) are what we consider to be the strongest indicators for the occurrence of patterns. The emergence of the "natural" situation $m = n$ (where patterns are completely inserted but no additional terms spoil the characteristic behavior) is most helpful in the case of little jitter but high noise values. In regions of up to 90 percent noise, where all other methods normally fail, the plateau occurring at $m = n$ compared to $m = n+1$ reliably indicates a pattern of length $n$. We tested criterion (c) for a generic pattern of length 5, comparing the dimensions $m = 5$ and $m = 6$ using exactly the same algorithm. Even though there are theoretically two visible steps in this case, the two plateaus quickly merge into a single one. The resulting plot looks very similar, with even a slightly extended range of visibility. We thus suppose criterion (c) to be fairly independent of the underlying pattern length. In regions where criterion (c) fails, i.e., for little noise and high jitter, criterion (a) may serve as an indicator of the pattern length. The visibility of one single step in dimension $m = n$ alone does not, however, prove a pattern length $n$, since patterns of length $\leq n$ may also lead to such a single step. Comparing to the embeddings $m \leq n$, where more steps should occur, helps to exclude these cases. Moreover, high jitter values may merge two steps if these steps are close together. The possible overlap of neighboring steps thus sets the natural limit of the method. Yet this happens only in the case of highly jittered signals or of specific patterns having two distinct distances very close together. In the latter case, a pattern will nonetheless still be indicated, albeit of the wrong length.

Proof of the analytical formula for s(n,m) and t(n,m)
For a proof of (2)-(5), we decompose the graph of componentwise distances $d_{ij} := |a_i - a_j|$ into subgraphs connecting nearest, next-nearest, etc. neighbors, see Fig. 4. The idea underlying the optimized proof with sharper bounds is, as in the old proof in [9], the following: the choice made by the maximum norm is restricted to consecutive $d_{ij}$'s on one distinct subgraph. For $m = 1$, every 'comparison' yields a winner, hence we have $n(n-1)/2$ steps. For larger $m$, the ordering of the $d_{ij}$ on the subgraphs is crucial. When $m = 2$ and $n = 6$, a monotonic ordering $d_{61} > d_{12} > d_{23} > d_{34} > d_{45} > d_{56}$ yields $n-1$ steps; in $m = 3$, $n-2$ steps, and so on. In contrast, if we have a 'regular' distribution of the three biggest distances $d_{ij}$: $d_{12} > d_{34} > d_{56} >$ remaining $d_{ij}$, only 3 steps are contributed when $m = 2$. For odd $n$, each subgraph follows the rules for the monotonic ordering in the case of a maximal number, from which we get $n-(m-1)$ steps, and for a regularly ordered set $\lceil n/m \rceil$ steps. From this, we arrive at $t(n,m) = \frac{n-1}{2} \cdot \lceil n/m \rceil$ and $s(n,m) = \frac{n-1}{2} \cdot (n-(m-1))$. For even $n$, $n/2-1$ subgraphs follow the same rules as above, except for the one with $n/2$ lines, which only contributes one step if $m > n/2$.
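The odd-$n$ bounds derived above are easy to evaluate numerically. The helper below is an illustrative sketch with a hypothetical name. It covers only odd $n$, since the even-$n$ expressions are not spelled out in this section, and it restricts $m$ to $1 \leq m \leq n$ as an assumption.

```python
def step_bounds_odd(n, m):
    """Return (t, s): lower bound and maximal number of steps for odd n.

    t(n, m) = (n - 1)/2 * ceil(n / m)
    s(n, m) = (n - 1)/2 * (n - (m - 1))
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch covers odd pattern lengths only")
    if not 1 <= m <= n:
        raise ValueError("embedding dimension assumed to satisfy 1 <= m <= n")
    subgraphs = (n - 1) // 2
    ceil_n_over_m = -(-n // m)  # integer ceiling of n / m
    return subgraphs * ceil_n_over_m, subgraphs * (n - (m - 1))


bounds_3_1 = step_bounds_odd(3, 1)
bounds_3_3 = step_bounds_odd(3, 3)
bounds_5_2 = step_bounds_odd(5, 2)
```

As a consistency check, both bounds reduce to $n(n-1)/2$ at $m = 1$, in agreement with the statement that every comparison yields a winner in that case, and they coincide again at $m = n$.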

Discussion
To summarize, we emphasize the remarkable performance of the method under very noisy conditions. As general advice (generally true for time series analysis!), we propose not to rely on one single criterion, but to combine all aspects to obtain a coherent picture. The reader may thus derive an overall goodness-of-method measure by adding the measures obtained from the different criteria. This might help to evaluate a priori the applicability of the method to a user's problem. As a guideline for the practical use of the method, we suggest embedding a given time series in spaces of multiple dimensions $m$ up to $m \geq 10$ in order to capture possible pattern lengths of that order. Regarding the computation of the correlation integral, it is important to sample densely enough among randomly selected reference points (e.g., for 10,000 data points, we recommend more than 200 reference points). Equipped with the log-log curves for multiple $m$, a significant plateau flatness difference between two consecutive $m$ (criterion (c)) can serve as a first indicator of the pattern length [9]. Criteria (a) and (b) may be helpful to confirm such a suspicion and to gain additional, metric information about the pattern. Moreover, the slope of the lines in the step-free regions may give interesting insights into the fractal dimension of a possible attractor.

Materials and Methods
All computations were performed in a C++ and Mathematica environment on a custom laptop. The method validation was based on the length-3 pattern $\{3200, 7700, 1000\}$. The correlation integral was evaluated for a total of 9900 embedded points, where 1000 points were used as reference points. For a total of $11 \times 11$ jitter-noise configurations (from 0% to 100% in steps of 10%), we evaluated the three described criteria. A set of levels appropriate for the classification into the 'excellent', 'fair' and 'ambiguous' evaluation regimes resulted in the contour plots shown in Fig. 3.