Short-time fractal analysis of biological autoluminescence

Biological systems manifest continuous weak autoluminescence, which is present even in the absence of external stimuli. Since this autoluminescence arises from internal metabolic and physiological processes, several works suggested that it could carry information in the time series of the detected photon counts. However, there is little experimental work which would show any difference of this signal from random Poisson noise and some works were prone to artifacts due to lacking or improper reference signals. Here we apply rigorous statistical methods and advanced reference signals to test the hypothesis whether time series of autoluminescence from germinating mung beans display any intrinsic correlations. Utilizing the fractional Brownian bridge that employs short samples of time series in the method kernel, we suggest that the detected autoluminescence signal from mung beans is not totally random, but it seems to involve a process with a negative memory. Our results contribute to the development of the rigorous methodology of signal analysis of photonic biosignals.

Widely accepted underlying mechanism which generates biological autoluminescence (BAL) is related to a chemical generation of electron-excited states of biomolecules in the course of oxidative metabolism and oxidative stress [18,24]. While the intensity and optical a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 spectrum properties of BAL as a factor of various influences have been widely investigated [3,[25][26][27][28][29], there is limited knowledge and consensus about statistical properties of BAL.
The object of our current study is the BAL time series from the seeds of mung beans that were measured using a sensitive photomultiplier setup. We decided to test the hypothesis if the BAL signals of mung beans contain any intrinsic correlations. To that end, we recorded and analyzed the time series of the BAL from mung beans. One of the ways to assess correlations in the signal employs chaos-and fractal-based approaches [30]. We focus here on the analysis of the fractal character of time series using fractional processes.
Fractional Brownian motion (fBm) and fractional Gaussian noise (fGn), introduced by Mandelbrot [31], have been intensively investigated over the last few decades. They are both dependent on Hurst [32,33] exponent H 2 (0; 1) that influences their autocovariance structure. The fBm or fGn assumption of finite sample is advantageously used in many fields of research of time series analysis-in network traffic modelling [34,35], financial time series [36,37], or in biomedicine especially for detection of Alzheimer's disease [38] and cardiology [39].
When analyzing real-world data, the measured sample is usually discrete and short. The traditional methods are generally not suitable for short time series analysis. That is the reason why we need to use a precise method that can estimate the Hurst exponent without bias and can determine the confidence intervals of the estimate. The fractional character of data can be measured via fractional Brownian bridge model, which is a discrete process derived from traditional continuous fBm. A lot of time series are short due to their nature or cut by purpose or experimental limitations. Reconsidering some fBm properties that are taken in long time series analysis as granted and customizing them into a short-time, the discrete model allows estimating Hurst exponent of the discrete measured signal. This approach is advantageously used in a recently developed method of fractional Brownian bridge [40].
The article at first analyzes current open questions of statistical properties of biological autoluminescence. In the next section, we then describe the theory of fBm and the method of Hurst exponent estimation as well as other employed methods, whereas the last section contains the results of the analysis of experimental signals compared to computer-generated reference signals.

Rationale for the need of understanding of BAL statistical properties
Multiple authors proposed that statistical properties of BAL time series might contain an information related to the state of biological system [41][42][43]. If the existence of such nontrivial statistical properties was rigorously confirmed, it would make a substantial impact on three major areas of this research field.
At first, the discovery of nontrivial statistical properties of BAL would have an impact on the understanding of the BAL generating mechanisms [18,21]. So far, well-accepted generating mechanism of BAL [18,24] implicitly considers BAL a weak endogenous biological chemiluminescence formed as a by-product of oxidative metabolism and oxidative stress. General chemiluminescence is typically considered to be random, arising from individual uncorrelated photon emitter molecules [44]. If any correlations in the signal were observed, one would start to ask questions what physical, chemical, and biological processes generate such correlations, hence casting the light on BAL generating mechanisms.
At second, nontrivial statistical properties might revive an interest into long-standing intriguing, yet unresolved question: does BAL enable optical communication between cells and organisms [45][46][47][48]? Underlying hypotheses for such biocommunication role of BAL usually expect that BAL carries information which can be processed by a receiver [46]. Such information could be encoded in the intensity and optical spectrum of BAL [49] or in statistical properties of BAL, if they are any different from random light, as claimed by some authors [45].
At third, statistical properties would represent a completely novel fingerprint for application of BAL in biosensing in biotechnology, agriculture, food industry, and medicine beyond the intensity and optical spectra, hence greatly enhancing application potential of BAL analysis.

Approaches for analysis of BAL statistical properties
Quantum optics approach. Historically, the first common approach to analyze the statistical properties of the photon signals is based on quantum optics theorems and employs photocount statistics of detected photonic signal [50]. Using this approach, several authors suggested that BAL manifests quantum optical coherent properties [21] or even interpreted the observed photocount statistics in terms of quantum optical squeezed states [22,51]. We have recently criticized the interpretation of experimental evidence claiming quantum optical and quantum coherence properties of BAL [23].
Fractal-and chaos-based signal analysis approach. We believe that it is more realistic to consider that BAL could manifest complex statistical or correlated behavior due to the nature of underlying chemical reactions [52] instead of a hypothetical biological coherent quantum field as proposed in the earlier approach. For the analysis of such complex statistical or correlated behavior, fractal or chaos-based methods seem to be appropriate. Therefore, more recent efforts in the analysis of BAL statistical properties were focused on the various measures quantifying the complexity and correlations in the time series such as Hurst exponent [53] and multifractal spectra [54].
Several works found correlations or deviations from purely random process with a trivial properties in the BAL signal [42,43,54]. However, in all those cases, either signals of different signal-to-noise ratio [42,43,54] or surrogate (randomly reshuffled time series) [43] were used as reference signals. Comparing BAL signals having different signal-to-noise (signal = net mean intensity of BAL, noise = mean value of detector noise) ratio may lead to results indicating different statistical properties due to a trivial fact: statistical properties of experimentally detected BAL signal are formed by a convolution of detector noise properties with a pure BAL properties. We demonstrated this issue on the example of Fano factor analysis, see figure 4 in [13].
Using surrogate signals might also lead to misleading interpretation in case the signal contains a certain trivial linear trend before random reshuffling-such reshuffling would eradicate any trend. We showed recently that detrending of the BAL signal is not sufficient to remove artifacts since the trend is present not only in the local mean but also in the local variance of the signal (see figures 1b and 4b in [53]). We suggest that the most reliable testing of the hypothesis of nontrivial correlation properties so far can be obtained using reference signals with well-defined properties. To that end, in our recent works, we used computer-generated Poisson signal time series superposed on the experimentally detected detector dark count times series as the control signals with signal-to-noise ratio same as the experimentally detected BAL signals [53]. Such a method for reference signal generation was also recently used in entropy analysis of BAL from model plant Arabidopsis thaliana and helped to correctly interpret findings of different entropy values at different stages of seed germination, see figure  6 in [20].
For the first time, we combine here the advanced approach of computer-generated reference signals [53] and a novel method based on fractional Brownian motion analysis [55] to test if BAL signals from mung beans manifest any correlations.

Experimental
Preparation of samples. Mung bean seeds (Vigna radiata, BIO Mung, CZ-BIO-001) were used as a biological material. Mung seeds were surface-sterilized with 70% ethanol for 1 min. Then, the ethanol was removed, and 50% disinfecting agent (SAVO, CZ) was added. After 10 min, the seeds were washed with distilled water 6 times and soaked for 6 h (shaken every half an hour). After the preparation, the green covers of the seeds were removed. Then, they were germinated in dark condition on large Petri dishes with ultra-pure water.
Luminescence measurement system. We used a measurement system based on cooled (-30˚C) low-noise photomultiplier tube (PMT) R2256-02 (all components of the system from Hamamatsu Photonics Deutschland, DE, unless noted otherwise), see Fig 1. Cooling unit C10372 (Hamamatsu Photonics Deutschland, DE) consisted of a control panel and a housing in which the PMT is placed. External water cooling is used for lower cooling temperature. High voltage power supply PS350 (Stanford Research Systems, USA) was used for powering the PMT. C9744 (converting) unit, consisting of a preamplifier, discriminator and a pulse shaping circuit, transforms photocount pulses coming from the PMT into 5V TTL pulses detected by C8855 unit connected to PC. Discriminator level was set to -500 mV and high voltage PMT supply to -1550 V based on the experimental SNR (signal-to-noise-ratio) optimization procedure performed in [56], see figure 2.13 therein.
The PMT had a dark count of ca. 17.2 s −1 and photocathode diameter 46 mm); see its quantum efficiency in Fig 1. PMT was mounted from the top outer side of the black light-tight chamber (standard black box, Institute of Photonics and Electronics of the Czech Academy of Sciences, Czechia). The distance between the PMT housing input window and the inner side of the bottom of the Petri dish was 3 cm.
Measurement protocol. The second day after the preparation day, 12 similar mung beans were chosen for the study and distributed into a Petri dish (5 cm in diameter), see   fBm obeys for all t, s > 0 Parameter H is called Hurst exponent, for H = 1/2, the fBm becomes Wiener process, which is standard Brownian motion. There are several cases of time series behaviour: • H ! 1 − as strongly dependent and predictable, • H 2 (1/2; 1) as positive long memory process, • H = 1/2 as Wiener-like process, • H 2 (0; 1/2) as negative long memory process, • H ! 0 + as strongly dependent, but hardly predictable.
Discrete fractional Brownian motion of length N 2 N is any discrete process defined for discrete variable k = 0, . . ., N − 1 with zero mean and autocovariance function defined for k, l = 0, . . ., N − 1 and l < N − k as Taking a sample of fractional Brownian motion, it is possible to investigate short samples of time series with fractional character. Finite sample B H (k) of size N + 1 for k = 0, . . ., N of standardized fBm can be used for the construction of fractional Brownian bridge [55] in the following way In the fractal analysis of time series, the fractional processes are often converted to fractional noises utilizing signal difference to simplify their covariance structure together with its spectral properties keeping the desired dependence on Hurst exponent. The differenced fractional Brownian bridge (dfBB) [55] is defined as for k = 0, . . ., N − 1.
Theory of dfBB. The dfBB is a discrete process and it is proven that the process has zero expected value and its variance is independent on the time lag and equals The autocovariance of dfBB can be expressed as for m = 0, 1, . . .N − 1 where The corresponding autocorrelation function is again independent on the time lag and can be expressed as for m = 0, . . ., N − 1. The autocorrelation function of dfBB for selected H and N = 21 is depicted in Fig 2. This function maps non-negative integer values less than N to the autocorrelation coefficients that will be the key values for subsequent Hurst exponent estimation. The estimation of Hurst exponent will be based on the correlation function (8). This correlation function is valid only for discrete processes that originated as sampling continuous fBm. In our work, we assume that the investigated signals have the fBm property with unknown Hurst exponent. The advantage of using dfBB is the de-trending of the input signal, which is important in the real experiment outcome analysis.
Hurst exponent estimation. The estimation of Hurst exponent is based on the fitting of the autocorrelation function. For an input discrete signal that has the fBm properties, the dfBB according to formulas (3), (4) is created. If the original signal has length N + 1, the respective dfBB has length N having elements x 0 , x 1 , . . ., x N−1 . The estimation of n-th autocovariance coefficientr n can be expressed for n = 0, . . ., N − 1 aŝ in the case of unbiased estimation. Alternative biased estimate is based on formulâ The results using (9) and (10) were proven to be comparable, therefore we used the Eq (9) for the following calculations. Denote the theoretical value of autocorrelation from Eq (8) as ρ n = ρ n (H) and the experimentally calculated autocorrelation asr n . Then we obtain the estimation of parameter H by means of solving the minimization problem where M is the number of signal segments. The point estimate ofĤ was obtained by the maximum likelihood method [57] together with its standard deviationŝ as recommended in [55].

Likelihood ratio test
Having signal from the mung beans photon emission as well as the reference signal, we will use likelihood ratio test [58] to decide, whether the Hurst exponent of both samples is significantly different. We denote H D as the Hurst exponent estimate of the PMT detector noise or reference signal and H B as the Hurst exponent estimate of mung emission using the formula (12). The overall error (sum of the squares of residuals) is defined as where ρ D , ρ B are the autocorrelation coefficient of the noise and photon emission, respectively. The case of j = 0 is excluded due to r D i;0 ¼ r 0 ðH D Þ ¼ 1 for all i = 1, .., M. Using sub-model satisfying H B = H D we get Using likelihood ratio (LR) test of significant difference between the sub-model and the full model, we calculate where L FULL and L SUB are corresponding likelihoods. When the hypothesis H 0 : H D = H B holds, i.e. the full model has the same validity as the submodel, the criterion has w 2 1 distribution due to single parameter constrain.

Reference signal generation.
Recently we demonstrated that a suitable reference signal is crucial to understand and interpret the findings from various BAL signal analysis [20,53]. Detector noise itself is not a suitable reference signal since it contains intrinsic technogenic correlations itself [53] and using signals of other samples with different signal-to-detector noise ratio can also lead to misleading results as we explained in section "Approaches for analysis of BAL statistical properties". Hence for this work we follow our method [53], and generated the reference signal as a sum of measured detector noise and computer-generated Poisson signal (using Matlab 1 2017 poissrnd command) with given λ in every experimental point where y B k and y D k are signal mean values of mung beans and noise, respectively. The respective values of λ in case of 200 μs signal as well as 500 μs signal are calculated in Table 1. Hence, experimentally detected BAL signals from mung beans and reference signals have practically the same mean value and same signal-to-noise ratio.
To sum up, for the analysis, we have three types of signals available: • (B)-mung beans signal y B k , • (D)-noise signal of PMT detector y D k , • (R)-reference signal as a sum of measured detector noise (D) and computer-generated Poisson noise denoted as y R k

Measurement
The investigated sample of germinating mung beans is displayed in the Fig 1. An overview of all signals collected and employed in this paper is in Table 2, where NS denotes the number of available signals. There were two bin size settings used to collect the signals: T s = 200 and 500 μs. For each sampling period, we have corresponding mung bean signals, detector noise signals, and computer-generated reference signals.
Both mung beans signal and PMT detector noise signal are assumed to be stationary with their mean values with the Poisson distribution. Therefore, they can be represented by their mean values E y B k and E y D k that are estimated from the measured data. As previously mentioned, the aim of study is to compare mung beans signal with the reference signal and find statistical difference between them using their autocorrelation. With each of these two signals independently, we performed basic data processing. This procedure describes the normalization of the data, which is the essential property of fBm processes. At first the input time series y k for k = 0, 1, . . ., Q − 1 was cumulatively summed for a window size h 2 N and Anscombe transformation [59] was performed. The resulting signal z k can be expressed based on the output from measuring device y k as for k = 0, . . ., M − 1. This transformations assures stationarity by terms of variance and guarantees Gaussian distribution of the resulting signal.

Hurst exponent estimates
There is no prior knowledge of optimal model length, accumulation compression, and Hurst exponent. Therefore, we will apply the maximum likelihood method of Hurst exponent estimation for the various model and segment lengths, and then we will individually test the differences in the Hurst exponent. However, there is a finite number of reasonable pairs (model length N, segment length h), which will cause the phenomenon of the multiple hypothesis testing. After the False Discovery Rate (FDR) correction, we will localize the model and segment lengths, which cause significant differences in the Hurst exponent. These pairs (h, N) will be declared as significantly sensitive to the signal differences in the Hurst exponent.
Having signals with two different bin sizes, we will use the signal bin size T b = 200 μs as a training set and the signal with T b = 500 μs as a verification set. Normalized mung beans and reference signals with bin size T b = 200 μs and length Q = 500 000 were the subject of the initial analysis. The signal accumulation of size h was applied to the signals, therefore the number of bins was bQ/hc. After the accumulation, the signal is divided into segments of length N. Due to the memory of fBm process, we will use only the odd segments for the calculation of autocorrelation function and the even segments are excluded. The new signal has length dbbQ/hc/Nc/2e. Using Eq (12) and maximum likelihood method, we obtain the corresponding H D and H B estimates for the Hurst exponent of referential signal and mung beans, respectively. Based on these estimates, we can derive the p-values of LR test using (15) statistics.
In our case, we performed altogether 11 × 11 = 121 tests for h = 1500, 1550, . . ., 2000 and N = 20, 21, . . ., 30. Accumulation h could not be higher than 2000 due to the rapid decrease of the number of processed segments. The values h < 1500 caused lower event frequencies and the conversion from Poisson noise to Gaussian noise is not guaranteed. Similar reasons are for the range of parameter N. In fact, the fractional model is less discriminative for N < 20 and the case N > 30 reduces the number of segments. Due to multiple testing and obeying the Hochberg-Benjamini principle, we diminish the significance level from 0.05 to α FDR = 0.000050. The p-values as decadic logarithms are shown in Table 3.
In these settings, there were two cases where the Hurst exponent was significantly different. The results from these two cases are displayed in The lowest p-value was obtained in the case of (h, N) = (1750, 24), which represents the segmentation into bins with duration 1750 × 200 μs = 350 000 μs = 0.35 sec.
As the verification set, the signal with T s = 500 μs was taken into account, following the same procedure as the previous one. The accumulation parameter h was accordingly diminished to 2/5 of its previous value to guarantee the same segment length.
We perform the verification for the combination of signals (B) and (R) similarly as in the previous case and additionally for the combination of (B) and (D). The first set of signals ((B) and (R)) will be used to test if the photon emission is not random and has a negative memory, while the results from the second set ((B) and (D)) of signals will be used to test if there is a significant difference between the cases, when the PMT detects BAL signals from mung beans compared to PMT noise. We use the significant cases from Table 4 to estimate their Hurst exponent and the results on verification set is displayed in Table 5. The variables s 1 , s 2 denote the pair of signals, whereas the H X denotes the estimation of Hurst exponent of the signal s 2 .
We performed four tests, and according to Hochberg-Benjamini false discovery rate, we diminish the α FDR = 0.0169. Therefore, all four cases are considered significant, and we reject the hypothesis that the Hurst exponent of mung beans would be the same as H X .
For comparison, we also performed a similar analysis with noise signal (D) and mungo beans signal (B) and captured the results in the Table 6. Using Hochberg-Benjamini principle, there is only one combination (h,N) = (1850,27) that is significant.
To demonstrate the efficiency of proposed method for short time series, we applied the method of power spectral analysis (PSD) [60,61] for Hurst exponent estimation. The power spectrum P(f) holds following relationship for any frequency f To assure the both PSD and differenced fractional Brownian bridge (DFBB) methods are unbiased, we generated artificial signal sample of length 300 using circular embedding method [62], which is an exact method for fractional Brownian motion generation. The results of this testing are captured in Table 8. While H denotes the Hurst exponent of artificially generated

Discussion
Results from rigorous statistical analysis and testing in Tables 3, 4, and 5 suggest that the mung beans signal has a negative memory (negative correlations, antipersistent behavior [63]) and its Hurst exponent is lower than the referential signal. How could such behavior originate in biological systems? It was proposed that the restriction of Brownian motion due to the structuring of nano-to microscale intracellular environment leads to anomalous subdiffusion [64] characterized by Hurst exponent < 0.5 [63]. This is understandable since a cytoplasm environment displays fractal spatial structuring [65]. Since biochemical reactions (encounters of reactants) leading to BAL are taking place within the cell cytoplasm, organelles and lipid membranes [24] where anomalous sub-diffusion was observed [64,66], it is not a great logical leap to speculate that BAL from mung bean samples could also display sub-diffusive features. Actually, it is already acknowledged that chemical reactions spatially constrained on the microscopic level may lead to fractal reaction kinetics [67][68][69] also in case of intracellular biochemical kinetics [70]. The 0.35 s as the time scale where we found statistically significant differences of mung bean signal Hurst exponent from that of the reference signal (Table 3) could correspond to a rate of underlying rate-limiting step of chemical reactions or processes which give rise to BAL. However, one has to be careful in the interpretation since there are many pitfalls in an accurate estimation of the Hurst exponent value from experiments [71,72]. Although unlikely, given the nature of our experiments we can not fully exclude that the correlations we observe in mung signals are introduced by the photodetector (PMT) due to the nature of photocounting process [73,74]. Introduction of anti-/correlations could be at the physical level of the PMT tube (after-pulsing, a temporary drop of the voltage at dynodes after ejecting electrons, . . .) or the follow-up circuitry (amplifiers). Anti-correlations of the detected counts depending on the count rate have been actually observed due to a PMT construction [74, Fig.9]. However, marked anti-correlations were present only for very high count rates (> kHz) and very low quantum efficiency, which is not the case in our experiments. We also believe that the dead-time of a PMT [75] is not affecting the value of correlations we observe since the PMT dead-time is on the time scale of few hundreds of nanoseconds-several orders of magnitude smaller than the time scale of correlations we observed (0.35 s) and three orders of magnitude smaller than our bin size (200 and 500 μs). Throughout the analysis, the lower limit for accumulation parameter h was chosen as 1500 to assure the normality of the processed data due to the sparsity of the input signal. Higher accumulation than 2000 is not useful since then we would lose the precision of estimate due to the short length of investigated time series. The minimal length of signal segment N was chosen to assure consistency of the used model, segment lengths of N > 30 do not significantly contribute to the higher precision of estimate [40].

Conclusion
In this work, we focused on statistical properties of biological autoluminescence from germinating mung bean sample. Our emphasis was on the development of a rigorous mathematical and statistical methodology, which takes into account proper reference signals, likelihood ratio test, and multiple hypothesis testing effects.
We used a highly sensitive photomultiplier-based detection system to record a time series of photon counts of the mung bean sample emission and noise of the detector. Using the normalization of the input signal, we were able to employ the fractional models that allowed us to estimate Hurst exponent. Dividing the input signals into the training set and evaluating the differences in the Hurst exponent of both signals, the procedure allowed us to test our initial hypothesis on the verification signal. The resulting Hurst exponent mean value of mung bean sample time series is below the level of 1/2 which confirmed our initial hypothesis, that the biological autoluminescence displays correlations. We also proposed that this value could be related to anomalous sub-diffusive features of biochemical reactions underlying processes within mung beans, which give rise to photon emission time series. Further extensive work beyond the scope of this methodical paper needs to be carried out to test the biological ubiquity of anti-/correlations in biological autoluminescence signals and the role of the detector in the observed Hurst exponent values. Especially interesting would be an analysis of BAL statistical properties across samples with rising complexity starting from simple chemical solutions of small biomolecules through isolated cellular structures and cell suspensions up to whole tissues and organisms. Nevertheless, we believe that rigorous methodology we presented here will help to support the future research of BAL statistical properties towards a deeper understanding of BAL mechanisms as well as applications for label-free and non-invasive analysis in medicine and biotechnology using completely new signal fingerprint types.
Supporting information S1 Data. All raw data files used in our analysis. (ZIP)