Wavelet analysis of oximetry recordings to assist in the automated detection of moderate-to-severe pediatric sleep apnea-hypopnea syndrome

Background The gold standard for pediatric sleep apnea hypopnea syndrome (SAHS) is overnight polysomnography, which has several limitations. Thus, simplified diagnosis techniques become necessary. Objective The aim of this study is twofold: (i) to analyze the blood oxygen saturation (SpO2) signal from nocturnal oximetry by means of features from the wavelet transform in order to characterize pediatric SAHS; (ii) to evaluate the usefulness of the extracted features to assist in the detection of pediatric SAHS. Methods 981 SpO2 signals from children ranging 2–13 years of age were used. Discrete wavelet transform (DWT) was employed due to its suitability to deal with non-stationary signals as well as the ability to analyze the SAHS-related low frequency components of the SpO2 signal with high resolution. In addition, 3% oxygen desaturation index (ODI3), statistical moments and power spectral density (PSD) features were computed. Fast correlation-based filter was applied to select a feature subset. This subset fed three classifiers (logistic regression, support vector machines (SVM), and multilayer perceptron) trained to determine the presence of moderate-to-severe pediatric SAHS (apnea-hypopnea index cutoff ≥ 5 events per hour). Results The wavelet entropy and features computed in the D9 detail level of the DWT reached significant differences associated with the presence of SAHS. All the proposed classifiers fed with a selected feature subset composed of ODI3, statistical moments, PSD, and DWT features outperformed every single feature. SVM reached the highest performance. It achieved 84.0% accuracy (71.9% sensitivity, 91.1% specificity), outperforming state-of-the-art studies in the detection of moderate-to-severe SAHS using the SpO2 signal alone. Conclusion Wavelet analysis could be a reliable tool to analyze the oximetry signal in order to assist in the automated detection of moderate-to-severe pediatric SAHS. Hence, pediatric subjects suffering from moderate-to-severe SAHS could benefit from an accurate simplified screening test only using the SpO2 signal.


Introduction
The American Academy of Pediatrics (AAP) defines pediatric sleep apnea-hypopnea syndrome (SAHS) as a breathing disorder characterized by recurrent episodes of complete cessation (apnea) and/or significant reduction (hypopnea) of airflow during sleep [1]. SAHS is a highly prevalent condition among children (in the range of 1% to 5%) that may lead to many adverse consequences on the overall health and quality of life, such as cognitive deficits, behavioral abnormalities, sleepiness, systemic inflammation, and cardiac and metabolic derangements [1].
The gold standard technique for pediatric SAHS diagnosis is overnight polysomnography (PSG). It involves recording a wide range of biomedical signals in a specialized sleep laboratory [2,3]. These recordings are used to score apneas and hypopneas in order to compute the apnea-hypopnea index (AHI), defined as the number of apneas and hypopneas per hour (e/h) of sleep. AHI is the clinical variable used to establish SAHS. The diagnosis of moderate-tosevere pediatric SAHS is confirmed when they present an AHI �5 e/h, irrespective of other co-morbidities [1]. These children are at increased risk of suffering from the major negative consequences of the disease [3][4][5]. Thus, to expedite the diagnosis and treatment is essential in these patients. In this sense, surgical treatment with adenotonsillectomy is consistently recommended for children suffering from SAHS with an AHI �5 e/h [6]. This treatment leads to an improvement in the condition in the majority of pediatric patients who suffer from moderateto-severe childhood SAHS [1]. However, in spite of the PSG serving as the current recommended diagnostic gold standard, it is costly and complex due to the necessary equipment and trained staff, as well as highly intrusive due to the use of multiple sensors. In addition, it is a time-demanding method that shows limited availability and absent scalability, thereby delaying the diagnosis and treatment of SAHS patients [7,8].
These drawbacks have led to extensive exploration of the use of simplified diagnostic techniques [9,10]. One common approach is the analysis of a reduced set of cardiorespiratory signals involved in PSG. In this regard, overnight oximetry is a common alternative due to its reliability, simplicity, and suitability for children [7,11]. Nocturnal oximetry records the blood oxygen saturation (SpO 2 ) signal, which provides a numerical measure of the oxygen content in hemoglobin [12]. Apneic events result in decreases in blood oxygen levels and such events are termed oxyhemoglobin desaturations [12]. Hence, the SpO 2 signal contains useful information to detect pediatric SAHS. Previous studies have shown the usefulness of automated analysis of the SpO 2 signal from nocturnal oximetry to assist in the screening of moderate-to-severe pediatric SAHS [13][14][15][16][17][18][19]. However, the results obtained in these studies indicate that an accurate diagnosis of pediatric SAHS is difficult, and in fact, substantially more difficult than in adults, particularly because the frequency of apneic events and reductions in SpO 2 is markedly lower in children. Thus, further scientific evidence is still necessary before the diagnostic ability of the SpO 2 signal can be widely implemented as a pragmatic tool to assist in an automated detection of childhood SAHS. Different signal processing techniques have already been applied to characterize the changes produced in the SpO 2 signal as elicited by apneic events. Conventional oximetry indices, statistical measures, nonlinear parameters, and spectral analysis from the SpO 2 recordings have all been evaluated [13][14][15][16][17][18][19]. Among these approaches, the use of spectral analysis is a common choice due to the recurrence of apneic events. In this sense, previous studies have assessed features extracted from the power spectral density (PSD) and bispectrum [17][18][19]. However, these methods are based on the Short-Time Fourier Transform (STFT), thus having a fixed time-frequency resolution [20]. In contrast, wavelet transform (WT) offers high frequency resolution at low frequencies as well as high time resolution at high frequencies [20,21]. This property makes WT a potentially more suitable technique to accurately detect low frequency components, such as those associated with the duration of SpO 2 desaturations. Additionally, WT is also suitable to analyze non-stationarities like those occurring in the SpO 2 signal by apnea-hypopnea events. In this sense, wavelet analysis has proven its usefulness to detect changes produced in biomedical signals by apneic events among adult SAHS patients [22][23][24][25][26][27]. Nevertheless, only two single preliminary studies by our group evaluated the usefulness of the wavelet analysis in the detection of pediatric SAHS using the SpO 2 signal [28,29]. Therefore, additional research is clearly needed to further corroborate previous findings in a small cohort and to assess the usefulness of wavelet analysis of SpO 2 in the diagnosis of pediatric SAHS. Thus, we propose to develop a more exhaustive wavelet analysis with a larger database of 981 overnight SpO 2 recordings.
We hypothesized that the multiresolution analysis afforded by the WT could provide a set of useful features to precisely characterize changes occurring in the SpO 2 signal associated with pediatric SAHS. Consequently, the aim of this study was twofold: (i) to analyze oximetry dynamics by means of WT-derived features in order to characterize differences associated with the presence of SAHS; and (ii) to assess the usefulness of these features to assist in an automated detection of moderate-to-severe pediatric SAHS.

Subjects and signals under study
The database is composed of 981 pediatric subjects (602 males and 379 females) ranging from 2 to 10 years of age. All children were referred to the Pediatric Sleep Unit at the University of Chicago Medicine-Comer Children's Hospital (Chicago, IL, USA) in the context of clinical suspicion of SAHS. All legal caretakers of the children gave their informed consent as a prerequisite to be part of the study and the Ethics Committee of the hospital approved the protocols (#11-0268-AM017, # 09-115-B-AM031, and # IRB14-1241).
Children's sleep was monitored using a digital polysomnography system (Nihon Kohden America Inc., CA, USA). SpO 2 recordings were acquired during overnight polysomnography at sampling rates of 25, 200, or 500 Hz. In a preprocessing stage, artifacts were removed by discarding those SpO 2 values below 50% and those intervals with a slope higher than 4%/s [30]. Then, SpO 2 recordings were resampled to a common rate of 25 Hz, as recommended by the American Academy of Sleep Medicine (AASM) [12], and were rounded to the second decimal place in order to have the same resolution [31]. The guidelines of the AASM were used by a certified pediatric sleep specialist to quantify sleep and cardiorespiratory events. The AHI was subsequently derived in order to diagnose pediatric SAHS. An AHI of 5 e/h was the threshold used to establish moderate-to-severe SAHS because of the enhanced risk of morbidity and thus the importance of an early detection and treatment in these cases. According to this AHIbased cutoff, 405 children were in the group AHI �5 e/h, whereas 576 children were in the group AHI <5 e/h.
The dataset was randomly divided into an optimization set (60%) and a cross-validation set (40%) [19]. Table 1 shows demographic and clinical data of the population under study (median [interquartile range] or n (%)). No statistically significant differences (p-value < .01) emerged in either age or body mass index (BMI) between optimization and cross-validation groups.

Methods
Our methodology is divided into three steps: feature extraction, selection, and classification. In the first step, the wavelet transform was applied to analyze each SpO 2 signal. A set of features was computed using the discrete wavelet transform (DWT) to characterize the changes produced in SpO 2 recordings due to SAHS. In addition, 3% oxygen desaturation index (ODI3), statistical moments in the time domain and PSD features, which are common features from the SpO 2 signal [17,19], were obtained to compose a wide initial feature set with relevant as well as complementary information. In the second step, a feature subset was selected using the fast correlation-based filter (FCBF) method [32]. Finally, binary logistic regression (LR) [33], support vector machines (SVM) [34] and multi-layer perceptron (MLP) neural network [35] classifiers were trained using this selected feature subset in order to detect moderate-to-severe pediatric SAHS. Fig 1 shows the validation approach employed in each methodological step. The first set (optimization set) was employed to perform descriptive analysis of the extracted features, select a subset of features with FCBF, and select the optimal design parameters of the SVM and MLP classifiers. Bootstrapping has been employed in the feature selection stage, in order to avoid overfitting [36]. In the same way, 10-fold stratified cross validation has been applied to optimize the design parameters of SVM and MLP. The second set (cross-validation set) was used to evaluate the diagnostic performance of the single features and classifiers. Stratified Kfold cross validation (K = 5) was applied for this purpose [37].
Feature extraction. Discrete Wavelet Transform WT can be seen as the decomposition of a signal x(t) onto a set of basis functions, called wavelets [20]. Wavelets are obtained by time translations and scaling of a unique function called the mother wavelet. WT can be seen as an extension of the Fourier transform where, instead of analyzing a single scale, a multiscale analysis is performed. This multiscale property of the WT allows decomposing a signal into a set of scales, where each scale analyzes a different frequency range of the signal. WT can be continuous (Continuous Wavelet Transform, CWT) or discrete (DWT), depending on the scale and translation values [20]. CWT computes WT for each scale, whereas DWT only computes WT for dyadic (power of 2) scales, thus presenting lower complexity and higher computational efficiency than CWT [38]. Consequently, DWT was chosen in this study. In addition, it has previously shown its usefulness to detect different frequency components in physiological signals associated to SAHS events in adult patients [22][23][24][25][26][27]. Fig 2 shows how DWT is computed. In Fig 2A, the decomposition process of a SpO 2 signal x[n] using DWT, the so-called subband coding scheme, is illustrated. It is a filter-bank tree where each stage consists of a high pass-filter g[n] (the mother wavelet) and a low pass filter h [n] (the mirror version of the mother wavelet), followed by a subsampling process of factor two [20]. The relationship between these two filters is as follows [20]: where L, an even number, is the length of the filter. First, x[n] is decomposed in an approximation signal (lowpass version), A 1 , and a detail signal (highpass version), D 1 . Then, A 1 is further decomposed into another approximation signal, A 2 , and another detail signal, D 2 . Each iteration increases the frequency resolution of the approximation and the detail version by two, as well as decreases the number of samples of both approximation and detail signals. This process continues until the maximum detail level of the signal, N = log2(M) is reached, being M the where A i-1 is the approximation signal in the level i-1. In the level 1, A 0 is the original signal x DWT was applied to the upper power of 2 for 5 minute segments (M = 2 13 samples (5.5 minutes)) and, consequently, N = 13 [23]. In this study, the Haar wavelet was chosen as mother wavelet. The reason is twofold [27]: (i) its suitability for picking up abrupt changes, which is appropriate to detect the changes produced in the SpO 2 values due to apneic events; and (ii) its smoothing feature, which does not distort the original form of the SpO 2 signal. At each level of the decomposition, detail coefficients contain information about a different frequency band, as stated in Fig 2B. We focused on the detail coefficients of the level 9 (D 9 , i.e., 0.0244-0.0488 Hz), since it is the level which is contained in the band of interest previously related to the recurrence of apneic events [18]. SpO 2 signal presents both drops and rises associated to apneic events, which result in decreased and increased values in D 9 coefficients, respectively. The information contained in the D 9 coefficients may be canceled due to the presence of both positive and negative values, such as mean or skewness. To avoid this, the absolute values of the DWT coefficients were used. The following seven features were extracted from the DWT coefficients: • Statistical moments of the D 9 coefficients (Mean (M1 D9 ), variance (M2 D9 ), skewness (M3 D9 ) and kurtosis (M4 D9 )). M1 D9 -M4 D9 measure the central tendency, dispersion, asymmetry and peakedness of the data, respectively.
• Maximum amplitude of the D 9 coefficients (Max D9 ). It quantifies the highest amplitude in this frequency band.
• Energy of the D 9 coefficients (En D9 ). It measures the averaged quadratic amplitude of the signal in D 9. It is computed as follows: • Wavelet Entropy (WE), which measures the irregularity introduced in the DWT. It was extracted in order to obtain information about the changes produced in the energy distribution of the different detail levels of the DWT of the SpO 2 signal by apneic events [39]. It is computed as follows: where p i is the relative wavelet energy at the detail level D i : Where En Di is the wavelet energy at the detail level D i :

Conventional features from the SpO 2 signal
In order to enhance the diagnostic ability of our proposal, the following features, that are common parameters of the oximetry signal [17,19], were computed: • ODI3. It was estimated as the number of desaturations of at least 3% from preceding baseline per hour of recording [40]. This parameter has shown its usefulness in clinical studies, even though it underestimates AHI [13][14][15].
• PSD features. PSD was estimated using the Welch's method (2 13 -sample Hamming window, 50% overlap and 2 14 -points DFT) [41]. The following features were obtained: first-to-fourth order statistical moments (M1 PSD -M4 PSD ) and maximum amplitude (Max PSD ) from the band of interest determined in [18] (0.018-0.050 Hz) and spectral entropy (SE PSD ) in the full spectrum. These features provide information about the recurrence and duration of apneic events.
Feature selection: Fast Correlation-Based Filter (FCBF). The FCBF method was applied to select a subset of relevant and non-redundant features [32]. FCBF is a feature selection algorithm that has previously shown its usefulness in the context of pediatric SAHS [18,19]. First, FCBF computes the symmetrical uncertainty (SU) between each feature (x i ) and the AHI (y). SU is a normalization of the information gain between two variables. SU is computed as follows [32]: where IG(x i |y) = H(x i )-H(x i |y), N is the total number of features extracted and H refers to Shannon's entropy [32]. According to their SU value (between 0 and 1), features are ranked from the most relevant (highest SU with the AHI) to the least relevant one (lowest SU with the AHI). Then, a redundancy analysis is performed. SU between each pair of features (x j , x i ) is computed. Features x j sharing more information with a more relevant one than with the AHI (SU (x j | x i ) � SU(x j |y)) were discarded. Finally, an optimum subset composed of the features not discarded in this process is obtained. A bootstrap approach was employed in order to obtain a subset of features independent of a particular dataset. In this regard, FCBF was applied to 1000 bootstrap replicates built with a sample with replacement procedure from the optimization set [42,43]. Those variables that were selected with FCBF more than 500 times (50% of runs) formed the feature subset [18,19].
Feature classification. In this study, we employed LR, SVM, and MLP, which are wellknown algorithms in the context of binary classification. Particularly, these algorithms were applied to assign each subject to the groups AHI <5 e/h and AHI �5 e/h [33][34][35].
Logistic regression LR is a standard machine learning approach for binary classification. Given a set of input features, LR estimates the posterior probability of a given instance (subject) belonging to one of two mutually exclusive groups (AHI <5 e/h and AHI �5 e/h) by the use of the logistic function [33]: where C l represents the two groups (AHI <5 e/h and AHI �5 e/h), β = β 0 , β 1 , . . ., β N are the coefficients of the model for each input feature, x k = x 1,k , . . ., x N,k , is the input pattern for the instance k, and N is the number of features. A Bernoulli distribution is used to model the probability density function and β coefficients are optimized using the maximum likelihood ratio [33].

Support vector machines
A SVM is a binary classifier that searches for the best hyperplane that separates instances from the classes under study [34]. The hyperplane has the following expression [34]: where x � R N is the input pattern of dimension N (number of features), ɸ(x) � R P transforms the data into a high-dimensional space P>N, and w is the weight vector. The weight vector w is optimized in order to maximize the margin of separation between the two groups [34]. A regularization parameter C was applied to control the trade-off between maximizing the margin of separation between groups and obtaining a good generalization ability in an independent set [34]. The optimization problem of SVM is formulated using Lagrange multipliers: where S is a subset of the indices {1, . . ., L} corresponding to the non-zero Lagrange multipliers (support vectors) η i , L is the number of observations in the training set, t i are the output labels (±1 for the AHI �5 and AHI <5 e/h groups), and K(�,�) is the kernel function in the transformed space. In this study, a linear kernel was used, which has previously shown its usefulness in the context of adult SAHS [44]. The value of C was optimized by means of 10-fold cross-validation using the optimization set.

Multi-layer perceptron neural network
A MLP is an artificial neural network arranged in several fully connected layers: input, hidden, and output layers [35]. These layers are composed of computing units called perceptrons or neurons. Each neuron consists of an activation function g k {.} and adaptive weights w kj that interconnect the neuron with neurons from the subsequent layer [35]. The input layer was composed of one neuron for each input feature. Additionally, a configuration with one single hidden layer with a hyperbolic tangent activation function was applied since it provides a fast convergence for the training algorithm [35]. This configuration can provide universal approximation to any continuous function with the only condition that there are enough hidden units [35,45]. Finally, two neurons composed the output layer, since our problem is a binary classification task. A logistic sigmoid activation function has been used in the output layer, because it allows the output neurons to be interpreted probabilistically [35]: where y k are the outputs neurons, w kj are the weights connecting the hidden layer to the output layer, w ji are the weights connecting the input layer to the hidden layer, b j and b k are the bias associated to the hidden and the output units, respectively, x i is the feature i, g k {.} and g j {.} are the activation functions of the output and hidden layer, respectively, N H is the number of neurons in the hidden layer, and N is the number of input features [35]. Random initialization was performed for the weights of the network. Then, the scaled conjugate gradient algorithm with weight-decay regularization was used to optimize the weights [35]. N H and the regularization parameter (α) were optimized by means of 10-fold cross-validation using the optimization set. Statistical analysis. The software tools Matlab version R2017a was used for performing signal processing and statistical analyses. Normality and homoscedasticity tests showed that extracted parameters were not normality distributed and had different variances. Consequently, the Mann-Whitney U test was applied to search for statistical significant differences in the extracted features (p-value <0.01) between groups. Diagnostic performance was assessed by means of sensitivity (Se, percentage of patients with an AHI �5 e/h correctly classified), specificity (Sp, percentage of children with an AHI <5 e/h correctly classified), positive predictive value (PPV, proportion of subjects classified as positive that are true positives), negative predictive value (NPV, proportion of subjects classified as negative that are true negatives), positive likelihood ratio (LR+, likelihood ratio for subjects classified as positive), negative likelihood ratio (LR-, likelihood ratio for subjects classified as negative), and accuracy (Acc, percentage of subjects correctly classified).
K-fold stratified cross validation (K = 5) was applied to assess the performance of the extracted features and the binary classifiers [37]. The cross-validation set was randomly divided into K subsets, preserving the proportion of subjects belonging to the groups AHI <5 e/h and AHI �5 e/h. K-1 folds formed the training folds (80% of the cross-validation set), whereas the remaining one formed the test fold (20% of the cross-validation set). Accordingly, Receiver Operating Characteristics (ROC) curves were used to obtain optimum classification cut-off points for the single features using the K-1 training folds. Similarly, the classification algorithms were trained using the training folds. Then, the diagnostic performance of the single features and the LR, SVM, and MLP classifiers was measured using the test fold. This process was repeated K times, so each fold was considered once as the test fold. Finally, all the metrics are averaged across the K = 5 iterations.

Feature separability
A total of seven DWT-derived features were obtained for each SpO 2 recording (S1 Table). Fig  3 shows the histogram of the D 9 coefficients in the optimization set for the groups AHI <5 e/h and AHI �5 e/h. It can be observed that D 9 coefficients are more concentrated near zero in the AHI <5 e/h group, whereas in the group AHI �5 e/h these coefficients are more disperse. Table 2 shows the median and interquartile range of all these extracted features in the optimization set for both groups. All features showed significant statistical differences (p-value <0.01) between groups. M1 D9 , M2 D9 , Max D9 , En D9 , and WE showed higher values in the AHI �5 e/h group, whereas M3 D9 and M4 D9 showed higher values in the AHI <5 e/h group. ODI3, statistical moments and PSD features were also computed for each SpO 2 recording (S1 Table). ODI3, 3 out of 4 statistical moments (M1 T , M2 T , and M3 T ) and 3 out of 6 spectral features (M1 PSD , M2 PSD , and Max PSD ) also showed significant statistical differences (p-value <0.01), which agrees with previous studies [17,18]. Optimum feature subset FCBF was applied to each bootstrap replicate from the optimization set, each one composed of all the extracted features (ODI3, statistical moments, PSD, and DWT features). ODI3, 1 statistical moment (M2 T ), 3 features from PSD (M2 PSD , M3 PSD , and Max PSD ), and 3 DWT-derived features (M3 D9 , En D9 and WE) were selected more than 50% of times (500) (S2 Table). Thus, these features formed the selected feature subset [18,19]. Notice that features from all the different methodological approaches were selected.

Classification models optimization
LR, SVM, and MLP classifiers were designed using the selected feature subset obtained with FCBF (ODI3, M2 T , M2 PSD , M3 PSD , and Max PSD , M3 D9 , En D9 , and WE). Optimum values for the design parameters of the SVM (regularization parameter: C) and MLP classifiers (number of neurons in the hidden layer: N H ; regularization parameter: α) were obtained as those for which the Acc of the classifiers was the highest in the optimization set. Concerning SVM, the following values of C were assessed: 10 −5 , 10 −4 , 10 −3 , . . ., 10 4 , 10 5 . The optimum value of the input parameter C was 10 3 , which maximizes Acc. Regarding MLP, N H was varied from 2 up to 50 and α was varied from 0 up to 10. Since the network depends on the initial random values of the weights, the accuracy was computed and averaged for a total of 10 runs for each pair N H -α. Finally, user-dependent network parameters N H = 5 and α = 1 were chosen since this pair reached the highest accuracy.

Diagnostic performance
The value of all the extracted features (ODI3, statistical moments, PSD, and DWT features) and the classification score of the LR, SVM, and MLP classifiers were obtained for each subject in the cross-validation set (S3 Table). Table 3 shows the diagnostic ability of each single feature in the cross-validation set obtained using optimum cut-off point obtained from the ROC curve. Most of the DWT-derived features (5 out of 7) showed accuracies near 80%. In this regard, Max D9 achieved the highest performance (81.7±5.6% Acc, with 75.4±7.1% Se and 85.4 ±6.8% Sp), outperforming statistical moments and PSD features. Only ODI3 achieved slightly higher Acc than Max D9 , reaching 81.9±7.2% Acc (78.1±7.3% Se and 84.2±8.1% Sp). Table 4 shows the diagnostic performance of LR, SVM, and MLP classifiers, which were trained using the selected feature subset (ODI3, M2 T , M2 PSD , M3 PSD , and Max PSD , M3 D9 , En D9 , and WE) obtained with FCBF, in the cross validation set. These classifiers showed high diagnostic performance, outperforming all the extracted features in terms of Sp, PPV, LR+, and Acc. SVM achieved the highest accuracy (84.0±5.2% Acc, with 71.9±4.4% Se and 91.1±7.2% Sp) for the cutoff of 5 e/h.

Discussion
In the present study, we examined the usefulness of wavelet analysis to identify features that characterize oximetry dynamics in order to expedite detection of moderate-to-severe pediatric SAHS. WE and features from the coefficients in D 9 (M1 D9 -M4 D9 , Max D9 , and En D9 ) were obtained from the DWT of each SpO 2 recording. D 9 (0.0244-0.0488 Hz) was chosen according to a previous study in the context of pediatric SAHS [18], and is related to the duration and frequency of the SpO 2 desaturations associated with apneic events [40]. Statistically significant differences (p-value < 0.01) emerged in all DWT-derived features between the groups AHI <5 e/h and AHI �5 e/h in the optimization set ( Table 2). The higher values showed by M1 D9 , Max D9 , and En D9 in the AHI �5 e/h group agree with a higher amplitude of the histogram for high values of the D 9 coefficients in this group. In addition, the SpO 2 drops and rises caused by apneic events are reflected in a higher dispersion in the histogram of D 9 coefficients, as reported by the higher values of M2 D9 in the AHI �5 e/h group. In contrast, the lower values that M3 D9 and M4 D9 as reflected in the AHI �5 e/h group indicate that the variations produced in the SpO 2 signal due to apneic events result in values less proximal to zero in the histogram of the D 9 coefficients. Finally, the higher irregularity reported by WE in the SAHS positive group suggests that apnea-hypopnea events alter the energy distribution of the whole DWT profile of the SpO 2 signal. Table 3. Diagnostic ability of the proposed features (ODI3, statistical moments, PSD, and DWT) in the cross-validation set.

Feature Se (%) Sp (%) PPV (%) NPV (%) LR+ LR-Acc (%)
Regarding the diagnostic performance of the proposed features, ODI3 and Max D9 reached similar Acc in the cross-validation set, higher than the remaining features. In addition, higher accuracies were generally obtained with the DWT-derived features with respect to statistical moments and features from PSD. This suggests that DWT is a useful approach to analyze the changes produced in the SpO 2 signal associated to SAHS. In the feature selection stage, a feature subset composed of ODI3 (conventional oximetric index); M2 T (time); M2 PSD , M3 PSD , and Max PSD (PSD), and M3 D9 , En D9 y WE (DWT) was obtained with FCBF. LR, SVM, and MLP models built with this subset obtained high diagnostic performance for the detection of moderate-to-severe SAHS (AHI �5 e/h), improving the diagnostic ability of the single features ( Table 3) in terms of Sp, PPV, LR+, and Acc. It is worthy to note that the SVM model achieved the highest average Acc (84.0%), Sp (91.1%), PPV (83.8%), and LR+ (14.6) among the single features and binary classifiers. In addition, SVM reached similar NPV and LR-to LR, MLP, ODI3 and the remaining features. A high LR+ is especially important for screening tests [17,46]. In this sense, a LR+ greater than 10 is considered to provide strong evidence to confirm diagnoses [46]. Thus, our method is especially useful to confirm the presence of pediatric SAHS.
Three DWT features were involved in the feature subset obtained with FCBF: M3 D9 , En D9 and WE. As aforementioned, these features provide information about the concentration of the D 9 coefficients near zero (M3 D9 ), the amplitude of the D 9 coefficients (En D9 ), and the irregularity of the distribution of the whole DWT profile of the SpO 2 signal (WE). According to our results, M3 D9 , En D9 and WE provide both relevant and complementary (non-redundant) information on the changes occurring in the SpO 2 signal due to SAHS. This is consistent with the different properties of the SpO 2 signal these DWT-derived features quantify. The fact that a high performance was reached with the three classification algorithms reinforces the notion that DWT is a useful method to analyze the SpO 2 signal in the context of pediatric SAHS.
To the best of our knowledge, this is the first study assessing wavelet analysis of SpO 2 recordings in the context of pediatric SAHS. Our results suggest that DWT is an appropriate tool to analyze the low frequency components of the SpO 2 signals related to the duration of the desaturations caused by apnea-hypopnea events since it provides high resolution at low frequencies of the power spectrum [20,21]. This assumption is further supported by previous studies, whereby DWT was also applied to quantify the frequency components of different biomedical signals associated to respiratory events in the context of adult SAHS [23,24]. Additionally, the favorable performance of our approach may be due to the suitability of the WT to analyze non stationary properties of a signal [20,21], which is appropriate to events such as the non-stationary changes of the SpO 2 signal associated with apneic events. The high resolution afforded by WT at low frequencies, as well as its suitability to analyze non-stationary signals clearly support the contention that DWT is more appropriate than conventional spectral analysis techniques to analyze the SpO 2 signal [20,21]. Table 5 shows the performance of previous studies focused on the automated analysis of SpO 2 as an alternative to PSG in the screening of moderate-to-severe pediatric SAHS [13][14][15][16][17][18][19]. Oxygen desaturation index and clusters of desaturations have been employed for this task [13][14][15][16]. Kirk et al. [13] applied ODI3, reaching 67.0% Se, 60.0% Sp, and 64.0% Acc. Tsai et al. [14] obtained 83.8% Se, 86.5% Sp, and 85.1% Acc using 4% ODI (ODI4). However, ODI4 cutoff values were optimized and validated using the same population, such that no true post-hoc verification was achieved. Chang et al. [15] combined ODI3 with common symptoms to assess a discriminative score, reaching 60% Se, 86% Sp, and 72% Acc. Pia-Villa et al. [16] reported 69.4% Acc (40.6% Se and 97.9% Sp) combining clusters of desaturations and clinical history in a discriminative score. Our approach achieved a high the diagnostic performance while also strengthening its validity since the methods were derived using not only a much larger sample size, but also applying a cross validation approach to validate the results.
In order to increase the diagnostic ability of the SpO 2 signal, conventional oximetric indices have been combined with features from other signal processing approaches in studies developed by our group [17][18][19]. Á lvarez et al. [17] assessed LR models fed with conventional oximetric indices, statistical parameters, PSD, and nonlinear features. These models were validated using a bootstrap procedure, reaching 82.8% Acc (82.2% Se and 83.6% Sp). Vaquerizo-Villar et al. [18] assessed the usefulness of oximetry bispectrum. A multiclass multi-layer perceptron (MLP) model fed with ODI3, anthropometrical variables, PSD, and bispectral features reached 61.8% Se, 97.6% Sp, and 81.3% Acc in an independent test set, outperforming a MLP classifier built without bispectral features. Finally, Hornero et al. [19] analyzed 4,191 SpO 2 recordings obtained from 13 sleep laboratories in a multicenter international study. A MLP regression model with ODI3 and the skewness of the PSD reached 68.2% Se, 87.2% Sp, and 81.7% Acc. In contrast with the findings of these studies, our current results achieved improved diagnostic ability for the screening of moderate-to-severe SAHS with the use of DWT-derived features. This suggests that wavelet analysis could enhance the detection of this clinically important and vulnerable group of SAHS severity from single-channel oximetry recordings. In these patients, it is essential to early detect this condition, since they are more likely to suffer from morbidities such as decreases in cognitive performance [3,4], as well as an increased C-reactive protein level due to systemic inflammation [5]. Moreover, an AHI �5 e/h is also associated with increased systemic blood pressure measurements and an increased risk for cardiac strain [3]. All these important negative consequences highlight the necessity of an early detection of moderate-to-severe pediatric SAHS [3].
Notwithstanding the highly promising results of our current approach, several limitations must be considered. First, the exclusive use of the SpO 2 signal to detect SAHS may restrict the spectrum of physiological perturbations being detected by the oximetry signal, such as electroencephalographic arousals or reductions in airflow and increased intrathoracic pressure swings [1]. In this regard, the combination of SpO 2 with other physiological signals from PSG could potentially enhance the performance of our proposed method but at the cost of adding significant complexity to the test. In addition, future research efforts may prospectively focus on identifying a specific mother wavelet for this task. However, our proposed approach achieved high performance with the Haar's mother wavelet. Of note, the lack of universally accepted AHI severity cutoffs is another limitation that affects our study. Nevertheless, we have assessed the diagnostic ability of our proposal using an AHI cutoff of 5 e/h, a widely used criterion in the clinical decision making leading to the recommendation of surgical treatment [3,10]. Finally, it would be an interesting future goal to further validate our methodology in a larger sample of unattended oximetry recordings obtained at patients' homes.

Conclusions
The application of WT has enabled the identification of features with the ability to characterize the effects of SAHS in the overnight oximetry profile of children. Features computed in the D 9 detail level of the DWT as well as WE reached significant differences associated with the presence of SAHS. DWT has been found to provide complementary information to conventional approaches. Additionally, high diagnostic performance was reached using different reference binary classifiers, which emphasizes the usefulness of the DWT to provide discriminant information from oximetry signals. These results suggest that wavelet analysis could be useful to further characterize the oximetry signal and improve the diagnostic performance and implementation of abbreviated screening test for pediatric SAHS.