Abstract
Electrocardiogram (ECG) signals are crucial in diagnosing cardiovascular diseases (CVDs). While wavelet-based feature extraction has demonstrated effectiveness in deep learning (DL)-based ECG diagnosis, selecting the optimal wavelet base poses a significant challenge, as it directly influences feature quality and diagnostic accuracy. Traditional methods typically rely on fixed wavelet bases chosen heuristically or through trial-and-error, which can fail to cover the distinct characteristics of individual ECG signals, leading to suboptimal performance. To address this limitation, we propose a reinforcement learning-based wavelet base selection (RLWBS) framework that dynamically customizes the wavelet base for each ECG signal. In this framework, a reinforcement learning (RL) agent iteratively optimizes its wavelet base selection (WBS) strategy based on successive feedback of classification performance, aiming to achieve progressively optimized feature extraction. Experiments conducted on the clinically collected PTB-XL dataset for ECG abnormality classification show that the proposed RLWBS framework could obtain more detailed time-frequency representation of ECG signals, yielding enhanced diagnostic performance compared to traditional WBS approaches.
Citation: Xiao Q, Wang C (2025) Adaptive wavelet base selection for deep learning-based ECG diagnosis: A reinforcement learning approach. PLoS ONE 20(2): e0318070. https://doi.org/10.1371/journal.pone.0318070
Editor: Paul-Adrian Calburean, George Emil Palade University of Medicine, Pharmacy, Science, and Technology of Targu Mures (Universitatea de Medicina, Farmacie, Stiinte si Tehnologie George Emil Palade din Targu Mures), ROMANIA
Received: November 7, 2024; Accepted: January 9, 2025; Published: February 3, 2025
Copyright: © 2025 Xiao and Wang. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data are freely available from the PTB-XL database published on PhysioNet, which is recognized by PLOS as an established repository. The URL for the data is: https://physionet.org/content/ptb-xl/1.0.3/ or 10.13026/kfzx-aw45
Funding: This study is supported by Scientific Research Fund of Hunan Provincial Education Department (21C0304). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
According to a report issued by the American Heart Association [1], cardiovascular diseases (CVDs) emerge as the leading cause of mortality worldwide, with the number of individuals affected by CVDs projected to rise to 23.6 million by 2030 [2]. The ECG is a critical physiological recording obtained via electrodes placed on the body surface that measures the heart’s electrical activity. It provides essential diagnostic information for detecting CVDs. ECG signals can reveal symptoms of heart-related pathologies, which are crucial for the prompt diagnosis and effective monitoring of CVDs. They also have the potential to facilitate rapid medical interventions for patients [3]. Therefore, the precise and efficient diagnosis of ECG signals for identifying heart diseases is instrumental in ensuring timely treatment and early intervention, which could dramatically decrease mortality rates associated with CVDs [4].
However, manual ECG diagnosis requires specialized knowledge and a significant time investment from physicians, consuming substantial medical resources and potentially causing a diagnostic backlog. Consequently, there is a growing demand for automated ECG classification technologies to address the increasing burden of CVDs [5]. To accelerate the automation of the ECG diagnostic process for clinical applications, many existing studies employ DL to directly map ECG signals to their corresponding categories [6–8]. In addition, feature engineering, which extracts significant features from ECG signals, can further improve the classification performance and efficiency of DL-based ECG classification [9]. Among various feature extraction methods, the wavelet transform (WT) is particularly effective for extracting time-frequency information from signals and can reveal frequency variations over time, which makes it suitable for obtaining refined features from non-stationary signals like ECGs [10].
The WT is widely used in DL-based ECG classification [11] as it can use wavelet base functions to filter and decompose ECG signals into different sub-bands across various time scales [12]. For instance, [13] utilizes the continuous wavelet transform (CWT) to convert ECG signals into the time-frequency domain and employs convolutional neural networks (CNNs) to extract features from the time-frequency maps. It achieved an improvement in the F1 score of 4.75% to 16.85% compared to competing methods without WT for arrhythmia classification. In [14], 24 wavelet features serve as the input to a multi-layer perceptron (MLP) neural network, and a classification accuracy of 96.5% is achieved for arrhythmia detection. [15] proposes a novel deep bidirectional LSTM network that takes wavelet sequences at each decomposition level as input features for ECG classification, resulting in an accuracy of 99.39% on the MIT-BIH Arrhythmia Database. DL-based ECG classification with wavelet features maintains classification performance while benefiting from lower model complexity, making it practical for applications with limited computational and storage resources [16]. Selecting an appropriate wavelet base function is crucial for accurately capturing the characteristics of time-varying signals, as shown in Fig 1. It can be seen that the detailed time-frequency characteristics captured differ depending on the chosen wavelet base. Hence, the choice of wavelet base function is a critical pre-determined parameter for WT in DL-based ECG classification [9]. Despite the powerful capabilities of WT for feature extraction, the selection of an optimal wavelet base remains a complex and uncertain challenge [10].
To select appropriate wavelet bases, many existing studies pre-select the base according to expert experience [17]. Some studies select the optimal wavelet base according to the correlation or similarity between the wavelet base and the signals to be analyzed. For instance, [18] determines the optimal wavelet base function for ECG signal denoising by calculating the correlation coefficients between the ECG signal and different wavelet base functions. The base with the highest correlation coefficient is considered optimal. Recently, selecting the optimal wavelet base according to the performance of the targeted application has become popular. [19] proposes a cross-validation approach to select wavelet bases, where the wavelet combination that yields the highest detection performance during validation is used for further analysis. [20] conducts a thorough quantitative analysis to evaluate the denoising performance of 115 potential wavelet base functions (from 6 wavelet families). The optimal wavelet base is then determined from the signal-to-noise ratio (SNR) after denoising. In [21], the wavelet base that simultaneously yields the highest denoising performance and arrhythmia classification performance is considered optimal. These wavelet parameter selection methods can indeed find a wavelet base appropriate for the characteristics of most ECG signals, resulting in higher average performance. However, using a single wavelet base for all ECG signals may overlook signals that deviate from the majority, as ECG signals in different categories, especially abnormal ones, exhibit significant variations in the time-frequency domain.
In this study, we focus on a dynamic approach to determine wavelet bases for ECG signals to enhance DL-based ECG diagnosis. By selecting wavelet bases that coincide with the unique characteristics of each ECG signal, we aim to generate wavelet features that are more detailed and distinguishable. Leveraging RL [22], the wavelet base selection (WBS) process is modeled as a stateless Markov Decision Process (MDP) [23]. Here, an RL agent is trained to optimize its action, i.e., selecting wavelet bases, to maximize the reward induced by a more appropriate selection of wavelet bases. In our previous study [24], RL was successfully adopted to select the optimal parameters for the short-time Fourier transform (STFT), which inspires us to apply RL to adaptively select wavelet bases for DL-based ECG classification.
Our main contributions are as follows:
- This is the first study to systematically consider selecting wavelet bases for signals within an RL framework, where an agent is trained to choose wavelet bases to obtain improved wavelet features, thereby immediately enhancing classification performance.
- The wavelet base is customized for each individual ECG signal, allowing for the capture of the most significant wavelet features relevant to its respective category.
- The efficacy of the proposed approach is validated in ECG abnormality classification through the clinically collected PTB-XL ECG database compared to competing methods.
Materials and methods
The overall workflow of the proposed approach is illustrated in Fig 2. First, raw ECG signals undergo preprocessing to standardize the data by scaling inputs and segmenting the signals for label aggregation. The preprocessed signals are then split into training, validation, and test datasets. An RL agent is trained to adaptively select appropriate wavelet bases for the ECG signals in the training dataset. Simultaneously, the agent’s selection strategy is guided through feedback from an evaluation network, which continuously assesses the diagnostic performance obtained with the continuous wavelet transform (CWT) of ECG signals in the evaluation dataset, using the wavelet bases provided by the agent. Finally, the trained RL agent is evaluated on the test dataset to assess the effectiveness of its WBS strategy.
Continuous wavelet transform
In this study, the CWT is employed to extract time-frequency features from ECG signals as it could extract detailed characteristics of ECG signals at adjustable time-frequency resolution. Consider a real-valued signal $x(t)$, the CWT of $x(t)$ by using a wavelet base $\psi(t)$ can be formulated as
$$W(c, b) = \frac{1}{\sqrt{c}} \int_{-\infty}^{\infty} x(t)\, \psi\!\left(\frac{t - b}{c}\right) dt$$
where $c$ and $b$ correspond to the parameters of scale and time shift, respectively, $W(c, b)$ represents the wavelet coefficient at scale $c$ and the time shift $b$, the term $\frac{1}{\sqrt{c}}\psi\!\left(\frac{t - b}{c}\right)$ describes the wavelet base $\psi(t)$ under the translational and scaling transformations. The wavelet coefficient $W(c, b)$ essentially quantifies the similarity between the signal and the wavelet function $\psi\!\left(\frac{t - b}{c}\right)$, producing features across multiple temporal regions and frequencies. This enables the capture of time-frequency characteristics in the signal at varying resolutions.
Denote $\mathbf{c} = [c_1, c_2, \dots, c_{N_c}]$ and $\mathbf{b} = [b_1, b_2, \dots, b_{N_b}]$ as the vectors containing sampled values in the scale and shift domains, respectively. The sampled version of the continuous wavelet transform (CWT) in the time-frequency domain can be represented as
$$\mathbf{W} = \left[ w_{i,j} \right]_{N_c \times N_b}, \qquad w_{i,j} = W(c_i, b_j)$$
where $w_{i,j}$ is the $(i, j)$th element in $\mathbf{W}$. The matrix $\mathbf{W}$ represents the discretized CWT results, capturing the time-frequency characteristics of the signal. These features, represented by $\mathbf{W}$, can then be regarded as wavelet-derived features suitable for input into DL models for further analysis and classification.
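As a rough illustration of how the discretized coefficient matrix can be filled in, the sketch below approximates the CWT integral by a Riemann sum on a grid of scales and shifts. It is a dependency-free sketch under stated assumptions, not the implementation used in this study: the Mexican-hat mother wavelet, the function names `mexican_hat` and `cwt_matrix`, and the toy signal are all chosen here purely for illustration.

```python
import math


def mexican_hat(t):
    """Mexican-hat (Ricker) mother wavelet; an assumed stand-in for a candidate base."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt_matrix(x, scales, shifts, dt=1.0, wavelet=mexican_hat):
    """Riemann-sum approximation of W(c, b) on a grid of scales and shifts.

    x      : list of samples x(n * dt)
    scales : sampled scale values c_i (each > 0)
    shifts : sampled time shifts b_j, in the same units as n * dt
    Returns a nested list W with W[i][j] approximating W(c_i, b_j).
    """
    W = []
    for c in scales:
        norm = 1.0 / math.sqrt(c)
        row = []
        for b in shifts:
            acc = 0.0
            for n, xn in enumerate(x):
                acc += xn * wavelet((n * dt - b) / c)
            row.append(norm * acc * dt)
        W.append(row)
    return W


# Toy signal (hypothetical): a single narrow bump centred at t = 32.
x = [math.exp(-0.5 * ((n - 32) / 2.0) ** 2) for n in range(64)]
scales = [1.0, 2.0, 4.0, 8.0]
shifts = list(range(0, 64, 4))
W = cwt_matrix(x, scales, shifts)
```

Each row of `W` corresponds to one scale and each column to one time shift, so the largest coefficients cluster around the shift where the bump sits. Swapping the `wavelet` argument changes the resulting matrix, which is the effect the selection of a wavelet base is meant to exploit.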
The wavelet bases exhibit variable time-frequency characteristics, as shown in Fig 1. Commonly used wavelet bases in signal analysis include the Haar wavelet, Daubechies wavelet (dbN), Symlet wavelet (symN), Coiflet wavelet (coifN), and Biorthogonal wavelet (biorNr.Nd) [25]. The Haar wavelet is the simplest to compute but is discontinuous in the time domain. Daubechies wavelets, with their extreme phase and higher vanishing moments, are suitable for reconstructing smooth signals but are more computationally complex and asymmetric. Symlet wavelets improve upon Daubechies wavelets by offering better symmetry and reduced phase distortion. Coiflet wavelets provide high symmetry and effective frequency band partitioning. Biorthogonal wavelets introduce biorthogonality, resolving the conflict between symmetry and precise signal reconstruction.
The wavelet features obtained from the same ECG signal using different wavelet bases through CWT can vary as their time-frequency characteristics change depending on the selected wavelet base, highlighting the importance of choosing the right one. An appropriate wavelet base is crucial for extracting relevant features from ECG signals that are indicative of different diagnostic categories. This study aims to develop a systematic method for selecting the optimal wavelet base for individual ECG signals, enhancing the feature extraction capabilities of CWT and thereby improving the classification accuracy of models in distinguishing between various ECG categories.
RL basics
RL is an approach that involves learning to generate actions to maximize cumulative rewards through interaction with an environment. The structure of RL is illustrated in Fig 3. At its core, an agent interacts with the environment by taking actions based on the current system state and receiving rewards in return. The agent aims to learn the optimal action at each time step to maximize the long-term cumulative reward through continuous learning and policy improvement. The problem addressed by RL can be modeled as an MDP [26], characterized by the tuple { S , A , P , R , γ } , where:
- S is the state space, representing the set of all possible system states.
- A is the action space, representing the set of all possible actions.
- P is the state transition probability, representing the probability distribution of transitioning from one state to another. Specifically, $P(s' \mid s, a)$ denotes the probability of transitioning to state $s'$ from state s after taking action a.
- R is the reward function, representing the immediate reward received after performing an action in a given state, denoted as R ( s , a ) .
- γ is the discount factor, which determines the present value of future rewards and lies within the interval [ 0 , 1 ] .
In an MDP, the policy π is a mapping function that specifies the action a to be taken in each state s, i.e., π : S → A. The goal of reinforcement learning is to find an optimal policy that maximizes the expected cumulative reward. Specifically, the objective can be expressed as maximizing the following expected discounted sum:
$$\max_{\pi} \; \mathbb{E}_{s_{t+1} \sim P(\cdot \mid s_t, a_t)} \left[ \sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t) \right]$$
where $a_t = \pi(s_t)$ is the action given by the policy π at state $s_t$, $P(s_{t+1} \mid s_t, a_t)$ denotes the transition probability of state $s_{t+1}$ given the current state $s_t$ and action $a_t$, and γ is the discount factor used to balance short-term and long-term rewards.
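To make the role of the discount factor concrete, the following minimal sketch evaluates the discounted sum for one finite reward sequence. The function name `discounted_return` and the example rewards are illustrative assumptions, not part of the proposed method.

```python
def discounted_return(rewards, gamma):
    """Return sum_t gamma**t * rewards[t], accumulated from the last reward backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three unit rewards: gamma = 0.5 gives 1 + 0.5 + 0.25.
g_half = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
# gamma = 0 keeps only the immediate reward.
g_zero = discounted_return([1.0, 1.0, 1.0], gamma=0.0)
```

When every episode consists of a single decision, the sum contains one term and the return reduces to the immediate reward, regardless of γ.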
In this study, an agent is employed that follows a policy, taking the ECG signal as input and generating an action that selects the optimal wavelet base for the CWT of the ECG signals. The agent continuously refines its selection strategy, improving the accuracy of the ECG classification task over time.
RL-based WBS
To address the problem of selecting the appropriate wavelet base for ECG diagnosis, we propose the RL-based wavelet base selection (RLWBS) framework. This approach systematically determines the optimal wavelet base for each ECG signal, enhancing feature extraction and improving the classification performance of ECG signals. The WBS problem is formulated as an MDP within an RL framework, where an RL agent interacts with a specially designed environment to iteratively learn and refine the rationale of WBS based on the classification feedback.
RLWBS framework.
In this study, as a data-driven method, the ECG data to train the policy maker is divided into training and evaluation datasets, i.e., T and V, respectively. Then, the WBS process can be further divided into two stages as illustrated in Fig 4.
In the training stage, as shown in Fig 4, at the beginning of its tth learning iteration, a mini-batch of ECG signals is randomly sampled from the training dataset T and grouped as $X_t$, i.e.,
$$X_t = \{x_{t,1}, x_{t,2}, \dots, x_{t,B}\}$$
where $x_{t,i}$ is the ith ECG signal in a mini-batch of B ECG signals in the tth iteration.
Denote the action space as $\mathcal{A} = \{a_1, a_2, \dots, a_N\}$, where $a_j$ is the jth candidate action and $N$ is the total number of candidate actions. Let $\pi_\theta$ represent the policy network, which takes an ECG signal x as input and outputs a probability vector
$$P(a \mid x) = \pi_\theta(x) = [p_1, p_2, \dots, p_N]$$
where $P(a \mid x)$ corresponds to the probability distribution for selecting an action a conditioned on x, with $p_j$ indicating the conditional probability of choosing the jth action (i.e., the jth wavelet base) on x. The action for the ECG signal $x_{t,i}$, denoted as $a_{t,i}$, is determined by sampling from the probability distribution $P(a \mid x_{t,i})$, i.e.,
$$a_{t,i} \sim P(a \mid x_{t,i})$$
We aggregate the actions for all signals in the mini-batch $X_t$ as the action for the whole system:
$$a_t = \{a_{t,1}, a_{t,2}, \dots, a_{t,B}\}$$
Once $a_{t,i}$ (the chosen wavelet base) is selected, the wavelet features for the ECG signal $x_{t,i}$ are extracted using the CWT with the selected wavelet base, denoted as $\mathbf{W}_{t,i}$. The resulting wavelet features, $\{\mathbf{W}_{t,i}\}_{i=1}^{B}$, along with their respective categories, are used to train a backbone neural network f. This trained network $f_t$ in the tth iteration serves as the evaluation network, assessing classification accuracy and generating a reward. This reward evaluates the effectiveness of the policy network in WBS, guiding further refinement of its strategy.
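The sampling step in the training stage can be sketched as drawing one wavelet-base index per signal from the probability vector produced by the policy. In the sketch below the policy outputs are hypothetical hard-coded logits rather than the output of a trained network, and `softmax` and `sample_action` are illustrative helper names.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng):
    """Draw an index j with probability probs[j] (inverse-CDF sampling)."""
    u = rng.random()
    cum = 0.0
    for j, p in enumerate(probs):
        cum += p
        if u < cum:
            return j
    return len(probs) - 1  # guard against floating-point round-off


rng = random.Random(0)
# Hypothetical policy outputs for a mini-batch of B = 3 signals and N = 4 candidate bases.
batch_logits = [[2.0, 0.1, 0.1, 0.1], [0.0, 0.0, 3.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
batch_probs = [softmax(z) for z in batch_logits]
# One sampled wavelet-base index per signal; together they form the batch-level action.
actions = [sample_action(p, rng) for p in batch_probs]
```

Sampling, rather than always taking the most probable index, lets the agent occasionally try bases that currently have low probability, which is what allows the reward to reveal better choices during training.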
In the evaluation stage, as illustrated in Fig 4, each ECG signal in the evaluation dataset, i.e., $x_k^{V} \in V$, is input to the policy network $\pi_\theta$ to obtain its corresponding action selection probabilities $P(a \mid x_k^{V})$. Unlike in the training stage, the action for each signal is selected with the highest output probability from the policy network, i.e.,
$$a_k^{V} = \arg\max_{a \in \mathcal{A}} P(a \mid x_k^{V})$$
where $a_k^{V}$ is the selected action, i.e., the wavelet base index, for the signal $x_k^{V}$. The wavelet features $\mathbf{W}^{V}$ for the ECG signals in the evaluation dataset are then obtained based on the selected actions
$$\mathbf{W}^{V} = \{\mathbf{W}_k^{V}\}_{k=1}^{|V|}$$
where | V | denotes the size of the evaluation dataset. The backbone model, $f_t$, trained in the previous stage, serves as the evaluation network, providing classification performance using the wavelet features $\mathbf{W}^{V}$. The inference accuracy on the evaluation dataset, denoted as $\mathrm{Acc}_t$, reflects the effectiveness of the wavelet features $\{\mathbf{W}_{t,i}\}_{i=1}^{B}$ generated by the policy network $\pi_\theta$ in training the backbone network f. Thus, the system state in this study is defined as
$$s_t = X_t$$
Finally, the reward in the tth iteration can be described as
$$r_t = \mathrm{Acc}_t$$
Here, $r_t$ is influenced by the performance of the evaluation network $f_t$, which is trained on $X_t$ with wavelet bases selected by $\pi_\theta$, and further guided by the classification accuracy calculated on the evaluation dataset.
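The evaluation stage can be sketched in two small steps: pick the most probable wavelet base for each evaluation signal, then turn the classifier's accuracy into a scalar reward. The sketch uses hypothetical probabilities and single-label predictions for brevity (the actual task is multi-label), and the helper names are assumptions made here.

```python
def greedy_action(probs):
    """Index of the largest probability (ties resolved to the lowest index)."""
    return max(range(len(probs)), key=lambda j: probs[j])


def accuracy_reward(predictions, labels):
    """Fraction of evaluation signals classified correctly, used as the reward."""
    if not labels:
        raise ValueError("evaluation set must not be empty")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical policy outputs for four evaluation signals and three candidate bases.
eval_probs = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6], [0.4, 0.4, 0.2]]
eval_actions = [greedy_action(p) for p in eval_probs]

# Hypothetical predictions of the trained backbone on the resulting features.
predictions = ["NORM", "MI", "CD", "NORM"]
labels = ["NORM", "MI", "HYP", "NORM"]
reward = accuracy_reward(predictions, labels)
```

Using the greedy choice at evaluation time removes sampling noise from the reward, so changes in the reward reflect changes in the policy rather than the luck of a particular draw.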
The policy network then adjusts its weights θ to generate improved actions, aiming to maximize the expected reward. This optimization is achieved using the policy gradient method [27], which allows the network to refine its WBS strategy over time. The detailed process of this adjustment using policy gradients will be described in the following subsection.
In this study, the candidate wavelet bases include the Haar, Daubechies, Biorthogonal, Coiflets, and Symlets wavelet families, as shown in Table 1. These wavelet families are selected based on a comprehensive evaluation of their properties, including compact support, orthogonality, and vanishing moments, which are crucial for effective feature extraction and classification performance [28]. Hence, the action set for each ECG signal can be defined as $\mathcal{A} = \{a_i\}_{i=1}^{N}$, where i corresponds to the index of the wavelet base listed in Table 1. Each index represents a specific wavelet base that the RL agent can select for feature extraction.
Update of policy network.
In this study, the policy gradient (PG) algorithm [27] is employed to optimize the policy, aiming to maximize the probability of selecting the optimal action given a state [29]. We define the policy as:
$$\pi_\theta(a_t \mid s_t) = \prod_{i=1}^{B} \pi_\theta(a_{t,i} \mid x_{t,i}) \tag{11}$$
This represents the joint distribution of action selection for all ECG signals in the state $s_t$, where $a_t$ contains the corresponding actions for the signals in the state. Each $\pi_\theta(a_{t,i} \mid x_{t,i})$ is the probability of selecting action $a_{t,i}$ (i.e., the wavelet base) for the ith ECG signal $x_{t,i}$ within the mini-batch.
According to the PG algorithm, the gradient of the sampled version of the expected cumulative reward with respect to the network weights θ based on collected state-action pairs and rewards can be calculated as
$$\nabla J(\theta) \approx r_t \, \nabla_\theta \log \pi_\theta(a_t \mid s_t) \tag{12}$$
where $\nabla_\theta \log \pi_\theta(a_t \mid s_t)$ represents the gradient of the log-probability of selecting the action $a_t$ given state $s_t$ under the current policy network $\pi_\theta$. According to (5), (6), and (11), the gradient ∇ J ( θ ) can be further expressed as
$$\nabla J(\theta) \approx r_t \sum_{i=1}^{B} \nabla_\theta \log \pi_\theta(a_{t,i} \mid x_{t,i}) \tag{13}$$
Finally, the learnable weights θ of the policy network can be updated by performing gradient ascent:
$$\theta \leftarrow \theta + \alpha \nabla J(\theta) \tag{14}$$
where α is the learning rate for the weights of the policy network.
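For intuition about the update, consider the simplest possible policy: a softmax over one logit per candidate wavelet base, with the logits themselves playing the role of θ. This tabular policy is an assumption made only for this sketch (the policy in this study is a neural network). For such a policy the gradient of the log-probability of the chosen action with respect to the logits is the one-hot vector of that action minus the probability vector, and a gradient ascent step with a positive reward raises the probability of the chosen action.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_prob(logits, action):
    """Gradient of log softmax(logits)[action] w.r.t. the logits: one_hot(action) - probs."""
    probs = softmax(logits)
    return [(1.0 if j == action else 0.0) - p for j, p in enumerate(probs)]


def reinforce_step(logits, action, reward, lr):
    """Gradient ascent: logits <- logits + lr * reward * grad log pi(action)."""
    g = grad_log_prob(logits, action)
    return [z + lr * reward * gj for z, gj in zip(logits, g)]


logits = [0.0, 0.0, 0.0]           # uniform policy over three hypothetical bases
new_logits = reinforce_step(logits, action=1, reward=0.8, lr=0.5)
p_before = softmax(logits)[1]      # probability of the chosen base before the step
p_after = softmax(new_logits)[1]   # probability of the chosen base after the step
```

Because the reward scales the step, an action followed by a high accuracy is reinforced more strongly than one followed by a low accuracy.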
The detailed design of the policy network is shown in Fig 5. The input signal first passes through two modules consecutively, each containing Conv2D, ReLU, and BN operations, followed by MaxPooling2D and Dropout. The corresponding output from the two modules is then flattened and passed through a linear layer with a ReLU activation function. Finally, two branches, each including a linear layer and a softmax layer, are utilized to generate the two elements in the action individually. The final output represents the probability distributions of selecting the wavelet base.
By continuously updating the policy network using the PG algorithm, the model improves its ability to select the most effective wavelet bases for the corresponding ECG signals to exhibit better features for ECG classification.
Evaluation network.
To evaluate the effectiveness of the actions generated by the policy network and guide its learning of more appropriate wavelet bases, a deep neural network f is employed as the critic within the RL framework [30].
During the tth iteration of the training stage, the network f is trained as the backbone using the wavelet features $\{\mathbf{W}_{t,i}\}_{i=1}^{B}$. The training continues until the backbone achieves a sufficient level of accuracy, at which point its generalization capability reflects the appropriateness of the actions $a_t$ generated by the policy network $\pi_\theta$.
To further assess the actions chosen by the policy network, the prediction accuracy $\mathrm{Acc}_t$ is measured by inputting the wavelet features $\mathbf{W}^{V}$ from the evaluation dataset into the trained backbone network f. These wavelet features are derived from the wavelet bases corresponding to the actions $\{a_k^{V}\}_{k=1}^{|V|}$ selected by the policy network. This evaluation effectively quantifies the capability of the WBS mechanism of the policy network.
The accuracy $\mathrm{Acc}_t$ serves as the reward, providing a direct assessment of the effectiveness of the wavelet bases $a_t$ selected by the policy network. At the end of each learning iteration, the weights of the backbone network f are reset to their initial values, ensuring that training begins from a fresh state at the start of each iteration.
Algorithm 1: RLWBS in the tth learning iteration
1: Sample a mini-batch of ECG signals from the training dataset, denoted as $X_t$
2: Select actions for each ECG signal in $X_t$ using the policy network $\pi_\theta$, i.e., $a_{t,i} \sim P(a \mid x_{t,i})$
3: Perform CWT to extract wavelet features from the signals in $X_t$ based on the selected actions $a_t$
4: Train the backbone network f using the wavelet features and their corresponding labels
5: Apply the policy network to the evaluation dataset, obtain the wavelet bases for the ECG signals, and compute their corresponding wavelet features
6: Evaluate the prediction accuracy on the evaluation dataset using the trained network f, and use it as the reward
7: Update the policy network by performing gradient ascent to adjust its parameters θ, using Eqs. (13) and (14)
8: Reset the backbone network f to its initial state
Algorithm 1 describes the proposed RLWBS framework in the tth learning iteration. The process continues until the inference accuracy on the evaluation dataset shows only minor variations. At this point, the entire process terminates, and the policy network can be applied to ECG signals for the targeted applications.
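The control flow of one learning iteration, together with one possible reading of the stopping rule, can be sketched as follows. Every step is passed in as a callable so that the skeleton stays independent of any particular CWT, backbone, or policy implementation. The function names, the `window` and `tol` parameters, and the max-minus-min criterion are assumptions introduced for this sketch; the text only states that training stops once the evaluation accuracy shows minor variations.

```python
def rlwbs_iteration(sample_batch, select_actions, extract_features,
                    train_backbone, evaluate, update_policy, reset_backbone):
    """Run the eight steps of one learning iteration and return the reward."""
    batch = sample_batch()                        # step 1: mini-batch from the training set
    actions = select_actions(batch)               # step 2: sample one wavelet base per signal
    features = extract_features(batch, actions)   # step 3: CWT with the selected bases
    backbone = train_backbone(features)           # step 4: train the backbone network
    reward = evaluate(backbone)                   # steps 5-6: accuracy on the evaluation set
    update_policy(batch, actions, reward)         # step 7: policy-gradient update
    reset_backbone()                              # step 8: restore initial backbone weights
    return reward


def has_converged(history, window=5, tol=1e-3):
    """True once the last `window` rewards vary by less than `tol` (assumed criterion)."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) < tol
```

A driver loop would call `rlwbs_iteration` repeatedly, append each returned reward to a history list, and stop when `has_converged` returns true.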
Performance metrics
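The classification performance of the proposed approach is evaluated using the metrics of precision, recall, sensitivity, specificity, Area Under the Curve (AUC), F1, and Matthews Correlation Coefficient (MCC). Apart from AUC, which is the area under the ROC curve, these metrics are defined as:
$$\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP}, \qquad \text{Recall} = \text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}, \\
\text{F1} &= \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \qquad \text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\end{aligned}$$
where TP, TN, FP and FN represent the true positive, true negative, false positive, and false negative predicted values, respectively.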
Additionally, to calculate the macro-averaged metrics, the corresponding metrics are first computed individually for each category. These values are then averaged, assigning equal weight to each category irrespective of its sample size, to derive the final macro-averaged metrics.
Results and discussion
In this section, we assess the effectiveness of the proposed RLWBS framework for autonomously selecting optimal wavelet bases. The proposed method is implemented in Python using the PyTorch framework. For this evaluation, we perform experiments on the publicly available PTB-XL ECG database [31]. Initially released in 2020, the dataset includes 21,799 clinical 12-lead ECG recordings from 18,869 patients, each lasting 10 seconds. This study focuses on multi-label classification across five superclass categories: normal ECG (NORM), conduction disturbance (CD), hypertrophy (HYP), myocardial infarction (MI), and ST/T change (STTC). Given that a single ECG can carry multiple labels, this creates a multi-label classification scenario. The PTB-XL dataset adheres to the inter-patient paradigm [32], ensuring that records from the same patient appear exclusively in either the training or test sets, with no overlap.
For baseline comparisons, we include a commonly used approach that selects the wavelet base achieving the highest classification accuracy in N-fold cross-validation, as in [33] (referred to here as CV-WBS). Additionally, we evaluate an energy and Shannon entropy-based method (referred to as EE-WBS) as used in [34], which selects the optimal wavelet base based on the energy-to-Shannon entropy ratio.
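As a rough illustration of the energy-to-Shannon-entropy criterion, the sketch below uses one common formulation: the energy is the sum of squared wavelet coefficients, the entropy is computed over the normalized energy distribution of those coefficients, and the base with the largest ratio is preferred. The exact formulation in [34] may differ; the function names, the base names, and the coefficient values are hypothetical.

```python
import math


def energy_entropy_ratio(coeffs):
    """Energy of the coefficients divided by the Shannon entropy of their energy distribution."""
    energies = [c * c for c in coeffs]
    total = sum(energies)
    if total == 0.0:
        return 0.0
    entropy = 0.0
    for e in energies:
        p = e / total
        if p > 0.0:
            entropy -= p * math.log2(p)
    # All energy in a single coefficient gives zero entropy; treat it as maximally concentrated.
    return math.inf if entropy == 0.0 else total / entropy


def select_base(coeffs_by_base):
    """Return the name of the base whose coefficients give the largest ratio."""
    return max(coeffs_by_base, key=lambda name: energy_entropy_ratio(coeffs_by_base[name]))


# Hypothetical coefficients of the same signal under two candidate bases.
coeffs_by_base = {
    "base_a": [4.0, 0.1, 0.1, 0.1],  # energy concentrated in one coefficient
    "base_b": [2.0, 2.0, 2.0, 2.0],  # energy spread evenly
}
best = select_base(coeffs_by_base)
```

A base that concentrates the signal energy into few coefficients yields low entropy and therefore a high ratio, which is why this criterion favors bases that resemble the signal morphology.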
Performance improvement with RLWBS
We first evaluate the proposed RLWBS framework on five DL models used as benchmarks for ECG classification on the PTB-XL dataset as listed in [35], i.e., XResNet [36], Inception [37], ResNet [38], LSTM [39], and LSTM-bidir [39]. Additionally, we include the state-of-the-art model LDM-XResNet, which has demonstrated the highest classification performance in [40]. Originally developed for 1D ECG inputs, these models have been adapted to process 2D wavelet features in this study, including modifications such as substituting 1D convolutional layers with 2D convolutional layers to enable compatibility with 2D feature inputs.
Table 2 presents the macro-AUC and macro-F1 scores of the tested models on the PTB-XL dataset, comparing results across different wavelet selection methods. Models paired with the proposed RLWBS method consistently achieved higher macro-AUC and macro-F1 scores than those using CV-WBS and EE-WBS, demonstrating the efficacy of the proposed RLWBS. This result suggests that RLWBS generates more informative wavelet features, enhancing the ability of the model to differentiate between categories. In subsequent analyses, LDM-XResNet is selected as the classifier to further evaluate different WBS approaches.
Fig 6 presents various performance curves, including precision-recall, ROC, and calibration curves with the three WBS methods. Using the identical state-of-the-art (SOTA) classifier, the precision-recall and ROC curves achieved with RLWBS consistently outperform those achieved with CV-WBS and EE-WBS, demonstrating superior performance with RLWBS. Furthermore, as observed from the calibration curves, all three classifiers with the respective WBS mechanisms exhibit a tendency to be overconfident in their predictions, where the predicted probabilities are higher than the actual probabilities. Nevertheless, the calibration curve obtained with RLWBS aligns more closely with the ideal calibration curve compared to the others, indicating higher effectiveness and reliability of the proposed method.
Fig 6. (a), (b), and (c) are the precision-recall, ROC, and calibration curves, respectively.
These outcomes highlight the advantage of adaptive wavelet selection in our RLWBS method over static approaches. The CV-WBS method, which selects a single wavelet base that performs best on average, may overlook unique features in subsets of ECG signals that deviate from the majority, resulting in suboptimal performance for some cases. The EE-WBS approach, which selects a wavelet base based on morphological similarity to ECG signals, considers only one aspect of wavelet characteristics, potentially missing other factors such as support and vanishing moments. In contrast, the RLWBS framework enables the policy network to iteratively optimize WBS by maximizing classification accuracy through reward-based feedback from previous iterations. This continuous learning process enhances the capability of the network to select increasingly effective wavelet bases, while the adaptive nature of RLWBS ensures that extracted features align closely with the specific characteristics of each ECG signal. This adaptability could contribute to more effective feature extraction and improved classification performance in ECG diagnosis.
Additionally, Table 3 provides a detailed comparison of precision, recall, specificity, F1, and MCC scores achieved by LDM-XResNet using different WBS methods across all the diagnostic categories. The proposed RLWBS method consistently achieves higher precision and recall, indicating its stronger ability in identifying both positive and negative cases, thereby reducing both missed detections and false alarms. Hence, the proposed RLWBS could strike a better balance between false positives and false negatives, resulting in improved F1 scores. Moreover, the proposed method demonstrates superior performance in specificity and MCC, which indicates a lower misdiagnosis rate and a higher correlation between the predictions and the ground truth. These results further validate the efficacy of the proposed method in selecting the more appropriate wavelet base, ensuring more accurate and reliable ECG classification.
Furthermore, we compare the SOTA model, LDM-XResNet1d, which originally uses 1D ECG signals, with its modified version adapted for wavelet feature input, i.e., RLWBS+LDM-XResNet2d, in Table 4. When combined with RLWBS, the modified LDM-XResNet model achieves even higher macro-F1 scores. This improvement further demonstrates the efficacy of the proposed RLWBS framework in enhancing the inference capacity of DL models for ECG diagnosis.
Comparison of extracted wavelet features
To assess the effectiveness of wavelet bases selected by the RLWBS framework for feature extraction, we present examples of wavelet features generated by each of the three WBS methods in Fig 7. These scalograms, produced through CWT of the same ECG signal segments, demonstrate that the time-frequency information obtained via the RLWBS framework captures finer detail and variation than the other methods, particularly in areas with rapid and complex frequency changes. Hence, it indicates that the RLWBS method provides greater detail in the time-frequency representation, effectively capturing the diversity of frequency components.
Fig 7. (a), (b), (c), and (d) focus on the same time-frequency regions, respectively.
Unlike many existing approaches [19, 33, 41, 42], which predetermine wavelet bases during the preparation stage, the RLWBS framework adaptively selects wavelet bases, providing a higher degree of freedom in capturing time-frequency features. By adjusting the wavelet base to align with the unique characteristics of different signals, RLWBS produces a clearer, more accurate depiction of signal time-frequency dynamics. This adaptability yields more detailed and relevant time-frequency information according to categories, which is particularly advantageous for analyzing complex signals, as it captures subtle variations and potential anomalies more effectively.
Analysis of selected wavelet bases by RLWBS
Table 5 presents the distribution of wavelet base families selected by both the EE-WBS and RLWBS methods. Notably, neither method selects the Haar wavelet, as its discontinuous nature makes it unsuitable for capturing ECG signal features. Additionally, both methods share a similar distribution pattern, with the db wavelet family chosen most frequently, followed by the sym, bior, and coif wavelet families. This pattern reflects the significance of morphological similarity between the ECG signal and wavelet bases, which is the core idea followed by EE-WBS. However, while both approaches emphasize similarity, RLWBS seems to extend beyond this criterion by integrating additional factors influencing wavelet selection compared to EE-WBS. By training a policy network guided by classification performance, RLWBS adapts dynamically, optimizing wavelet selection not only based on similarity but also on other critical factors that enhance feature extraction. This adaptability results in a more refined and effective wavelet-based feature extraction process, particularly valuable for ECG diagnosis.
Fig 8 illustrates two examples of the action probability distributions generated by the policy network for two distinct ECG signals. The high certainty in selecting specific actions upon training completion demonstrates the convergence of action learning, as the policy network identifies a wavelet base for each input signal with high confidence. It suggests that the network successfully detects unique patterns or features within each signal, enabling further decision-making accordingly. Furthermore, the adaptivity of the policy network can be observed as it dynamically selects wavelet bases tailored to different ECG signals. This adaptive approach aims to optimize wavelet selection on a per-signal basis, ultimately enhancing classification performance by aligning the wavelet base more closely to the characteristics of each signal.
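The notion of "high certainty" in an action probability distribution can be made concrete with a small sketch: a softmax over policy logits, the most probable action, and the Shannon entropy as a measure of how peaked the distribution is. The action names and logit values below are hypothetical and are not outputs of the trained policy network.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; lower values mean a more certain policy."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical action set (wavelet base names) and hypothetical logits.
actions = ["db4", "sym5", "coif3", "bior2.2"]
early_logits = [0.1, 0.0, -0.1, 0.05]   # near-uniform, as before training
late_logits = [6.0, 0.5, -1.0, 0.0]     # peaked, as after convergence

early_probs = softmax(early_logits)
late_probs = softmax(late_logits)

for label, probs in [("early", early_probs), ("late", late_probs)]:
    best = max(range(len(actions)), key=lambda i: probs[i])
    print(label, actions[best], round(probs[best], 3),
          "entropy:", round(entropy(probs), 3))
```

A near-uniform distribution has entropy close to log(4) for four actions, whereas a converged policy concentrates almost all probability mass on one wavelet base and its entropy approaches zero.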
Model interpretability through GradCAM
Fig 9 illustrates the focus of the model trained with RLWBS on the ECG signals, represented by the blue curves, while the red curves depict the attention values generated by the classifier. For abnormal categories, the model assigns high attention values to the abnormal ECG regions, effectively highlighting diagnostically relevant segments. Additionally, critical regions of normal ECG signals are carefully examined to confirm the absence of abnormal features. This highlights the capability of the classifier with RLWBS to capture distinctive features pertinent to cardiac diseases, aligning with clinical considerations and providing valuable support to healthcare practitioners in diagnosis.
Limitations and future works
Compared to traditional WBS approaches, such as EE-WBS, which determine the appropriate wavelet base solely by measuring the correlation between ECG signals and wavelet bases, the proposed method requires a training dataset with fully annotated labels. These labels provide performance guidance, enabling the policy network to refine its policy generation strategy. Additionally, the efficacy of the proposed method may degrade with smaller training datasets, as training the policy maker for WBS relies heavily on the amount of available data. This dependency on large volumes of annotated data could be a limitation in scenarios where data are limited, such as when data are streaming or when labeled datasets are scarce or expensive to obtain.
A potential solution to these limitations could be transfer learning [43]. Specifically, the knowledge gained by an agent trained on one dataset could be transferred to other ECG signals or to different application domains with distinct classification categories, thus enabling broader applicability. To tackle the challenge of unlabeled datasets, unsupervised learning techniques [44] could be employed to adjust the policy maker online, even with limited data annotations. This would reduce the policy maker's reliance on fully labeled training datasets and extend the usability of the proposed method to real-world scenarios where labels are not readily available. Additionally, in situations with limited data samples, few-shot learning [45] could be utilized to enable the trained agent to adapt its action generation with minimal training samples, mitigating data scarcity and further enhancing the adaptability of the proposed approach.
Conclusion
This study introduces an RL-based WBS mechanism designed to enhance classification performance in DL-based ECG diagnosis. The approach enables an RL agent to dynamically select the optimal wavelet base for each ECG signal, matching its unique characteristics and thus providing more informative wavelet features for improved classification performance. Specifically, the WBS task is framed as an MDP and solved through a PG method, with specifically designed configurations for state, action, and reward. Performance evaluation on the PTB-XL dataset shows that, unlike traditional methods, which often rely on static wavelet bases selected through trial-and-error or similarity to ECG signal morphology, the proposed RLWBS framework captures finer ECG features and yields better classification performance. This highlights the potential of RL-driven wavelet selection to advance DL-based ECG diagnostic models.
References
- 1. Tsao CW, Aday AW, Almarzooq ZI, Alonso A, Beaton AZ, Bittencourt MS. Heart disease and stroke statistics—2022 update: a report from the American Heart Association. Circulation. 2022;145(8):e153–639.
- 2. Benjamin E, Muntner P, Alonso A, Bittencourt M, Callaway C, Carson A. Heart disease and stroke statistics—2019 update: a report from the American Heart Association. Circulation. 2019;139(10):e56–528.
- 3. Wang F, Casalino LP, Khullar D. Deep learning in medicine-promise, progress, and challenges. JAMA Intern Med 2019;179(3):293–4. pmid:30556825
- 4. Hussein AF, Hashim SJ, Rokhani FZ, Wan Adnan WA. An automated high-accuracy detection scheme for myocardial ischemia based on multi-lead long-interval ECG and Choi-Williams time-frequency analysis incorporating a multi-class SVM classifier. Sensors (Basel) 2021;21(7):2311. pmid:33810211
- 5. Siontis KC, Noseworthy PA, Attia ZI, Friedman PA. Artificial intelligence-enhanced electrocardiography in cardiovascular disease management. Nat Rev Cardiol 2021;18(7):465–78. pmid:33526938
- 6. Pyakillya B, Kazachenko N, Mikhailovsky N. Deep learning for ECG classification. J Phys: Conf Ser. 2017;913:012004.
- 7. Sengupta PP, Kulkarni H, Narula J. Prediction of abnormal myocardial relaxation from signal processed surface ECG. J Am Coll Cardiol 2018;71(15):1650–60. pmid:29650121
- 8. Călburean P-A, Pannone L, Monaco C, Rocca DD, Sorgente A, Almorad A, et al. Predicting and recognizing drug-induced type I Brugada pattern using ECG-based deep learning. J Am Heart Assoc 2024;13(10):e033148. pmid:38726893
- 9. Mar T, Zaunseder S, Martínez JP, Llamedo M, Poll R. Optimization of ECG classification by means of feature selection. IEEE Trans Biomed Eng. 2011;58(8). https://doi.org/10.1109/TBME.2011.2113395 pmid:21317067
- 10. Rhif M, Ben Abbes A, Farah IR, Martínez B, Sang Y. Wavelet transform application for/in non-stationary time-series analysis: a review. Appl Sci 2019;9(7):1345.
- 11. Khorrami H, Moavenian M. A comparative study of DWT, CWT and DCT transformations in ECG arrhythmias classification. Expert Systems with Applications 2010;37(8):5751–7.
- 12. Zhang D, Zhang D. Wavelet transform. Fundamentals of Image Data Mining: Analysis, Features, Classification and Retrieval. 2019:35–44.
- 13. Wang T, Lu C, Sun Y, Yang M, Liu C, Ou C. Automatic ECG classification using continuous wavelet transform and convolutional neural network. Entropy (Basel) 2021;23(1):119. pmid:33477566
- 14. Sarkaleh MK. Classification of ECG arrhythmias using discrete wavelet transform and neural networks. IJCSEA 2012;2(1):1–13.
- 15. Yildirim Ö. A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification. Comput Biol Med. 2018;96:189–202. pmid:29614430
- 16. Mazomenos EB, Biswas D, Acharyya A, Chen T, Maharatna K, Rosengarten J, et al. A low-complexity ECG feature extraction algorithm for mobile healthcare applications. IEEE J Biomed Health Inform 2013;17(2):459–69. pmid:23362250
- 17. Addison PS. Wavelet transforms and the ECG: a review. Physiol Meas. 2005;26(5):R155-99. pmid:16088052
- 18. Singh BN, Tiwari AK. Optimal selection of wavelet basis function applied to ECG signal denoising. Digital Signal Process 2006;16(3):275–87.
- 19. Chen D, Wan S, Xiang J, Bao FS. A high-performance seizure detection algorithm based on Discrete Wavelet Transform (DWT) and EEG. PLoS One 2017;12(3):e0173138. pmid:28278203
- 20. Jang YI, Sim JY, Yang J-R, Kwon NK. The optimal selection of mother wavelet function and decomposition level for denoising of DCG signal. Sensors (Basel) 2021;21(5):1851. pmid:33800862
- 21. Saxena MS, Vijay R, Pahadiya MP, Gupta MKK. Selection of wavelet basis function in denoising of ECG arrhythmias using artificial neural network. Design Engineering. 2021:1850–63.
- 22. Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: A survey. J Artif Intell Res. 1996;4:237–85.
- 23. Shani G, Heckerman D, Brafman RI, Boutilier C. An MDP-based recommender system. J Machine Learn Res. 2005;6(9).
- 24. Zhao W, Wang C, Jiang Y, Lin W. Adaptive short-time Fourier transform based on reinforcement learning. 2023 3rd International Conference on Consumer Electronics and Computer Engineering (ICCECE). 2023:733–6. https://doi.org/10.1109/iccece58074.2023.10135451
- 25. Rafiee J, Rafiee MA, Prause N, Schoen MP. Wavelet basis functions in biomedical signal processing. Expert Syst Appl 2011;38(5):6190–201.
- 26. Sutton RS, Precup D, Singh S. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artif Intell. 1999;112(1–2):181–211.
- 27. Silver D, Lever G, Heess N, Degris T, Wierstra D, Riedmiller M. Deterministic policy gradient algorithms. Proceedings of the International Conference on Machine Learning. 2014:387–95.
- 28. Pathak RS. The wavelet transform. Springer Science & Business Media. 2009;4.
- 29. Agarwal A, Kakade S, Lee J, Mahajan G. On the theory of policy gradient methods: optimality, approximation, and distribution shift. J Machine Learn Res. 2021;22(98):1–76.
- 30. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–44.
- 31. Wagner P, Strodthoff N, Bousseljot R-D, Kreiseler D, Lunze FI, Samek W, et al. PTB-XL, a large publicly available electrocardiography dataset. Sci Data 2020;7(1):154. pmid:32451379
- 32. Xiao Q, Lee K, Mokhtar SA, Ismail I, Pauzi AL bin M, Zhang Q, et al. Deep learning-based ECG arrhythmia classification: a systematic review. Appl Sci 2023;13(8):4964.
- 33. Mandala S, Pratiwi Wibowo AR, Adiwijaya S, Zahid MSM, Rizal A. The effects of Daubechies wavelet basis function (DWBF) and decomposition level on the performance of artificial intelligence-based atrial fibrillation (AF) detection based on electrocardiogram (ECG) signals. Appl Sci 2023;13(5):3036.
- 34. Phadikar S, Sinha N, Ghosh R. Automatic Eyeblink Artifact Removal From EEG Signal Using Wavelet Transform With Heuristically Optimized Threshold. IEEE J Biomed Health Inform 2021;25(2):475–84. pmid:32750902
- 35. Strodthoff N, Wagner P, Schaeffter T, Samek W. Deep Learning for ECG Analysis: Benchmarks and Insights from PTB-XL. IEEE J Biomed Health Inform 2021;25(5):1519–28. pmid:32903191
- 36. He K, Zhang Z, Zhang H, Zhang Z, Xie J, Li M. Bag of tricks for image classification with convolutional neural networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019:558–67.
- 37. Ismail Fawaz H, Lucas B, Forestier G, Pelletier C, Schmidt DF, Weber J, et al. InceptionTime: Finding AlexNet for time series classification. Data Min Knowl Disc 2020;34(6):1936–62.
- 38. Wang Z, Yan W, Oates T. Time series classification from scratch with deep neural networks: A strong baseline. 2017 International Joint Conference on Neural Networks (IJCNN). 2017:1578–85. https://doi.org/10.1109/ijcnn.2017.7966039
- 39. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997;9(8):1735–80. pmid:9377276
- 40. Zhang S, Li Y, Wang X, Gao H, Li J, Liu C. Label decoupling strategy for 12-lead ECG classification. Knowledge-Based Syst. 2023;263:110298. https://doi.org/10.1016/j.knosys.2023.110298
- 41. Qi P, Jovanovic S, Lezama J, Schweitzer P. Discrete wavelet transform optimal parameters estimation for arc fault detection in low-voltage residential power networks. Electr Power Syst Res. 2017;143:130–9. https://doi.org/10.1016/j.epsr.2016.10.008
- 42. Sang Y-F, Wang D, Wu J-C, Zhu Q-P, Wang L. Entropy-based wavelet de-noising method for time series analysis. Entropy 2009;11(4):1123–47.
- 43. Weiss K, Khoshgoftaar TM, Wang D. A survey of transfer learning. J Big Data. 2016;3(1).
- 44. Peng P, Xiang T, Wang Y, Pontil M, Gong S, Huang T, et al. Unsupervised cross-dataset transfer learning for person re-identification. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016:1306–15. https://doi.org/10.1109/cvpr.2016.146
- 45. Wang Y, Yao Q, Kwok JT, Ni LM. Generalizing from a few examples. ACM Comput Surv 2020;53(3):1–34.