Adaptive wavelet base selection for deep learning-based ECG diagnosis: A reinforcement learning approach

Qiao Xiao; Chaofeng Wang

doi:10.1371/journal.pone.0318070

Abstract

Electrocardiogram (ECG) signals are crucial in diagnosing cardiovascular diseases (CVDs). While wavelet-based feature extraction has demonstrated effectiveness in deep learning (DL)-based ECG diagnosis, selecting the optimal wavelet base poses a significant challenge, as it directly influences feature quality and diagnostic accuracy. Traditional methods typically rely on fixed wavelet bases chosen heuristically or through trial-and-error, which can fail to cover the distinct characteristics of individual ECG signals, leading to suboptimal performance. To address this limitation, we propose a reinforcement learning-based wavelet base selection (RLWBS) framework that dynamically customizes the wavelet base for each ECG signal. In this framework, a reinforcement learning (RL) agent iteratively optimizes its wavelet base selection (WBS) strategy based on successive feedback of classification performance, aiming to achieve progressively optimized feature extraction. Experiments conducted on the clinically collected PTB-XL dataset for ECG abnormality classification show that the proposed RLWBS framework could obtain more detailed time-frequency representation of ECG signals, yielding enhanced diagnostic performance compared to traditional WBS approaches.

Citation: Xiao Q, Wang C (2025) Adaptive wavelet base selection for deep learning-based ECG diagnosis: A reinforcement learning approach. PLoS ONE 20(2): e0318070. https://doi.org/10.1371/journal.pone.0318070

Editor: Paul-Adrian Calburean, George Emil Palade University of Medicine, Pharmacy, Science and Technology of Targu Mures, ROMANIA

Received: November 7, 2024; Accepted: January 9, 2025; Published: February 3, 2025

Copyright: © 2025 Xiao and Wang. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data are freely available from the PTB-XL database published in PhysioNet, which is recognized by PLOS as an established repository. The URL for the data is: https://physionet.org/content/ptb-xl/1.0.3/ or 10.13026/kfzx-aw45

Funding: This study is supported by Scientific Research Fund of Hunan Provincial Education Department (21C0304). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

According to a report issued by the American Heart Association [1], cardiovascular diseases (CVDs) emerge as the leading cause of mortality worldwide, with the number of individuals affected by CVDs projected to rise to 23.6 million by 2030 [2]. The ECG is a critical physiological recording obtained via electrodes placed on the body surface that measures the heart’s electrical activity. It provides essential diagnostic information for detecting CVDs. ECG signals can reveal symptoms of heart-related pathologies, which are crucial for the prompt diagnosis and effective monitoring of CVDs. They also have the potential to facilitate rapid medical interventions for patients [3]. Therefore, the precise and efficient diagnosis of ECG signals for identifying heart diseases is instrumental in ensuring timely treatment and early intervention, which could dramatically decrease mortality rates associated with CVDs [4].

However, manual ECG diagnosis requires specialized knowledge and a significant time investment from physicians, consuming substantial medical resources and potentially causing diagnostic backlogs. Consequently, there is a growing demand for automated ECG classification technologies to address the increasing burden of CVDs [5]. To accelerate the automation of the ECG diagnostic process for clinical applications, many existing studies employ DL to directly map ECG signals to their corresponding categories [6–8]. In addition, feature engineering, which extracts significant features from ECG signals, could help further improve the classification performance and efficiency of DL-based ECG classification [9]. Among various feature extraction methods, the wavelet transform (WT) is particularly effective for extracting time-frequency information from signals and can reveal frequency variations over time, which makes it suitable for obtaining refined features from non-stationary signals like ECGs [10].

The WT is widely used in DL-based ECG classification [11] as it can utilize wavelet base functions to filter and decompose ECG signals into different sub-bands across various time scales [12]. For instance, [13] utilizes the continuous wavelet transform (CWT) to convert ECG signals into the time-frequency domain and employs convolutional neural networks (CNNs) to extract features from the time-frequency maps. It achieved an improvement in the F1 score of 4.75% to 16.85% compared to competing methods without WD for arrhythmia classification. In [14], 24 wavelet features serve as the input to a multi-layer perceptron (MLP) neural network, and a classification accuracy of 96.5% is achieved for arrhythmia detection. [15] proposes a novel deep bidirectional LSTM network that takes wavelet sequences at each decomposition level as input features for ECG classification, resulting in an accuracy of 99.39% on the MIT-BIH Arrhythmia Database. DL-based ECG classification with wavelet features maintains classification performance while benefiting from lower model complexity, making it practical for applications with limited computational and storage resources [16]. Selecting an appropriate wavelet base function is crucial for accurately capturing the characteristics of time-varying signals, as shown in Fig 1. It can be seen that the detailed time-frequency characteristics captured differ depending on the chosen wavelet base. Hence, the choice of wavelet base function is a critical pre-determined parameter for WT in DL-based ECG classification [9]. Despite the powerful capabilities of WT for feature extraction, the selection of an optimal wavelet base remains a complex and uncertain challenge [10].

Fig 1. An illustration of different wavelet bases and their corresponding wavelet features obtained from the same ECG signal. (a) Different wavelet bases. (b) Wavelet features obtained with different bases.

https://doi.org/10.1371/journal.pone.0318070.g001

To select appropriate wavelet bases, many existing studies pre-select the base according to expert experience [17]. Some studies select the optimal wavelet base according to the correlation or similarity between the wavelet base and the signals to be analyzed. For instance, [18] determines the optimal wavelet base function for ECG signal denoising by calculating the correlation coefficients between the ECG signal and different wavelet base functions. The base with the highest correlation coefficient is considered optimal. Recently, selecting the optimal wavelet base according to the performance of the targeted application has become popular. [19] proposes a cross-validation approach to select wavelet bases, where the wavelet combination that yields the highest detection performance during validation is utilized for further analysis. [20] conducts a thorough quantitative analysis to evaluate the denoising performance of 115 potential wavelet base functions (from 6 wavelet families). The optimal wavelet base is then determined from the signal-to-noise ratio (SNR) after denoising. In [21], the wavelet base that simultaneously yields the highest denoising performance and arrhythmia classification performance is considered optimal. These wavelet parameter selection methods can indeed find a wavelet base appropriate for the characteristics of most ECG signals, resulting in higher average performance. However, using a single wavelet base for all ECG signals may overlook signals that deviate from the majority, as ECG signals in different categories, especially abnormal ones, exhibit significant variations in the time-frequency domain.

In this study, we focus on a dynamic approach to determine wavelet bases for ECG signals to enhance DL-based ECG diagnosis. By selecting wavelet bases that coincide with the unique characteristics of each ECG signal, we aim to generate wavelet features that are more detailed and distinguishable. Leveraging RL [22], the wavelet base selection (WBS) process is modeled as a stateless Markov Decision Process (MDP) [23]. Here, an RL agent is trained to optimize its action, i.e., selecting wavelet bases, to maximize the reward induced by a more appropriate selection of wavelet bases. In our previous study [24], RL was successfully adopted to select the optimal parameters for the short-time Fourier transform (STFT), which inspires us to apply RL to adaptively select wavelet bases for DL-based ECG classification.

Our main contributions are as follows:

This is the first study to systematically consider selecting wavelet bases for signals within an RL framework, where an agent is trained to choose wavelet bases to obtain improved wavelet features, thereby immediately enhancing classification performance.
The wavelet base is customized for each individual ECG signal, allowing for the capture of the most significant wavelet features relevant to its respective category.
The efficacy of the proposed approach is validated in ECG abnormality classification through the clinically collected PTB-XL ECG database compared to competing methods.

Materials and methods

The overall workflow of the proposed approach is illustrated in Fig 2. First, raw ECG signals undergo preprocessing to standardize the data by scaling inputs and segmenting the signals for label aggregation. The preprocessed signals are then split into training, validation, and test datasets. An RL agent is trained to adaptively select appropriate wavelet bases for the ECG signals in the training dataset. Simultaneously, the agent’s selection strategy is guided through feedback from an evaluation network, which continuously assesses the diagnostic performance obtained with the continuous wavelet transform (CWT) of ECG signals in the evaluation dataset, using the wavelet bases provided by the agent. Finally, the trained RL agent is evaluated on the test dataset to assess the effectiveness of its WBS strategy.

Fig 2. The overall workflow of this study.

https://doi.org/10.1371/journal.pone.0318070.g002

Continuous wavelet transform

In this study, the CWT is employed to extract time-frequency features from ECG signals, as it can extract detailed characteristics of ECG signals at adjustable time-frequency resolution. Given a real-valued signal x ( t ) , the CWT of x ( t ) using a wavelet base ψ ( t ) can be formulated as

$$W(c, b) = \frac{1}{\sqrt{c}} \int_{-\infty}^{+\infty} x(t)\, \psi^{*}\!\left(\frac{t - b}{c}\right) dt \tag{1}$$

where c and b correspond to the parameters of scale and time shift, respectively, W ( c , b ) represents the wavelet coefficient at scale c and time shift b, $\psi^{*}$ denotes the complex conjugate of ψ, and the term $\frac{1}{\sqrt{c}}\,\psi\!\left(\frac{t - b}{c}\right)$ describes the wavelet base ψ ( t ) under the translational and scaling transformations. The wavelet coefficient W ( c , b ) essentially quantifies the similarity between the signal x ( t ) and the scaled and shifted wavelet function, producing features across multiple temporal regions and frequencies. This enables the capture of time-frequency characteristics in the signal at varying resolutions.

Denote $\mathbf{c} = [c_1, c_2, \dots, c_M]$ and $\mathbf{b} = [b_1, b_2, \dots, b_N]$ as the vectors containing the M and N sampled values in the scale and shift domains, respectively. The sampled version of the CWT in the time-frequency domain can be represented as

$$\mathbf{W} = \begin{bmatrix} W(c_1, b_1) & \cdots & W(c_1, b_N) \\ \vdots & \ddots & \vdots \\ W(c_M, b_1) & \cdots & W(c_M, b_N) \end{bmatrix} \tag{2}$$

where $W(c_i, b_j)$ is the ( i , j ) th element in W. The matrix W represents the discretized CWT results, capturing the time-frequency characteristics of the signal. These features, represented by W, can then be regarded as wavelet-derived features suitable for input into DL models for further analysis and classification.
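For illustration, the discretized CWT described above can be sketched in a few lines of Python. This is a minimal sketch and not the implementation used in this study: it assumes the common $1/\sqrt{c}$ normalization, approximates the integral by a Riemann sum, and uses the Mexican hat wavelet purely as an example base. The names `mexican_hat` and `cwt_matrix` are hypothetical.

```python
import math

def mexican_hat(t):
    """Mexican hat (Ricker) wavelet, used here only as an example base."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def cwt_matrix(signal, scales, shifts, wavelet=mexican_hat, dt=1.0):
    """Riemann-sum approximation of W(c, b) for every (scale, shift) pair.

    Row i of the result corresponds to scales[i], column j to shifts[j].
    """
    W = []
    for c in scales:
        norm = 1.0 / math.sqrt(c)  # assumed 1/sqrt(c) normalization
        row = []
        for b in shifts:
            acc = 0.0
            for n, x_n in enumerate(signal):
                acc += x_n * wavelet((n * dt - b) / c)
            row.append(norm * acc * dt)
        W.append(row)
    return W

# Toy input: a single narrow bump centred at sample 32 (not real ECG data).
signal = [math.exp(-0.5 * ((n - 32) / 2.0) ** 2) for n in range(64)]
scales = [1.0, 2.0, 4.0, 8.0]
shifts = list(range(64))
W = cwt_matrix(signal, scales, shifts)
```

Each row of `W` is the response at one scale, so swapping `wavelet` for another function changes the whole matrix, which is the effect that wavelet base selection exploits.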

The wavelet bases exhibit variable time-frequency characteristics, as shown in Fig 1. Commonly used wavelet bases in signal analysis include the Haar wavelet, Daubechies wavelet (dbN), Symlet wavelet (symN), Coiflet wavelet (coifN), and Biorthogonal wavelet (biorNr.Nd) [25]. The Haar wavelet is the simplest to compute but is discontinuous in the time domain. Daubechies wavelets, with their extreme phase and higher vanishing moments, are suitable for reconstructing smooth signals but are more computationally complex and asymmetric. Symlet wavelets improve upon Daubechies wavelets by offering better symmetry and reduced phase distortion. Coiflet wavelets provide high symmetry and effective frequency band partitioning. Biorthogonal wavelets introduce biorthogonality, resolving the conflict between symmetry and precise signal reconstruction.

The wavelet features obtained from the same ECG signal using different wavelet bases through CWT can vary as their time-frequency characteristics change depending on the selected wavelet base, highlighting the importance of choosing the right one. An appropriate wavelet base is crucial for extracting relevant features from ECG signals that are indicative of different diagnostic categories. This study aims to develop a systematic method for selecting the optimal wavelet base for individual ECG signals, enhancing the feature extraction capabilities of CWT and thereby improving the classification accuracy of models in distinguishing between various ECG categories.

RL basics

RL is an approach that involves learning to generate actions to maximize cumulative rewards through interaction with an environment. The structure of RL is illustrated in Fig 3. At its core, an agent interacts with the environment by taking actions based on the current system state and receiving rewards in return. The agent aims to learn the optimal action at each time step to maximize the long-term cumulative reward through continuous learning and policy improvement. The problem addressed by RL can be modeled as an MDP [26], characterized by the tuple { S , A , P , R , γ } , where:

Fig 3. Structure of reinforcement learning.

https://doi.org/10.1371/journal.pone.0318070.g003

S is the state space, representing the set of all possible system states.
A is the action space, representing the set of all possible actions.
P is the state transition probability, representing the probability distribution of transitioning from one state to another. Specifically, $P(s' \mid s, a)$ denotes the probability of transitioning to state $s'$ from state s after taking action a.
R is the reward function, representing the immediate reward received after performing an action in a given state, denoted as R ( s , a ) .
γ is the discount factor, which determines the present value of future rewards and lies within the interval [ 0 , 1 ] .

In an MDP, the policy π is a mapping function that specifies the action a to be taken in each state s, i.e., π : S → A. The goal of reinforcement learning is to find an optimal policy that maximizes the expected cumulative reward. Specifically, the objective can be expressed as maximizing the following expected discounted sum:

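$$J(\pi) = \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t)\right], \quad a_t = \pi(s_t), \quad s_{t+1} \sim P(\cdot \mid s_t, a_t) \tag{3}$$

where $a_t$ is the action given by the policy π at state $s_t$, $P(s_{t+1} \mid s_t, a_t)$ denotes the transition probability of state $s_{t+1}$ given the current state $s_t$ and action $a_t$, and γ is the discount factor used to balance short-term and long-term rewards.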
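As a small worked example of the discounted sum in the objective, the following sketch evaluates it for one finite sequence of rewards. The function name `discounted_return` and the reward values are hypothetical, and the sketch considers a single trajectory rather than the expectation.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * rewards[t], accumulated from the last step backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total

# Three unit rewards with gamma = 0.5: 1 + 0.5 + 0.25
example = discounted_return([1.0, 1.0, 1.0], 0.5)
```

With γ = 0 only the immediate reward counts, which is the relevant regime when each learning iteration is scored by a single reward.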

In this study, an agent is employed that follows a policy, taking the ECG signal as input and generating an action that selects the optimal wavelet base for the CWT of the ECG signals. The agent continuously refines its selection strategy, improving the accuracy of the ECG classification task over time.

RL-based WBS

To address the problem of selecting the appropriate wavelet base for ECG diagnosis, we propose the RL-based wavelet base selection (RLWBS) framework. This approach systematically determines the optimal wavelet base for each ECG signal, enhancing feature extraction and improving the classification performance of ECG signals. The WBS problem is formulated as an MDP within an RL framework, where an RL agent interacts with a specially designed environment to iteratively learn and refine the rationale of WBS based on the classification feedback.

RLWBS framework.

In this study, as a data-driven method, the ECG data to train the policy maker is divided into training and evaluation datasets, i.e., T and V, respectively. Then, the WBS process can be further divided into two stages as illustrated in Fig 4.

Fig 4. An illustration of the training and evaluation stages. (a) Training stage. (b) Evaluation stage.

https://doi.org/10.1371/journal.pone.0318070.g004

In the training stage, as shown in Fig 4(a), at the beginning of the tth learning iteration, a mini-batch of ECG signals is randomly sampled from the training dataset T and grouped as $\mathcal{X}_t$, i.e.,

$$\mathcal{X}_t = \{x_t^1, x_t^2, \dots, x_t^B\} \tag{4}$$

where $x_t^i$ is the ith ECG signal in a mini-batch of B ECG signals in the tth iteration.

Denote the action space as $A = \{a_1, a_2, \dots, a_{|A|}\}$, where $a_j$ is the jth candidate action and $|A|$ is the total number of candidate actions. Let $\pi_\theta$ represent the policy network with weights θ, which takes an ECG signal x as input and outputs a probability vector

$$\pi_\theta(x) = P(a \mid x) = \left[P(a_1 \mid x), P(a_2 \mid x), \dots, P(a_{|A|} \mid x)\right] \tag{5}$$

where P ( a | x ) corresponds to the probability distribution for selecting an action a conditioned on x, with $P(a_j \mid x)$ indicating the conditional probability of choosing the jth action (i.e., the jth wavelet base) on x. The action for the ECG signal $x_t^i$, denoted as $a_t^i$, is determined by sampling from the probability distribution $P(a \mid x_t^i)$, i.e.,

$$a_t^i \sim P(a \mid x_t^i) \tag{6}$$

We aggregate the actions for all signals in the mini-batch as the action for the whole system:

$$a_t = \{a_t^1, a_t^2, \dots, a_t^B\} \tag{7}$$

Once $a_t^i$ (the chosen wavelet base) is selected, the wavelet features for the ECG signal $x_t^i$ are extracted using the CWT with the selected wavelet base, denoted as $\mathbf{W}_t^i$. The resulting wavelet features, $\{\mathbf{W}_t^i\}_{i=1}^{B}$, along with their respective categories, are used to train a backbone neural network f. This trained network in the tth iteration serves as the evaluation network, assessing classification accuracy and generating a reward. This reward evaluates the effectiveness of the policy network in WBS, guiding further refinement of its strategy.
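The sampling step of the training stage can be sketched as follows. This is an illustrative sketch only: the policy outputs are represented by hand-written logits for three hypothetical candidate bases, a softmax turns them into probabilities, and one action per signal is drawn from the resulting distribution. The helper names `softmax` and `sample_actions` are assumptions, not part of the original implementation.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_actions(batch_logits, rng):
    """Draw one wavelet-base index per signal from its own distribution."""
    actions = []
    for logits in batch_logits:
        probs = softmax(logits)
        actions.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return actions

rng = random.Random(0)
# Hypothetical policy outputs for a mini-batch of two signals, three bases.
batch_logits = [[2.0, 0.5, 0.1], [0.0, 0.0, 3.0]]
a_t = sample_actions(batch_logits, rng)
```

Sampling, rather than always taking the most probable base, lets the agent keep exploring alternative bases during training.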

In the evaluation stage, as illustrated in Fig 4(b), each ECG signal in the evaluation dataset, i.e., $x^k \in V$, is input to the policy network $\pi_\theta$ to obtain its corresponding action selection probabilities $P(a \mid x^k)$. Unlike in the training stage, the action for each signal is the one with the highest output probability from the policy network, i.e.,

$$\hat{a}^k = \arg\max_{a_j \in A} P(a_j \mid x^k) \tag{8}$$

where $\hat{a}^k$ is the selected action, i.e., the wavelet base index, for the signal $x^k$. The wavelet features for the ECG signals in the evaluation dataset, $\{\hat{\mathbf{W}}^k\}_{k=1}^{|V|}$, are then obtained based on the selected actions, where | V | denotes the size of the evaluation dataset. The backbone model, $f_t$, trained in the previous stage, serves as the evaluation network, providing classification performance using the wavelet features $\{\hat{\mathbf{W}}^k\}_{k=1}^{|V|}$. The inference accuracy on the evaluation dataset, denoted as $\mathrm{Acc}_t$, reflects the effectiveness of the wavelet features generated by the policy network in training the backbone network f. The system state in this study is defined as the mini-batch of B ECG signals sampled in the tth iteration, with $x_t^i$ denoting its ith signal:

$$s_t = \mathcal{X}_t = \{x_t^1, x_t^2, \dots, x_t^B\} \tag{9}$$

Finally, the reward in the tth iteration can be described as

$$r_t = \mathrm{Acc}_t \tag{10}$$

Here, $r_t$ is influenced by the performance of the evaluation network $f_t$, which is trained on $s_t$ with wavelet bases selected by $\pi_\theta$, and further guided by the classification accuracy calculated on the evaluation dataset.
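The two ingredients of the evaluation stage, greedy action selection and an accuracy-based reward, can be sketched as below. This is a simplified illustration: the labels are single-label toy values (the study itself is multi-label), and the function names `greedy_action` and `accuracy_reward` are hypothetical.

```python
def greedy_action(probs):
    """Index of the most probable wavelet base (first one on ties)."""
    return max(range(len(probs)), key=lambda j: probs[j])

def accuracy_reward(predictions, labels):
    """Fraction of evaluation samples classified correctly."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

# Hypothetical policy output for one evaluation signal.
chosen = greedy_action([0.1, 0.7, 0.2])
# Hypothetical backbone predictions and ground-truth labels.
reward = accuracy_reward([1, 0, 1, 1], [1, 0, 0, 1])
```

Using the deterministic arg-max here means the reward reflects the current best guess of the policy rather than an exploratory draw.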

The policy network then adjusts its weights θ to generate improved actions, aiming to maximize the expected reward. This optimization is achieved using the policy gradient method [27], which allows the network to refine its WBS strategy over time. The detailed process of this adjustment using policy gradients will be described in the following subsection.

In this study, the candidate wavelet bases include the Haar, Daubechies, Biorthogonal, Coiflets, and Symlets wavelet families, as shown in Table 1. These wavelet families are selected based on a comprehensive evaluation of their properties, including compact support, orthogonality, and vanishing moments, which are crucial for effective feature extraction and classification performance [28]. Hence, the action set for each ECG signal can be defined as $A = \{\, i \mid i = 1, 2, \dots, |A| \,\}$, where i corresponds to the index of the wavelet base listed in Table 1. Each index represents a specific wavelet base that the RL agent can select for feature extraction.

Table 1. The candidate wavelet bases that can be selected by the RL agent. The wavelet base is indexed, and the number in parentheses is subsequently used to represent the corresponding wavelet base.

https://doi.org/10.1371/journal.pone.0313772.t001

Update of policy network.

In this study, the policy gradient (PG) algorithm [27] is employed to optimize the policy, aiming to maximize the probability of selecting the optimal action given a state [29]. We define the policy as:

$$\pi_\theta(a \mid s_t) = \prod_{i=1}^{B} P(a^i \mid x_t^i) \tag{11}$$

This represents the joint distribution of action selection for all ECG signals in the state $s_t$, where $a = \{a^1, \dots, a^B\}$ contains the corresponding actions for the signals in the state. Each $P(a^i \mid x_t^i)$ is the probability of selecting action $a^i$ (i.e., the wavelet base) for the ith ECG signal $x_t^i$ within the mini-batch.

According to the PG algorithm, the gradient of the sampled version of the expected cumulative reward with respect to the network weights θ based on collected state-action pairs and rewards can be calculated as

$$\nabla J(\theta) = r_t \, \nabla_\theta \log \pi_\theta(a_t \mid s_t) \tag{12}$$

where $r_t$ is the reward received in the tth iteration and $\nabla_\theta \log \pi_\theta(a_t \mid s_t)$ represents the gradient of the log-probability of selecting the action $a_t$ given state $s_t$ under the current policy network $\pi_\theta$. According to (5), (6), and (11), the gradient ∇ ⁡ J ( θ ) can be further expressed as

$$\nabla J(\theta) = r_t \sum_{i=1}^{B} \nabla_\theta \log P(a_t^i \mid x_t^i) \tag{13}$$

Finally, the learnable weights of the policy network θ can be updated by performing gradient ascent:

$$\theta \leftarrow \theta + \alpha \nabla J(\theta) \tag{14}$$

where α is the learning rate for the weights of the policy network.
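To make the update concrete, the sketch below applies one policy-gradient step to a toy linear-softmax policy. It is an assumption-laden illustration rather than the network of this study: the policy is a weight matrix `theta` (one row per candidate base) acting on a short feature vector, for which the gradient of the log-probability with respect to row j is (1 if j is the chosen action, else 0, minus the probability of j) times the input. The names `action_probs` and `reinforce_update` are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def action_probs(theta, x):
    """Linear-softmax policy: one weight row per candidate base."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in theta]
    return softmax(logits)

def reinforce_update(theta, batch, actions, reward, lr):
    """Return theta + lr * reward * sum_i grad log P(a_i | x_i)."""
    grad = [[0.0] * len(row) for row in theta]
    for x, a in zip(batch, actions):
        probs = action_probs(theta, x)
        for j, p in enumerate(probs):
            coeff = (1.0 if j == a else 0.0) - p
            for d, v in enumerate(x):
                grad[j][d] += coeff * v
    return [[w + lr * reward * g for w, g in zip(row, g_row)]
            for row, g_row in zip(theta, grad)]

theta = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]   # 3 bases, 2 toy features
batch = [[1.0, 0.0], [0.0, 1.0]]               # two toy "signals"
actions = [2, 0]                               # sampled base per signal
new_theta = reinforce_update(theta, batch, actions, reward=0.8, lr=0.5)
```

After a step with a positive reward, the sampled bases become more likely for their respective inputs, which is the mechanism by which higher accuracy reinforces the selections that produced it.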

The detailed design of the policy network is shown in Fig 5. The input signal first passes through two modules consecutively, each containing Conv2D, ReLU, and BN operations, followed by MaxPooling2D and Dropout. The corresponding output from the two modules is then flattened and passed through a linear layer with a ReLU activation function. Finally, two branches, each including a linear layer and a softmax layer, are utilized to generate the two elements in the action individually. The final output represents the probability distributions of selecting the wavelet base.

Fig 5. Structure of the policy network.

https://doi.org/10.1371/journal.pone.0318070.g005

By continuously updating the policy network using the PG algorithm, the model improves its ability to select the most effective wavelet bases for the corresponding ECG signals to exhibit better features for ECG classification.

Evaluation network.

To evaluate the effectiveness of the actions generated by the policy network and guide its learning of more appropriate wavelet bases, a deep neural network f is employed as the critic within the RL framework [30].

During the tth iteration of the training stage, the network f is trained as the backbone using the wavelet features extracted from the sampled mini-batch with the wavelet bases chosen by the policy network. The training continues until the backbone achieves a sufficient level of accuracy, at which point its generalization capability reflects the appropriateness of the actions generated by the policy network.

To further assess the actions chosen by the policy network, the prediction accuracy is measured by inputting the wavelet features from the evaluation dataset into the trained backbone network f. These wavelet features are derived from the wavelet bases corresponding to the actions selected by the policy network. This evaluation effectively quantifies the capability of the WBS mechanism of the policy network.

The accuracy serves as the reward, providing a direct assessment of the effectiveness of the wavelet bases selected by the policy network. At the end of each learning iteration, the weights of the backbone network f are reset to their initial values, ensuring that training begins from a fresh state at the start of each iteration.

Algorithm 1: RLWBS in the tth learning iteration

1: Sample a mini-batch of ECG signals from the training dataset, denoted as $\mathcal{X}_t = \{x_t^1, \dots, x_t^B\}$

2: Select actions for each ECG signal in $\mathcal{X}_t$ using the policy network $\pi_\theta$, i.e., $a_t^i \sim P(a \mid x_t^i)$

3: Perform CWT to extract wavelet features from the signals in $\mathcal{X}_t$ based on the selected actions

4: Train the backbone network f using the wavelet features and their corresponding labels

5: Apply the policy network to the evaluation dataset, obtain the wavelet bases for the ECG signals, and compute their corresponding wavelet features

6: Evaluate the prediction accuracy on the evaluation dataset using the trained network f, and use it as the reward

7: Update the policy network by performing gradient ascent to adjust its parameters θ, using Eqs. (13) and (14)

8: Reset the backbone network f to its initial state

Algorithm 1 describes the proposed RLWBS framework in the tth learning iteration. The process continues until the inference accuracy on the evaluation dataset shows only minor variations. At this point, the entire process terminates, and the policy network can be applied to ECG signals for the targeted applications.
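The control flow of Algorithm 1 can be sketched with a toy stand-in for the expensive parts. In the sketch below, which is illustrative only, the CWT, the backbone training, and the evaluation (steps 3 to 6) are replaced by a hypothetical scoring function `toy_reward` that returns the fraction of signals assigned a hidden "best" base. The policy network is replaced by a table of logits per signal type. All names, the number of bases, and the hyperparameters are assumptions.

```python
import math
import random

NUM_BASES = 3               # hypothetical number of candidate wavelet bases
BEST_BASE = {0: 2, 1: 0}    # hypothetical: signal type -> base that suits it

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_reward(signal_types, actions):
    """Stand-in for steps 3-6 (CWT, backbone training, evaluation accuracy)."""
    hits = sum(1 for s, a in zip(signal_types, actions) if BEST_BASE[s] == a)
    return hits / len(signal_types)

def rlwbs_iteration(logits, train_types, rng, batch_size=8, lr=0.5):
    """One toy iteration; logits[s] plays the role of the policy network."""
    # Step 1: sample a mini-batch from the training set.
    batch = [rng.choice(train_types) for _ in range(batch_size)]
    # Step 2: sample one action per signal from the current policy.
    probs = {s: softmax(logits[s]) for s in set(batch)}
    actions = [rng.choices(range(NUM_BASES), weights=probs[s])[0] for s in batch]
    # Steps 3-6: replaced by the toy scoring function.
    reward = toy_reward(batch, actions)
    # Step 7: gradient ascent on reward * sum_i log P(a_i | x_i),
    # using the probabilities computed before the update.
    for s, a in zip(batch, actions):
        for j in range(NUM_BASES):
            indicator = 1.0 if j == a else 0.0
            logits[s][j] += lr * reward * (indicator - probs[s][j])
    # Step 8 (backbone reset) has no counterpart: nothing is trained here.
    return reward

rng = random.Random(0)
logits = {0: [0.0] * NUM_BASES, 1: [0.0] * NUM_BASES}
train_types = [0, 1] * 50
rewards = [rlwbs_iteration(logits, train_types, rng) for _ in range(200)]
```

In a full implementation the reward would come from a freshly trained backbone evaluated on held-out data, which is far more costly per iteration than this toy score.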

Performance metrics

The classification performance of the proposed approach is evaluated using the metrics of precision, recall, sensitivity, specificity, Area Under the Curve (AUC), F1, and Matthews Correlation Coefficient (MCC), which are defined as:

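$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{15}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{16}$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{17}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{18}$$

$$\mathrm{AUC} = \int_{0}^{1} \mathrm{TPR} \; d(\mathrm{FPR}) \tag{19}$$

$$\mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{20}$$

$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{21}$$

where TP, TN, FP, and FN represent the numbers of true positive, true negative, false positive, and false negative predictions, respectively, and TPR and FPR denote the true positive rate and false positive rate.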

Additionally, to calculate the macro-averaged metrics, the corresponding metrics are first computed individually for each category. These values are then averaged, assigning equal weight to each category irrespective of its sample size, to derive the final macro-averaged metrics.
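The count-based metrics and the macro-averaging step can be illustrated with a short sketch. The confusion counts below are made-up values for two hypothetical categories, and the function names are assumptions. The sketch does not guard against zero denominators.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Count-based metrics for one category (no zero-division handling)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # identical to sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1, "mcc": mcc}

def macro_average(per_class, key):
    """Unweighted mean of one metric over all categories."""
    return sum(m[key] for m in per_class) / len(per_class)

# Hypothetical confusion counts for two categories.
per_class = [binary_metrics(tp=8, tn=8, fp=2, fn=2),
             binary_metrics(tp=5, tn=10, fp=5, fn=0)]
macro_precision = macro_average(per_class, "precision")
```

Because every category contributes equally to the mean, a rare category with poor precision lowers the macro score as much as a common one would.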

Results and discussion

In this section, we assess the effectiveness of the proposed RLWBS framework for autonomously selecting optimal wavelet bases. The proposed method is implemented in Python using the PyTorch framework. For this evaluation, we perform experiments on the publicly available PTB-XL ECG database [31]. Initially released in 2020, the dataset includes 21,799 clinical 12-lead ECG recordings from 18,869 patients, each lasting 10 seconds. This study focuses on multi-label classification across five superclass categories: normal ECG (NORM), conduction disturbance (CD), hypertrophy (HYP), myocardial infarction (MI), and ST/T change (STTC). Given that a single ECG can carry multiple labels, this creates a multi-label classification scenario. The PTB-XL dataset adheres to the inter-patient paradigm [32], ensuring that records from the same patient appear exclusively in either the training or test sets, with no overlap.
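Since one recording can belong to several superclasses, a multi-label target is commonly encoded as a multi-hot vector. The sketch below shows one such encoding over the five superclasses named above. The class order and the helper name `multi_hot` are assumptions for illustration and are not taken from the original implementation.

```python
SUPERCLASSES = ["NORM", "CD", "HYP", "MI", "STTC"]  # assumed ordering

def multi_hot(labels):
    """Encode a collection of superclass names as a 0/1 vector."""
    unknown = set(labels) - set(SUPERCLASSES)
    if unknown:
        raise ValueError(f"unknown superclass labels: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in SUPERCLASSES]

# A recording annotated with both myocardial infarction and ST/T change.
target = multi_hot(["MI", "STTC"])
```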

For baseline comparisons, we include a commonly used approach that selects the wavelet base achieving the highest classification accuracy in N-fold cross-validation, as in [33] (referred to here as CV-WBS). Additionally, we evaluate an energy and Shannon entropy-based method (referred to as EE-WBS) as used in [34], which selects the optimal wavelet base based on the energy-to-Shannon entropy ratio.
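The selection rules behind the two baselines can be sketched as follows. This is a hedged illustration rather than the exact procedures of [33] and [34]: the energy-to-Shannon-entropy ratio is computed with a base-2 logarithm over the normalized coefficient energies, which is an assumption, and the fold accuracies and function names are hypothetical.

```python
import math

def energy_to_entropy_ratio(coeffs):
    """Total energy of the coefficients divided by their Shannon entropy.

    The entropy is taken over the normalized energy distribution; it is zero
    (and the ratio undefined) when a single coefficient holds all the energy.
    """
    energies = [c * c for c in coeffs]
    total = sum(energies)
    entropy = -sum((e / total) * math.log2(e / total)
                   for e in energies if e > 0)
    return total / entropy

def select_by_cv(fold_accuracies):
    """Pick the base with the highest mean accuracy across folds."""
    return max(fold_accuracies,
               key=lambda base: sum(fold_accuracies[base]) / len(fold_accuracies[base]))

ratio_flat = energy_to_entropy_ratio([1.0, 1.0, 1.0, 1.0])
best = select_by_cv({"db4": [0.80, 0.90],
                     "sym5": [0.85, 0.95],
                     "haar": [0.70, 0.70]})
```

Both rules return a single base for the whole dataset, which is the static behaviour that per-signal selection is meant to relax.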

Performance improvement with RLWBS

We first evaluate the proposed RLWBS framework on five DL models used as benchmarks for ECG classification on the PTB-XL dataset as listed in [35], i.e., XResNet [36], Inception [37], ResNet [38], LSTM [39], and LSTM-bidir [39]. Additionally, we include the state-of-the-art model LDM-XResNet, which has demonstrated the highest classification performance in [40]. Originally developed for 1D ECG inputs, these models have been adapted to process 2D wavelet features in this study, including modifications such as substituting 1D convolutional layers with 2D convolutional layers to enable compatibility with 2D feature inputs.

Table 2 presents the macro-AUC and macro-F1 scores of the tested models on the PTB-XL dataset, comparing results across different wavelet selection methods. Models paired with the proposed RLWBS method consistently achieved higher macro-AUC and macro-F1 scores than those using CV-WBS and EE-WBS, demonstrating the efficacy of the proposed RLWBS. This result suggests that RLWBS generates more informative wavelet features, enhancing the ability of the model to differentiate between categories. In subsequent analyses, LDM-XResNet is selected as the classifier to further evaluate different WBS approaches.

Table 2. Performance comparison of various DL models using different WBS methods.

https://doi.org/10.1371/journal.pone.0313772.t002

Fig 6 presents various performance curves, including precision-recall, ROC, and calibration curves with the three WBS methods. Using the identical state-of-the-art (SOTA) classifier, the precision-recall and ROC curves achieved with RLWBS consistently outperform those achieved with CV-WBS and EE-WBS, demonstrating superior performance with RLWBS. Furthermore, as observed from the calibration curves, all three classifiers with the respective WBS mechanisms exhibit a tendency to be overconfident in their predictions, where the predicted probabilities are higher than the actual probabilities. Nevertheless, the calibration curve obtained with RLWBS aligns more closely with the ideal calibration curve compared to the others, indicating higher effectiveness and reliability of the proposed method.
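The notion of overconfidence in a calibration curve can be illustrated with a small sketch that bins predictions by predicted probability and compares the mean prediction with the observed positive rate in each bin. The number of bins, the data, and the function name `calibration_bins` are assumptions for illustration; the binning used for Fig 6 is not specified here.

```python
def calibration_bins(probs, labels, num_bins=5):
    """Group predictions into equal-width probability bins.

    Returns (mean predicted probability, observed positive rate, count)
    for every non-empty bin, in order of increasing probability.
    """
    stats = [[0.0, 0, 0] for _ in range(num_bins)]
    for p, y in zip(probs, labels):
        k = min(int(p * num_bins), num_bins - 1)  # p == 1.0 goes to the last bin
        stats[k][0] += p
        stats[k][1] += y
        stats[k][2] += 1
    return [(sp / n, sy / n, n) for sp, sy, n in stats if n > 0]

# Hypothetical predictions: four confident positives, only two of them correct.
probs = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1]
labels = [1, 1, 0, 0, 0, 0]
bins = calibration_bins(probs, labels)
```

A bin whose mean predicted probability exceeds its observed positive rate lies below the diagonal of the calibration plot, which is the overconfident pattern described above.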

Fig 6. Different performance curves.

(a), (b), and (c) are curves of precision-recall, ROC, and calibration, respectively.

https://doi.org/10.1371/journal.pone.0318070.g006

These outcomes highlight the advantage of adaptive wavelet selection in our RLWBS method over static approaches. The CV-WBS method, which selects a single wavelet base that performs best on average, may overlook unique features in subsets of ECG signals that deviate from the majority, resulting in suboptimal performance for some cases. The EE-WBS approach, which selects a wavelet base based on morphological similarity to ECG signals, considers only one aspect of wavelet characteristics, potentially missing other factors such as support and vanishing moments. In contrast, the RLWBS framework enables the policy network to iteratively optimize WBS by maximizing classification accuracy through reward-based feedback from previous iterations. This continuous learning process enhances the capability of the network to select increasingly effective wavelet bases, while the adaptive nature of RLWBS ensures that extracted features align closely with the specific characteristics of each ECG signal. This adaptability could contribute to more effective feature extraction and improved classification performance in ECG diagnosis.

Additionally, Table 3 provides a detailed comparison of precision, recall, specificity, F1, and MCC scores achieved by LDM-XResNet using different WBS methods across all the diagnostic categories. The proposed RLWBS method consistently achieves higher precision and recall, indicating its stronger ability in identifying both positive and negative cases, thereby reducing both missed detections and false alarms. Hence, the proposed RLWBS could strike a better balance between false positives and false negatives, resulting in improved F1 scores. Moreover, the proposed method demonstrates superior performance in specificity and MCC, which indicates a lower misdiagnosis rate and a higher correlation between the predictions and the ground truth. These results further validate the efficacy of the proposed method in selecting the more appropriate wavelet base, ensuring more accurate and reliable ECG classification.

Table 3. Category-wise performance comparison of different WBS methods.

https://doi.org/10.1371/journal.pone.0313772.t003

Furthermore, we compare the SOTA model, LDM-XResNet1d, which originally uses 1D ECG signals, with its modified version adapted for wavelet feature input, i.e., RLWBS+LDM-XResNet2d, in Table 4. When combined with RLWBS, the modified LDM-XResNet model achieves even higher macro-F1 scores. This improvement further demonstrates the efficacy of the proposed RLWBS framework in enhancing the inference capacity of DL models for ECG diagnosis.

Table 4. Comparison of F1 scores between the original LDM-XResNet1d and the modified LDM-XResNet2d model with RLWBS.

https://doi.org/10.1371/journal.pone.0313772.t004

Comparison of extracted wavelet features

To assess the effectiveness of the wavelet bases selected by the RLWBS framework for feature extraction, we present examples of wavelet features generated by each of the three WBS methods in Fig 7. These scalograms, produced through CWT of the same ECG signal segments, show that the time-frequency information obtained via the RLWBS framework captures finer detail and variation than the other methods, particularly in areas with rapid and complex frequency changes. This indicates that the RLWBS method provides a more detailed time-frequency representation that effectively captures the diversity of frequency components.

Fig 7. Examples of extracted features generated by different wavelet selection methods across varying time-frequency scales.

(a), (b), (c), and (d) focus on the same time-frequency regions, respectively.

https://doi.org/10.1371/journal.pone.0318070.g007
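
To make the notion of a scalogram concrete, the sketch below computes CWT coefficients by direct summation and locates the strongest response. It is an illustration only: it assumes a Mexican-hat mother wavelet as a stand-in (the db, sym, bior, and coif families discussed in this paper have no closed form), and the names `cwt`, `scalogram_peak`, and the toy signal are hypothetical.

```python
import math


def mexican_hat(t):
    """Mexican-hat (Ricker) mother wavelet, up to a constant factor."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt(signal, scales, wavelet=mexican_hat):
    """Direct-sum CWT: W[i][b] = a_i^(-1/2) * sum_n x[n] * psi((n - b) / a_i)."""
    coeffs = []
    for a in scales:
        norm = 1.0 / math.sqrt(a)
        row = []
        for b in range(len(signal)):
            acc = 0.0
            for n, x in enumerate(signal):
                acc += x * wavelet((n - b) / a)
            row.append(norm * acc)
        coeffs.append(row)
    return coeffs


def scalogram_peak(coeffs, scales):
    """Return (scale, shift) of the largest-magnitude coefficient."""
    best_value, best_scale, best_shift = -1.0, None, None
    for a, row in zip(scales, coeffs):
        for b, value in enumerate(row):
            if abs(value) > best_value:
                best_value, best_scale, best_shift = abs(value), a, b
    return best_scale, best_shift


# Toy signal: one bump shaped like the wavelet at scale 8, centred at sample 64.
N, CENTER, TRUE_SCALE = 128, 64, 8
toy = [mexican_hat((n - CENTER) / TRUE_SCALE) for n in range(N)]
SCALES = [2, 4, 8, 16]

W = cwt(toy, SCALES)
peak_scale, peak_shift = scalogram_peak(W, SCALES)
print(peak_scale, peak_shift)
```

The response is strongest where the scaled and shifted wavelet best matches the local shape of the signal, which is why the choice of wavelet base changes how much detail a scalogram reveals.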

Unlike many existing approaches [19, 33, 41, 42], which predetermine wavelet bases during the preparation stage, the RLWBS framework adaptively selects wavelet bases, providing a higher degree of freedom in capturing time-frequency features. By adjusting the wavelet base to align with the unique characteristics of each signal, RLWBS produces a clearer, more accurate depiction of signal time-frequency dynamics. This adaptability yields more detailed and category-relevant time-frequency information, which is particularly advantageous for analyzing complex signals, as it captures subtle variations and potential anomalies more effectively.

Analysis of selected wavelet bases by RLWBS

Table 5 presents the distribution of wavelet base families selected by the EE-WBS and RLWBS methods. Notably, neither method selects the Haar wavelet, as its discontinuous nature makes it unsuitable for capturing ECG signal features. Both methods also share a similar distribution pattern, with the db wavelet family chosen most frequently, followed by the sym, bior, and coif wavelet families. This pattern reflects the significance of morphological similarity between the ECG signal and the wavelet base, which is the core idea behind EE-WBS. However, while both approaches favor morphologically similar wavelets, RLWBS appears to go beyond this single criterion. Because its policy network is trained under the guidance of classification performance, RLWBS can optimize wavelet selection based not only on similarity but also on other factors that enhance feature extraction. This adaptability results in a more refined and effective wavelet-based feature extraction process, which is particularly valuable for ECG diagnosis.

Table 5. Distribution of selected wavelet base families with EE-WBS and RLWBS.

https://doi.org/10.1371/journal.pone.0313772.t005
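
A family-level distribution such as the one in Table 5 can be tallied from per-signal selections. The following sketch assumes wavelet names follow the common `db4` / `bior2.2` convention, in which the leading letters identify the family. The helper names and the example selections are hypothetical and do not reproduce the values in Table 5.

```python
import re
from collections import Counter


def wavelet_family(name):
    """Map a wavelet name such as 'db4' or 'bior2.2' to its family prefix."""
    match = re.match(r"[A-Za-z]+", name)
    if match is None:
        raise ValueError(f"unrecognised wavelet name: {name!r}")
    return match.group(0).lower()


def family_distribution(selected):
    """Return {family: fraction of signals}, most frequent family first."""
    counts = Counter(wavelet_family(name) for name in selected)
    total = sum(counts.values())
    return {family: count / total for family, count in counts.most_common()}


# Made-up per-signal selections, one wavelet base per ECG signal.
selections = ["db4", "db6", "sym5", "db8", "bior2.2", "sym4", "coif3", "db4"]
dist = family_distribution(selections)
for family, share in dist.items():
    print(f"{family}: {share:.1%}")
```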

Fig 8 illustrates the action probability distributions generated by the policy network for two distinct ECG signals. The high certainty in selecting a specific action upon training completion demonstrates the convergence of action learning, as the policy network identifies a wavelet base for each input signal with high confidence. This suggests that the network successfully detects unique patterns or features within each signal and bases its decision on them. Furthermore, the adaptivity of the policy network can be observed, as it dynamically selects wavelet bases tailored to different ECG signals. This adaptive approach aims to optimize wavelet selection on a per-signal basis, ultimately enhancing classification performance by aligning the wavelet base more closely with the characteristics of each signal.

Fig 8. Two examples of the probability distribution output from the policy network.

https://doi.org/10.1371/journal.pone.0318070.g008
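
One way to quantify the "certainty" of an action distribution like those in Fig 8 is its normalized entropy. The sketch below assumes the policy network ends in a softmax over candidate wavelet bases; the logits are made up, and `normalized_entropy` and `select_action` are hypothetical helper names rather than part of the RLWBS framework.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(K): 1 means uniform, 0 means fully certain."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


def select_action(probs):
    """Greedy choice: index of the most probable action."""
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up logits for four candidate wavelet bases.
early_logits = [0.1, 0.0, -0.1, 0.05]   # untrained policy, nearly uniform
late_logits = [6.0, 0.5, 0.2, -1.0]     # trained policy, one dominant action

early = softmax(early_logits)
late = softmax(late_logits)
print(round(normalized_entropy(early), 3), round(normalized_entropy(late), 3))
print(select_action(late))
```

A value near 1 corresponds to an undecided policy, while a value near 0 corresponds to the sharply peaked distributions described for the trained network.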

Model interpretability through GradCAM

Fig 9 illustrates where the model trained with RLWBS focuses: the blue curves show the ECG signals, and the red curves depict the attention values generated by the classifier. For abnormal categories, the model assigns high attention values to the abnormal ECG regions, effectively highlighting diagnostically relevant segments. Additionally, critical regions of normal ECG signals are carefully examined to confirm the absence of abnormal features. This highlights the capability of the classifier with RLWBS to capture distinctive features pertinent to cardiac diseases, aligning with clinical considerations and providing valuable support to healthcare practitioners in diagnosis.

Fig 9. GradCAM for different categories.

https://doi.org/10.1371/journal.pone.0318070.g009
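
The GradCAM computation behind such attention values can be sketched as follows. This is a generic illustration, not the authors' code: it assumes the feature maps of the last convolutional layer and the gradients of the class score with respect to them are already available as nested lists shaped [channels][rows][cols]. How the paper projects the 2D map onto the 1D red curves is not stated in this section, so the `time_profile` helper (averaging over the scale axis) is an assumption.

```python
def grad_cam(activations, gradients):
    """Grad-CAM map: ReLU of the gradient-weighted sum of feature maps."""
    n_rows = len(activations[0])
    n_cols = len(activations[0][0])
    # Channel weights: global average of each channel's gradients.
    weights = [
        sum(sum(row) for row in grad) / (n_rows * n_cols) for grad in gradients
    ]
    # Weighted sum over channels, clipped at zero (ReLU).
    cam = [
        [
            max(0.0, sum(w * act[i][j] for w, act in zip(weights, activations)))
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]
    # Scale to [0, 1] when the map is not identically zero.
    peak = max(max(row) for row in cam)
    if peak > 0.0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


def time_profile(cam):
    """Average over rows (scale axis) to get one attention value per time step."""
    n_rows = len(cam)
    return [sum(cam[i][j] for i in range(n_rows)) / n_rows for j in range(len(cam[0]))]


# Toy example: 2 channels, 2 scales x 3 time steps (all values made up).
acts = [
    [[1.0, 2.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 1.0, 3.0], [1.0, 0.0, 2.0]],
]
grads = [
    [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]],
    [[-1.0, -1.0, -1.0], [-1.0, -1.0, -1.0]],
]
heatmap = grad_cam(acts, grads)
profile = time_profile(heatmap)
print(heatmap, profile)
```

Channels whose gradients are negative on average lower the map, so only regions that support the predicted category keep a positive attention value.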

Limitations and future works

Compared to traditional WBS approaches, such as EE-WBS, which determine the appropriate wavelet base solely by measuring the correlation between ECG signals and wavelet bases, the proposed method requires a training dataset with fully annotated labels. These labels provide performance guidance, enabling the policy network to refine its policy generation strategy. Additionally, the efficacy of the proposed method may degrade with smaller training datasets, as training the policy maker for WBS relies heavily on the amount of available data. This dependency on large volumes of annotated data could be a limitation in scenarios where data is streaming or where labeled datasets are scarce or expensive to obtain.

A potential solution to these limitations could be transfer learning [43]. Specifically, the knowledge gained by an agent trained on one dataset could be transferred to other ECG signals or to different application domains with distinct classification categories, thus enabling broader applicability. To tackle the challenge of unlabeled datasets, unsupervised learning techniques [44] could be employed to adjust the policy maker online, even with limited data annotations. This would reduce the reliance of policy maker training on fully labeled datasets and extend the usability of the proposed method to real-world scenarios where labels are not readily available. Additionally, in situations with limited data samples, few-shot learning [45] could be utilized to enable the trained agent to adapt its action generation with minimal training samples, effectively mitigating data scarcity and further enhancing the adaptability of the proposed approach.

Conclusion

This study introduces an RL-based WBS mechanism designed to enhance classification performance in DL-based ECG diagnosis. The approach enables an RL agent to dynamically select the optimal wavelet base for each ECG signal, matching its unique characteristics and thus providing more informative wavelet features for improved classification performance. Specifically, the WBS task is framed as an MDP and solved through a PG method, with specifically designed configurations for state, action, and reward. Performance evaluation on the PTB-XL dataset shows that, unlike traditional methods, which often rely on static wavelet bases selected through trial-and-error or similarity to ECG signal morphology, the proposed RLWBS framework captures finer ECG features and yields better classification performance. This highlights the potential of RL-driven wavelet selection to advance DL-based ECG diagnostic models.

References

1. Tsao CW, Aday AW, Almarzooq ZI, Alonso A, Beaton AZ, Bittencourt MS. Heart disease and stroke statistics—2022 update: a report from the American Heart Association. Circulation. 2022;145(8):e153–639.
2. Benjamin E, Muntner P, Alonso A, Bittencourt M, Callaway C, Carson A. Heart disease and stroke statistics—2019 update: a report from the American Heart Association. Circulation. 2019;139(10):e56–528.
3. Wang F, Casalino LP, Khullar D. Deep learning in medicine-promise, progress, and challenges. JAMA Intern Med 2019;179(3):293–4. pmid:30556825
4. Hussein AF, Hashim SJ, Rokhani FZ, Wan Adnan WA. An automated high-accuracy detection scheme for myocardial ischemia based on multi-lead long-interval ECG and Choi-Williams time-frequency analysis incorporating a multi-class SVM classifier. Sensors (Basel) 2021;21(7):2311. pmid:33810211
5. Siontis KC, Noseworthy PA, Attia ZI, Friedman PA. Artificial intelligence-enhanced electrocardiography in cardiovascular disease management. Nat Rev Cardiol 2021;18(7):465–78. pmid:33526938
6. Pyakillya B, Kazachenko N, Mikhailovsky N. Deep learning for ECG classification. J Phys: Conf Ser. 2017;913:012004.
7. Sengupta PP, Kulkarni H, Narula J. Prediction of abnormal myocardial relaxation from signal processed surface ECG. J Am Coll Cardiol 2018;71(15):1650–60. pmid:29650121
8. Călburean P-A, Pannone L, Monaco C, Rocca DD, Sorgente A, Almorad A, et al. Predicting and recognizing drug-induced type i brugada pattern using ECG-based deep learning. J Am Heart Assoc 2024;13(10):e033148. pmid:38726893
9. Mar T, Zaunseder S, Martínez JP, Llamedo M, Poll R. Optimization of ECG classification by means of feature selection. IEEE Trans Biomed Eng. 2011;58(8):10.1109/TBME.2011.2113395. pmid:21317067
10. Rhif M, Ben Abbes A, Farah IR, Martínez B, Sang Y. Wavelet transform application for/in non-stationary time-series analysis: a review. Appl Sci 2019;9(7):1345.
11. Khorrami H, Moavenian M. A comparative study of DWT, CWT and DCT transformations in ECG arrhythmias classification. Expert Systems with Applications 2010;37(8):5751–7.
12. Zhang D, Zhang D. Wavelet transform. Fundamentals of Image Data Mining: Analysis, Features, Classification and Retrieval. 2019:35–44.
13. Wang T, Lu C, Sun Y, Yang M, Liu C, Ou C. Automatic ECG classification using continuous wavelet transform and convolutional neural network. Entropy (Basel) 2021;23(1):119. pmid:33477566
14. Sarkaleh MK. Classification of ECG arrhythmias using discrete wavelet transform and neural networks. IJCSEA 2012;2(1):1–13.
15. Yildirim Ö. A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification. Comput Biol Med. 2018;96:189–202. pmid:29614430
16. Mazomenos EB, Biswas D, Acharyya A, Chen T, Maharatna K, Rosengarten J, et al. A low-complexity ECG feature extraction algorithm for mobile healthcare applications. IEEE J Biomed Health Inform 2013;17(2):459–69. pmid:23362250
17. Addison PS. Wavelet transforms and the ECG: a review. Physiol Meas. 2005;26(5):R155-99. pmid:16088052
18. Singh BN, Tiwari AK. Optimal selection of wavelet basis function applied to ECG signal denoising. Digital Signal Process 2006;16(3):275–87.
19. Chen D, Wan S, Xiang J, Bao FS. A high-performance seizure detection algorithm based on Discrete Wavelet Transform (DWT) and EEG. PLoS One 2017;12(3):e0173138. pmid:28278203
20. Jang YI, Sim JY, Yang J-R, Kwon NK. The optimal selection of mother wavelet function and decomposition level for denoising of DCG signal. Sensors (Basel) 2021;21(5):1851. pmid:33800862
21. Saxena MS, Vijay R, Pahadiya MP, Gupta MKK. Selection of wavelet basis function in denoising of ECG arrhythmias using artificial neural network. Design Engineering. 2021:1850–63.
22. Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: A survey. J Artif Intell Res. 1996;4:237–85.
23. Shani G, Heckerman D, Brafman RI, Boutilier C. An MDP-based recommender system. J Machine Learn Res. 2005;6(9).
24. Zhao W, Wang C, Jiang Y, Lin W. Adaptive short-time Fourier transform based on reinforcement learning. 2023 3rd International Conference on Consumer Electronics and Computer Engineering (ICCECE). 2023:733–6. https://doi.org/10.1109/iccece58074.2023.10135451
25. Rafiee J, Rafiee MA, Prause N, Schoen MP. Wavelet basis functions in biomedical signal processing. Expert Syst Appl 2011;38(5):6190–201.
26. Sutton RS, Precup D, Singh S. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artif Intell. 1999;112(1–2):181–211.
27. Silver D, Lever G, Heess N, Degris T, Wierstra D, Riedmiller M. Deterministic policy gradient algorithms. Proceedings of the International Conference on Machine Learning. 2014:387–95.
28. Pathak RS. The wavelet transform. Springer Science & Business Media. 2009;4.
29. Agarwal A, Kakade S, Lee J, Mahajan G. On the theory of policy gradient methods: optimality, approximation, and distribution shift. J Machine Learn Res. 2021;22(98):1–76.
30. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–44.
31. Wagner P, Strodthoff N, Bousseljot R-D, Kreiseler D, Lunze FI, Samek W, et al. PTB-XL, a large publicly available electrocardiography dataset. Sci Data 2020;7(1):154. pmid:32451379
32. Xiao Q, Lee K, Mokhtar SA, Ismail I, Pauzi AL bin M, Zhang Q, et al. Deep learning-based ECG arrhythmia classification: a systematic review. Appl Sci 2023;13(8):4964.
33. Mandala S, Pratiwi Wibowo AR, Adiwijaya S, Zahid MSM, Rizal A. The effects of Daubechies wavelet basis function (DWBF) and decomposition level on the performance of artificial intelligence-based atrial fibrillation (AF) detection based on electrocardiogram (ECG) signals. Appl Sci 2023;13(5):3036.
34. Phadikar S, Sinha N, Ghosh R. Automatic Eyeblink Artifact Removal From EEG Signal Using Wavelet Transform With Heuristically Optimized Threshold. IEEE J Biomed Health Inform 2021;25(2):475–84. pmid:32750902
35. Strodthoff N, Wagner P, Schaeffter T, Samek W. Deep Learning for ECG Analysis: Benchmarks and Insights from PTB-XL. IEEE J Biomed Health Inform 2021;25(5):1519–28. pmid:32903191
36. He K, Zhang Z, Zhang H, Zhang Z, Xie J, Li M. Bag of tricks for image classification with convolutional neural networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019:558–67.
37. Ismail Fawaz H, Lucas B, Forestier G, Pelletier C, Schmidt DF, Weber J, et al. InceptionTime: Finding AlexNet for time series classification. Data Min Knowl Disc 2020;34(6):1936–62.
38. Wang Z, Yan W, Oates T. Time series classification from scratch with deep neural networks: A strong baseline. 2017 International Joint Conference on Neural Networks (IJCNN). 2017:1578–85. https://doi.org/10.1109/ijcnn.2017.7966039
39. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997;9(8):1735–80. pmid:9377276
40. Zhang S, Li Y, Wang X, Gao H, Li J, Liu C. Label decoupling strategy for 12-lead ECG classification. Knowledge-Based Syst. 2023;263:110298. https://doi.org/10.1016/j.knosys.2023.110298
41. Qi P, Jovanovic S, Lezama J, Schweitzer P. Discrete wavelet transform optimal parameters estimation for arc fault detection in low-voltage residential power networks. Electric Power Syst Res. 2017;143:130–9. https://doi.org/10.1016/j.epsr.2016.10.008
42. Sang Y-F, Wang D, Wu J-C, Zhu Q-P, Wang L. Entropy-based wavelet de-noising method for time series analysis. Entropy 2009;11(4):1123–47.
43. Weiss K, Khoshgoftaar TM, Wang D. A survey of transfer learning. J Big Data. 2016;3(1).
44. Peng P, Xiang T, Wang Y, Pontil M, Gong S, Huang T, et al. Unsupervised cross-dataset transfer learning for person re-identification. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016:1306–15. https://doi.org/10.1109/cvpr.2016.146
45. Wang Y, Yao Q, Kwok JT, Ni LM. Generalizing from a few examples. ACM Comput Surv 2020;53(3):1–34.
