Automatic detection of the 1st (S1) and 2nd (S2) heart sounds is difficult, and existing algorithms are imprecise. We sought to develop a wavelet-based algorithm for the detection of S1 and S2 in children with and without pulmonary arterial hypertension (PAH).
Heart sounds were recorded at the second left intercostal space and the cardiac apex with a digital stethoscope simultaneously with pulmonary arterial pressure (PAP). We developed a Daubechies wavelet algorithm for the automatic detection of S1 and S2 using the wavelet coefficient ‘D6’ based on power spectral analysis. We compared our algorithm with four other Daubechies wavelet-based algorithms published by Liang, Kumar, Wang, and Zhong. We annotated S1 and S2 from an audiovisual examination of the phonocardiographic tracing by two trained cardiologists and the observation that in all subjects systole was shorter than diastole.
We studied 22 subjects (9 males and 13 females, median age 6 years, range 0.25–19). Eleven subjects had a mean PAP < 25 mmHg. Eleven subjects had PAH with a mean PAP ≥ 25 mmHg. All subjects had a pulmonary artery wedge pressure ≤ 15 mmHg. The sensitivity (SE) and positive predictivity (+P) of our algorithm were 70% and 68%, respectively. In comparison, the SE and +P of Liang were 59% and 42%, Kumar 19% and 12%, Wang 50% and 45%, and Zhong 43% and 53%, respectively. Our algorithm demonstrated robustness and outperformed the other methods up to a signal-to-noise ratio (SNR) of 10 dB. For all algorithms, detection errors arose from low-amplitude peaks, fast heart rates, low signal-to-noise ratio, and fixed thresholds.
Citation: Elgendi M, Kumar S, Guo L, Rutledge J, Coe JY, Zemp R, et al. (2015) Detection of Heart Sounds in Children with and without Pulmonary Arterial Hypertension―Daubechies Wavelets Approach. PLoS ONE 10(12): e0143146. https://doi.org/10.1371/journal.pone.0143146
Editor: Heye Zhang, Shenzhen institutes of advanced technology, CHINA
Received: June 25, 2015; Accepted: October 30, 2015; Published: December 2, 2015
Copyright: © 2015 Elgendi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: The heart sounds are human subject data and will be made available upon request from appropriate investigators after approval by their REB.
Funding: This study was supported by Cardiovascular Medical Research Education Fund for Pulmonary Hypertension, http://www.ipahresearch.org/, RES0016083, IA RZ DS. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
New digital stethoscopes provide opportunities to improve clinical cardiovascular diagnosis. In particular, the transformation of acoustic data coupled with machine learning algorithms offer the potential to develop automatic systems for auscultation-based diagnosis [1–4]. Pulmonary arterial hypertension (PAH) is difficult to diagnose clinically, and many patients complain of symptoms for 2 years before a diagnosis is made . Definitive diagnosis of PAH often requires an invasive cardiac catheterization which is not widely available and carries risk especially for children . Therefore, there would be considerable merit to a non invasive screening tool to aid in the diagnosis and appropriate referral of children with suspected to PAH to an expert center. Pulmonary arterial hypertension has characteristic heart sounds that make it an ideal disease to diagnose noninvasively by auscultation using machine learning algorithms . Prior to developing an automated diagnostic algorithm, it is important to accurately and reproducibly detect the first and second heart sounds (S1 and S2), which audibly separate systole from diastole. The accurate automated detection of S1 and S2 heart sounds remains a difficult problem in heart sound analysis [7, 8].
Previous attempts to detect S1 and S2 by signal processing have adopted two main approaches, which are differentiated by whether or not they need a simultaneous electrocardiographic signal (ECG). The ECG is used as a reference, and the timing between the QRS complex and the T wave in the cardiac cycle is exploited to identify S1 and S2 . However, in clinical practice, noisy ECG signals may make automated ECG wave detection difficult. Therefore, many researchers have tried to identify S1 and S2 without using a reference ECG signal [2, 9]. Several signal processing techniques have been attempted, such as artificial neural networks , decision trees , envelograms , quantified spectrograms , self-organizing maps using Mel frequency cepstrum coefficients , and pseudo affine Wigner–Ville distribution [15, 16]. The most common technique used in heart sound analysis is discrete wavelet transform ; in particular, Daubechies wavelet, as researchers with its reported its superiority for detecting S1 and S2 [3, 8].
As the choice of wavelet type is still debatable , we sought to focus only on investigating the four well-known Daubechies wavelet-based algorithms that claimed better performance in detecting heart sounds. During our analysis, we developed a novel event-related algorithm for the automated detection of S1 and S2 without using the ECG as a reference. In addition, we sought to determine the best algorithm for detecting S1 and S2 in children by quantitatively comparing the wavelet-based algorithms.
Materials and Methods
The Research Ethics Board of the University of Alberta approved the study. All subjects over 18 years of age gave informed and written consent to participate in the study. The parents, guardians or caretakers of subjects less than 18 years old gave informed and written consent for their children to participate in the study. Informed assent was obtained from children with sufficient neurodevelopmental ability.
Clinical Data Collection
We approached all pediatric subjects who were undergoing right heart cardiac catheterization that was required for management or investigation of their underlying cardiac condition, for inclusion in the study. We excluded subjects with congenitally abnormal aortic, pulmonary, and prosthetic valves.
The heart sounds were recorded using a 3M™ Littmann® 3200 digital stethoscope which works in conjunction with Zargis Cardioscan™ software (Zargis Medical Corp., Princeton, NJ, USA) to store recorded heart sounds in *. wav mono audio format. Heart sound recordings were obtained over 20 seconds with sampling frequencies of 4000 Hz. We recorded the heart sounds sequentially at the 2nd left intercostal space (2nd LICS) and the cardiac apex for 20 seconds. We used the MATLAB 2010b (The MathWorks, Inc., Natick, MA, USA) programming environment for signal analysis and optimization. Heart sounds were recorded simultaneously with the direct pulmonary arterial pressure (PAP) measurements obtained during right heart catheterization in a standard manner. Other hemodynamic data including heart rate, pulmonary artery wedge pressure or left atrial pressure, right atrial pressure, oxygen consumption (VO2), and systemic pressure and pulmonary blood flow were collected within 5–10 minutes of the acoustic recordings.
Pulmonary arterial hypertension (PAH) was defined as a mean pulmonary arterial pressure greater than or equal to 25 mmHg with a pulmonary artery wedge pressure less than 15 mmHg according to current guidelines . The heart sounds of subjects with PAH were compared with subjects undergoing cardiac catheterization but with a mean pulmonary arterial pressure less than 25 mmHg and a pulmonary artery wedge pressure less than 15 mmHg. The latter group comprised a control group with normal pulmonary arterial and wedge pressures. The clinical characteristics of these subjects have been described previously and are described in Tables 1–7 [1, 2].
Subjects #1–11 with Pulmonary arterial hypertension (mean pulmonary arterial pressure ≥25 mmHg).
Subjects #12–22 with normal pulmonary arterial pressures (mean pulmonary arterial pressure <25mmHg).
Subjects #1–11 with pulmonary arterial hypertension (mean pulmonary arterial pressure ≥25 mmHg).
Subjects #12–22 with normal pulmonary arterial pressures (mean pulmonary arterial pressure <25 mmHg).
We recorded the heart sounds in 22 subjects from 2 sites on the chest giving a total of 44 heart sound recordings, (11 subjects with mean PAP ≥ 25 mmHg and 11 subjects with mean PAP < 25 mmHg collected from two sites: 2nd LICS and apex), with a total of 1,178 heartbeats. Methods I, II, III, and IV do not require a training phase. However, Method V requires a training phase, thus, we trained the algorithm on heart sound signals collected at apex from subjects with mean PAP ≥ 25 mmHg—a total of 11 recordings.
We used all 44 heart sound recordings in Methods I, II, III, and IV. In Method V, we tested the algorithm on three datasets, heart sound signals collected on the chest at the 2nd LICS from subjects with mean PAP < 25 mmHg, heart sound signals collected at the 2nd LICS from subjects with mean PAP ≥ 25 mmHg, and heart sound signals collected at the cardiac apex from subjects with mean PAP < 25 mmHg—a total of 33 recordings.
Annotation of S1 and S2.
We demarcated S1 and S2 by identifying events from the acoustic recordings that were separated by intervals compatible with the relative duration of systole and diastole. Two cardiologists identified the timing of S1 and S2 independently. They listened to acoustic recordings and marked S1 and S2 on the phonocardiographic tracing. The cardiologists’ interpretations suggested that in all subjects studied, the duration of diastole was longer than systole. Thus, the events identified as S1 and S2 occurred such that the interval between S1 and S2 was shorter than the S2 to S1 interval. Within these demarcated events, we identified the maximal (positive or negative) normalized amplitude and annotated these as reference points for S1 or S2 events, respectively. We superimposed all the events containing S1 and S2 waves with time zero at the reference point of S1 and S2 events (Fig 1).
Wavelet-Based S1 and S2 Detection Algorithms.
We evaluated four Daubechies wavelet-based algorithms that are commonly used in the analysis of heart sound signals [20, 21]. Wavelets are closely related to filter banks. The wavelet transform (WT) of the collected heart sound signal x(t) is an integral transform defined by (1) where D(a, b) is known as the wavelet detail coefficient at scale and location indices (a, b) of the heart sound signal x(t), ψ* denote the complex conjugate of the wavelet function ψ(t). The transform yields a time-scale representation similar to the time-frequency representation of the short-time Fourier transform (STFT). In contrast to the STFT, the WT allows a variable time and frequency resolution for different frequency bands. The set of analyzing functions—the wavelet family ψa,b(t)—is deduced from a mother wavelet ψ(t) by: (2) a and b are the dilation (scale) and translation parameters respectively. The scale parameter a of the WT is comparable to the frequency parameter of the STFT. The mother wavelet is a short oscillation with zero mean; note that we investigated only Daubechies mother wavelets. Fig 2a shows the frequency representation of the detail coefficients for scale a = 6. The orthonormal dyadic discrete wavelets are associated with scaling functions and their dilation equations. The scaling function is associated with the smoothing of the signal and has the same form as the wavelet, given by ϕa,b(t) = 2-a/2ϕ(2-at−b). However, the convolution of the heart sound with the scaling function produces approximation coefficients as follows: (3)
Note, the sampling frequency of the heart sounds is 4000 Hz.
Fig 2b shows the frequency representation of the approximation coefficients for scale a = 6. The heart sound signal x(t) can then be represented with a combined series expansion using both the approximation coefficients and the detail coefficients, in discrete representation, as follows: (4)
It is clear that the original heart sound signal is expressed as an approximation of itself, at arbitrary scale index a0, added to a succession of signal details from scales a0 down to negative infinity.
The signal approximation at scale a is defined as while the signal details at . We can write (Eq 4) as (5) (6)
If we add the signal detail at an arbitrary scale (index a) to the approximation at that scale, we obtain the signal approximation at an increased resolution (i.e., at a smaller scale, index a−1). The time domain representations of detail and approximation coefficients are shown in Fig 3.
(a) Wavelet details for heart sounds with low S1 amplitude measured at the second intercostal space for a subject with mean PAp < 25 mmHg, (b) Wavelet approximations for the same heart sounds used in (a), (c) Wavelet details for heart sounds with low S2 amplitude measured at apex for a subject with mean PAp ≥ 25 mmHg, and (d) Wavelet approximations for the same heart sounds used in (c).
Method I: 2nd order Shannon energy.
Shannon energy emphasizes medium intensity signals and attenuates low energy more than high intensity signals. It is used to detect the heart sounds and to isolate successive cardiac cycles. Liang et al. [12, 13] first recommended the use of Shannon energy after comparing its performance to the Shannon entropy, absolute value, and energy of heart sounds. They developed a wavelet-based algorithm that detects S1 and S2 based on the architecture shown in Fig 4a. They calculated the average Shannon energy as follows: (7) where is the extracted heart sound detail D5 of ‘db6’ wavelet normalized to the maximum absolute value of the signal , and N is the number of samples in 20 ms segment. Then the normalized average Shannon energy () is computed as follows: (8) where μ is the mean value of E and σ is the standard deviation of E. An example is shown in Fig 5, the original heart sound data (Fig 5a) and the output signal (Fig 5b) when this step is applied to detect S1 and S2 followed by a threshold .
(a) Method I, (b) Method II, (c) Method III, (d) Method IV, (e) Method V.
(a) Original heart sound signal from a subject with mean pulmonary arterial pressure of 20 mmHg (b) second-order Shannon energy of D5 wavelet in Method I (c) second-order Shannon energy of D6 wavelet in Method II (d) third-order Shannon energy in Method III (e) wavelet approximation A6 in Method IV (f) generating blocks of interest in Method V.
Method II: 2nd order Shannon energy with durational thresholds.
Kumar et al. [8, 14] modified Method I by introducing multiple wavelet coefficients and durational threshold, as shown in Fig 4b. They extracted two wavelet features of the heart sounds: A5 and D5; and, therefore, Eq 8 is calculated twice: and . An example of the signal is shown in Fig 5c—when Method II applied to our data. An adaptive threshold THR1 is then introduced, based on the detail coefficients as follows (9) where μe is the mean value of and λ is a fixed value of 3. They used THR1 with to demarcate S1 and S2. The demarcated areas were selected based on two durational thresholds: (i) THR2: The duration of S1 and S2 sounds is not more than 250 ms and not less than 30 ms (ii) THR3: The time interval between S1 and S2 is less than 50 ms. Otherwise, any event that lies outside of this interval is considered a noisy segment, and, therefore, it will be discarded from further processing.
Method III: 3rd order Shannon energy with multiple wavelet coefficients.
Wang et al.  found Method I sensitive to noise and heart murmurs that may lead to false segmentation. Therefore, they investigated different wavelet features and introduced 3rd order Shannon energy to emphasize S1 and S2 and to suppress the noise and murmurs. The flowchart of this method is shown in Fig 4c; and the average 3rd order Shannon energy is calculated over 20 ms segments as (10)
The normalized average is calculated from Eq 8, followed by a threshold THR that equals the statistical mean of . However, the is calculated based on the summation of different wavelet coefficient features because the coefficients D3, D4, and D5 lead to a better detection accuracy. Therefore, they calculated the Shannon energy (Eq 9) with , which produces the signal shown in Fig 5d—when we tested Method III on our data.
Method IV: Sequential Wavelet Analysis.
The idea of detecting S1 and S2 using sequential wavelet analysis was introduced by Zhong and Scalzo . They developed a Daubechies wavelet-based algorithm that uses the ‘db5’ wavelet, not the ‘db6’ used in Methods I, II, and III. The algorithm consists of a two-stage wavelet based on the architecture shown in Fig 4d. At the first stage, the signal equals x = D3+D4+D5+D6, followed by a threshold THR1 = 0.2, and the output signal (y) of this stage is a binary signal based on the equality If x[n]<THR1,y[n] = 0; ELSE y[n] = 1. At the second stage, the signal Z equals the approximation coefficients A6—an example is shown in Fig 5e when applied to the collected heart sound—of the y signal, followed by a threshold THR2 = 0.1, and the output signal (r) of this stage is another binary signal based on the equality If z[n]<THR2, r[n] = 0; ELSE r[n] = 1.
Proposed Method (Method V: Event-Related Moving Averages).
We used a novel algorithm adapted from the framework proposed for detecting QRS complexes in ECG signals, systolic wave, a waves, and c, d, and e in acceleration of photoplethysmogram signals [24–26]. The same approach was used to detect S1 and S2 events. The method consists of three main stages: pre-processing (wavelet and squaring), feature extraction (generating potential blocks using two moving averages) and classification (thresholding). The structure of the algorithm is given in Fig 4e.
We selected the Daubechies wavelet base ‘db6’ because it is used most often in heart sound analysis. However, it was unclear which wavelet coefficients capture the S1 and S2 areas. Therefore, we compared the power spectrum of S1 and S2 waves with the power spectrum of the wavelet coefficients of the heart sounds (Fig 6). We found that most of the energy of S1 and S2 segments lay between 0–60 Hz. This frequency band is covered in the wavelet approximations A1, A2, A3, A4, or A5 (cf. Fig 2b).
A total of 284 heart beats used in this analysis for subjects with mean PAp > = and < 25 mmHg. Note, the sampling frequency of the heart sounds is 4000 Hz.
However, it is not completely covered in the wavelet details (cf. Fig 2a). The closest wavelet detail that contains the S1 and S2 energy is D6, with the frequency range of 20–60 Hz (Fig 6). Therefore, we investigated D6 as the optimal wavelet representations for S1 and S2 events; thus, in this method, x = D6.
Squaring emphasizes the large fluctuations (or magnitudes) resulting from the systolic wave, which suppress the small fluctuations arising from the diastolic wave and noise. This step results in the output (11) and improves the accuracy with which the systolic wave segment in heart sound signals is distinguished.
Generating Blocks of Interest.
Blocks of interest were generated using two event-related moving averages that demarcated the S1 and S2 areas. The first moving average (MApeak) is used to emphasize the S1 and S2 waves area—as shown as dotted signal in Fig 5f when applied to the collected data—and is given by (12)
Here, W1 represents the window size of the S1-peak or S2-peak duration. The resulting value is rounded to the nearest odd integer. The exact value for W1 of 130 ms is determined after a brute force search, which is explained in the results section. The second moving average (MAwave) emphasizes the beat area to be used as a threshold for the first moving average, shown as a dashed signal (Fig 5f), and is given by (13)
Here, W2 represents the window size of approximately one heart sound (S1 or S2) duration. Its value is rounded to the nearest odd integer. The exact value for W2 of 270 ms is explained later in the results section.
The equation that determines the offset level (α) is , where β = 0.03, is discussed later in the results section. The is the statistical mean of the squared filtered heart sound signal. The first dynamic threshold value is calculated by shifting the MAwave signal with an offset level α, as follows: (14)
In this stage, the blocks of interest are generated by comparing the MApeak signal with THR1, in accordance with lines 8–15 shown in the pseudocode of algorithm V (Table 8). Many blocks of interest will be generated, some will contain the heart sound feature (S1 and S2 waves), and others will contain primarily noise. Therefore, the next step is to reject blocks that result from noise. Rejection is based on the anticipated S1/S2-peak width. In this paper, the undesired blocks were rejected using a threshold called THR2, which rejects the blocks that contain diastolic wave and noise. By applying the THR2 threshold, the accepted blocks contain S1 and S2 waves only, (15)
The threshold THR2 corresponds to the anticipated S1/S2 block duration. If a block is wider than or equal to THR2, it is classified as S1 and S2 waves. If not, it will be classified as noise. The last stage is to find the maximum absolute value within each block to detect the S1 and S2 waves; the code lines of this step are lines 17–24 in the pseudocode of algorithm V.
The function that detects the first heart sound (S1) and the second heart sound (S2) waves has five inputs: the heart sound signal (HSsignal), event-related durations W1, W2, anticipated block width (BlockSize), and the offset (β). Daubechies 'db6' wavelet is used for filtering the signal and the wavelet detail D6 represents the heart sounds in the analysis.
Consecutive heart sounds are shown in Fig 5f to demonstrate the use of two moving averages to generate blocks of interest. Not all of the blocks contain potential S1 and S2 waves; some blocks are caused by noise and need to be eliminated. Blocks that are smaller than the expected width for the S1 and S2 wave duration are rejected. The rejected blocks are considered to be noisy blocks, and the accepted blocks are considered to contain either S1 or S2 waves.
The last step is to determine S1 and S2 from the detected waves, which correspond to code lines 25–32 in Table 8. We differentiated the detected waves to determine S1 and S2 based on the duration among them (line 25, Table 8). To distinguish S1 from S2 blocks, another threshold is used, based on the observation that, in our subjects, diastole is longer than systole. The detected S1 and S2 waves are compared to the annotated S1 and S2 waves to determine whether they were detected correctly. The search range for the true S1 and S2 waves is fixed to ± 5 ms for all algorithms, to ensure consistency of comparison.
We evaluated the performance of S1 and S2 wave-detection algorithms using two statistical measures: SE = 100×(TP/(TP + FN)) and +P = 100×(TP/(TP + FP)), where TP is the number of true positives (S1 and S2 waves detected as S1 and S2 waves), FN is the number of false negatives (S1/S2 waves which have not been detected), and FP is the number of false positives (non-S1/S2 waves detected as S1/S2 waves). The sensitivity SE reports the percentage of true beats that were correctly detected by the algorithm. The positive predictivity +P reports the percentage of beat detections that were true beats. The function of the S1 and S2 wave detector (cf. pseudocode of algorithm V) has five inputs: the heart sound signal (HSsignal), event-related durations W1 and W2, anticipated block width (BlockSize), and the offset (β). Any change in these parameters will affect the overall performance of the proposed algorithm. These parameters are interrelated and cannot be optimized in isolation. A rigorous optimization via brute-force search, over all parameters, is conducted (cf. Table 9). As discussed in the training subsection, the data used in this training phase were heart sound signals collected at the apex from subjects with a mean PAP ≥ 25 mmHg. During the optimization phase, the window size of the first moving average (W1) varies from 20 ms to 200 ms, whereas the window size of the second moving average (W2) varies from 30 ms to 400 ms. The offset α was tested over the range 0–10% of the mean value of the squared filtered heart sound signal. According to our investigation (Fig 1), the duration of S1 and S2 waves is roughly 120 ± 30 ms. The algorithm uses an optimal value of W1 (130 ms) corresponding to the duration of S1 and S2. The optimal values for the moving-average window sizes and offset are W1 = 130 ms, W2 = 270 ms, BlockSize = 60 ms, and α = 0.03 z¯ (Table 9). The S1 and S2 detection algorithm was adjusted with these optimal parameters. Then, the detector was tested on the three datasets mentioned in the training subsection without further adjustment. No adjustments were made after the detector was tested. The algorithm is fully automated and runs without the need for any human guidance.
All possible combinations of parameters (46,376 iterations) have been investigated and sorted in descending order according to their overall accuracy. The data used in this training phase was heart sounds measured at apex for all subjects with mean PAp ≥ 25 mmHg. The overall accuracy is the average value of SE and +P.
We collected recordings from 22 subjects (9 males and 13 females) with a median age of 6 years (range: 3 months to 19 years). Eleven subjects had a mean PAP < 25 mmHg (group 1) (range 8–24 mmHg). Elven subjects had a mean PAP ≥ 25 mmHg (group 2) (range 25–97 mmHg).
Table 10 demonstrates the performance of the described algorithms on the data collected with various mean PAP in terms of SE and +P. It also compares the advantages of each algorithm through the used wavelet coefficients, feature extraction, and thresholds. Method I introduced by Liang et al., which detects S1 and S2 peaks in heart sounds based on wavelet D5 feature, provides the highest detection rate after Method V. However, it is not optimal for detecting S1 and S2 peaks in heart sounds under varying conditions. Method I generates many FPs and FNs in detecting both S1 and S2 peaks after applying a fixed threshold THR, as shown in Fig 7a.
To evaluate the performance of the detectors, two statistical measures were used: SE = 100×(TP/(TP+FN))and +P = 100×(TP/(TP+FP)), where TP is the number of true positives (S1/S2 detected as S1/S2), FN is the number of false negatives (S1/S2 has not been detected), and FP is the number of false positives (non-S1/S2 detected as S1/S2).
The black circle represents the annotated S1/S2 wave, and the green star represents the detected S1/S2 wave using each algorithm. If the black circle is empty it means a false negative, while the red circle means a false positive.
Method II was the least accurate of the five algorithms, mainly because of the three thresholds. The first threshold () shifts the feature, in most of the cases, above the A5 signal as shown in Fig 7b; thus, it provided the largest number of FPs and FNs. It seems that the use of a dynamic threshold that depends on the processed heart sound is a better idea than using a fixed threshold, as used in Method I. However, the application of the THR1 was unsuccessful and needs more investigation. Method III and Method IV generally incurred the same set of errors as Method I in our study, but with a smaller number of FNs, using fixed thresholds, as shown in Fig 7c and 7d. However, the proposed algorithm (Method V) scored the highest sensitivity and positive predictivity rates among the five algorithms. Two event-related moving averages, as shown in Fig 7e, may be more efficient than the THR1 threshold introduced in Method II and the fixed thresholds introduced in Methods I, III and IV. The proposed algorithm appears to be more robust against effects of non-stationarity, fast heart rate, and low SNR. However, the algorithm failed if S1 or S2 were low amplitude. In such cases, applying a simple level threshold is not an effective approach. The proposed Method V, however, handles varied amplitudes better than the other four algorithms and may be more amplitude-independent.
The purpose of our research is to develop a non-invasive screening tool for use by inexperienced clinical staff so that PAH may be detected early and result in more timely and appropriate referral to a specialist center for further evaluation by echocardiography and cardiac catheterization.
The main findings of our investigation were that a new Daubechies-based algorithm using event-related moving averages detected S1 and S2 robustly and accurately. The developed algorithm performed better than the fixed threshold methods. Moreover, it is apparent (cf. Figs 2 and 6) that the other methods did not focus on the frequency range of S1 and S2, which considerably reduced the overall performance. The sensitivity and positive predictivity of our algorithm were 69.84% and 67.87% respectively. In comparison, the sensitivity and positive predictivity of Liang’s algorithm were 58.91% and 41.82%, Kumar’s algorithm 18.88% and 11.93%, Wang’s algorithm 49.80% and 44.86%, and Zhong’s algorithm 42.68% and 52.49%. In addition, our algorithm outperformed the discussed methods up to a signal-to-noise-ratio (SNR) of 10 dB.
Our main objective was to evaluate the robustness of the algorithms against the low SNR and high heart rates of children with a mean PAP ≥ 25 mmHg. The heart sounds analysis was difficult because heart sound amplitudes varied with time, and simple level thresholds did not optimally detect S1 and S2 waves. This reduced the algorithm detection performance. Also, heart sounds may be distorted with low and high frequencies because of breathing, muscle movements, and ambient noise; however, choosing the most appropriate wavelet coefficients may minimize these confounding effects.
All of the algorithms failed to some degree. Possible reasons for this included false-negative detection of the heart sounds because of tachycardia or the fixed search range of 5 milliseconds. Alternatively there may have been false-positive detection of the heart sounds because of a decrease in amplitude (S1 has a lower amplitude than S2 or vice versa) or a low SNR.
The robustness of the five algorithms was tested against additive white Gaussian noise contamination. All heart sounds were tested against varying signal-to-noise ratios. The white noise was added above the noise collected during the heart sound measurement. Fig 8 shows the performance of the five algorithms at varying noise levels. As expected, the performance of the algorithms degraded with a decrease in signal-to-noise ratio. However, the study showed that Method II is very sensitive to noise while the proposed algorithm (Method V) outperformed the other methods up to SNR of 10 dB. The accuracy of the reported algorithms, when applied to our clinical data, is lower than has been reported in the literature. There are many factors that might account for this including the sensor, sampling frequency, and algorithm implementation. For example Kumar et al. used a Meditron stethoscope, which has a good signal to noise ratio and an extended frequency range digitized using a 16-bit analogue to digital converter at 44.1 kHz [8, 14]. Liang et al. used a customized electronic stethoscope with 16-bit accuracy using 11025 Hz sampling frequency [7, 12, 13]. Perhaps their sensors were more robust to noise compared to ours, and, therefore, the S1 and S2 waves were clearer. In addition, our stethoscope sensor has a lower sampling frequency of 4000 Hz. One of the main reasons behind the relatively low accuracy may be the implementation of the discussed algorithms. The last step of each algorithm, which determines if the detected peak could be considered as an S1 or S2 wave has not undergone robust discussion in the literature. Based on the training and testing phases (cf. 9 and 10), the proposed algorithm performed better than the other algorithms at detecting S1 and S2 peaks in all subjects with mean PAP ≥ 25 mmHg and mean PAP < 25 mmHg at both the cardiac apex and 2nd LICS.
It is clear that the overall accuracy of Method V increases when the SNR increases compared to the other methods.
All of the algorithms we evaluated do not impose an extensive computational overhead while avoiding the manual segmentation and patient-specific modifications that are often required in biosignal analysis.
As far as we are aware the comparison of different algorithms and their performance on a single data set has not been reported.
Although our long-term goal is to non-invasively detect PAH this was not the main focus of the work described in the current manuscript. In this paper we attempted to evaluate the automatic detection of S1 and S2 in subjects with and without PAH. Our next step would be to use the heart sounds in particular the characteristics of S2 to diagnose PAH in a similar manner to clinicians who determine the presence of PAH by the behavior of the 2nd heart sound. However, an open question for future studies is to explore the minimum number of accurately identified S1 and S2 events for automatic detection of PAH from the acoustic behavior of the heart sounds.
A Daubechies-based algorithm using event-related moving averages detected S1 and S2 with robustness and accuracy. The developed algorithm performed better than the fixed threshold methods. An algorithm to detect S1 and S2 waves in heart sounds measured from children with and without pulmonary arterial hypertension has not been addressed in the literature. We have developed a robust algorithm for detecting S1 and S2 peaks in the heart sounds of children with low amplitude, non-stationary effects, and high heart rates. The algorithm was evaluated using 44 records, containing 1,178 heartbeats, with an overall sensitivity of 69.84% and positive predictivity of 67.87%. Based on our spectral analysis, we recommend the use of wavelet ‘D6’ detail for detecting S1 and S2 waves because it captured most of the energy contained within S1 and S2.
Conceived and designed the experiments: IA DS RZ ME. Performed the experiments: IA SK ME LG. Analyzed the data: IA DS ME SK LG RZ. Wrote the paper: IA DS RZ JR JC ME LG SK. Undertook cardiac catheterizations: IA JC JR.
- 1. Elgendi M, Bobhate P, Jain S, Guo L, Rutledge J, Coe Y, et al. Spectral analysis of the heart sounds in children with and without pulmonary artery hypertension. Int J Cardiol. 2014;173(1):92–9. pmid:24630336.
- 2. Elgendi M, Bobhate P, Jain AP, Guo L, Rutledge J, Coe JY, et al. Time-domain analysis of heart sound intensity in children with and without pulmonary artery hypertension: a pilot study using a digital stethoscope. Pulm Circ. 2014;4(4):685–95. pmid:25610604
- 3. El-Segaier M, Pesonen E, Lukkarinen S, Peters K, Sornmo L, Sepponen R. Detection of cardiac pathology: time intervals and spectral analysis. Acta Paediatr. 2007;96(7):1036–42. Epub 2007/05/26. APA318 [pii] pmid:17524025.
- 4. Sepehri AA, Hancq J, Dutoit T, Gharehbaghi A, Kocharian A, Kiani A. Computerized screening of children congenital heart diseases. Computer methods and programs in biomedicine. 2008;92(2):186–92. pmid:18718691.
- 5. Rich S, Dantzker DR, Ayres NA. Primary pulmonary hypertension: a national prospective study. Ann Int Med. 1987;107:216–23. pmid:3605900
- 6. Bobhate P, Guo L, Jain S, Haugen R, Coe JY, Cave D, et al. Cardiac catheterization in children with pulmonary hypertensive vascular disease. Pediatr Cardiol. 2015;36(4):873–9. pmid:25577228.
- 7. Liang H, Lukkarinen S, Hartimo I. A heart sound segmentation algorithm using wavelet decomposition and reconstruction. Proceedings—19th international Conference—iEEE/EMBS; Oct. 30—Nov. 2, 1997; Chicago, IL, USA1997. p. 1630–3.
- 8. Kumar D, Carvalho P, Antunes M, Henriques J, Eugenio L, Schmidt R, et al. Detection of S1 and S2 heart sounds by high frequency signatures. Conf Proc IEEE Eng Med Biol Soc. 2006;1:1410–6. Epub 2007/10/20. 10.1109/IEMBS.2006.260735. 17946890.
- 9. Sepehri AA, Gharehbaghi A, Dutoit T, Kocharian A, Kiani A. A novel method for pediatric heart sound segmentation without using the ECG. Computer methods and programs in biomedicine. 2010;99(1):43–8. pmid:20036439.
- 10. Hebden J, Torry J. Neural network and conventional classifiers to distinguish between first and second heart sounds. Proc IEE Colloquium on Artificial Intelligence Methods for Biomedical Data Processing; April 26th, 1996; London, UK1996. p. 3/1-3/6.
- 11. Stasis A, Loukis E, Pavlopoulos S, Koutsouris D. Using decision tree algorithms as a basis for a heart sound diagnosis decision support system. Proc 4th International IEEE EMBS Special Topic Conference on Information Technology Applications in Biomedicine2003. p. 354–7.
- 12. Liang H, Lukkarinen S, Hartimo I. Heart sound segmentation algorithm based on heart sound envelogram. Proc 4th International IEEE EMBS Special Topic Conference on Information Technology Applications in Biomedicine; Sep 7–10, 1997; Lund, Sweden1997. p. 105–8.
- 13. Liang H, Lukkarinen S, Hartimo I. A boundary modification method for heart sound segmentation algorithm. Computers in Cardiology. 1997;24:105–8.
- 14. Kumar D, Carvalho P, Antunes M, Henriques J, Eugenio L, Schmidt R, et al. A new algorithm for detection of S1 and S2 heart sounds. Proc 2006 IEEE International Conference on Acoustics, Speech and Signal Processing; May 14–19, 2006; Toulouse, France2006. p. 1180–3.
- 15. Djebbari A, Bereksi-Reguig F. Detection of the valvular split within the second heart sound using the reassigned smoothed pseudo Wigner-Ville distribution. Biomed Eng Online. 2013;12:37. pmid:23631738; PubMed Central PMCID: PMC3706289.
- 16. Gavrovska A, Bogdanovic V, Reljin I, Reljin B. Automatic heart sound detection in pediatric patients without electrocardiogram reference via pseudo-affine Wigner-Ville distribution and Haar wavelet lifting. Computer methods and programs in biomedicine. 2014;113(2):515–28. pmid:24418438.
- 17. Emmanuel BS. A review of signal processing techniques for heart sound analysis in clinical diagnosis. J Med Eng Technol. 2012;36(6):303–7. pmid:22794305.
- 18. Ning T, Hsieh KS. Automatic heart sounds detection and systolic murmur characterization using wavelet transform and AR modeling. Conf Proc IEEE Eng Med Biol Soc. 2013;2013:2555–8. pmid:24110248.
- 19. Hoeper MM, Bogaard HJ, Condliffe R, Frantz R, Khanna D, Kurzyna M, et al. Definitions and diagnosis of pulmonary hypertension. J Am Coll Cardiol. 2013;62(25 Suppl):D42–50. pmid:24355641.
- 20. Debbal SM, Bereksi-Reguig F. Computerized heart sounds analysis. Comput Biol Med. 2008;38(2):263–80. pmid:18037395.
- 21. Nigam V, Priemer R. Accessing heart dynamics to estimate durations of heart sounds. Physiol Meas. 2005;26(6):1005–18. pmid:16311448.
- 22. Wang X, Li Y, Sun C, Liping L. Detection of the first and second heart sound using heart sound energy. Proc 2nd International Conference on Biomedical Engineering and Informatics; October 17-19th, 2009; Tianjin, China2009. p. 1–4.
- 23. Zhong J, Scalzo F. Automatic heart sound signal analysis with reused multi-scale wavelet transform. Int J Eng Sci. 2013;2(7):50–7.
- 24. Elgendi M, Norton I, Brearley M, Abbott D, Schuurmans D. Detection of a and b waves in the acceleration photoplethysmogram. BioMedical Engineering OnLine. 2014;13. BioMedical Engineering OnLine 2014, 13:139 pmid:25252971
- 25. Elgendi M. Detection of c, d, and e waves in the acceleration photoplethysmogram. Computer methods and programs in biomedicine. 2014;117(2):125–36. pmid:25176597.
- 26. Elgendi M. Fast QRS detection with an optimized knowledge-based method: evaluation on 11 standard ECG databases. PloS one. 2013;8(9):e73557. pmid:24066054; PubMed Central PMCID: PMC3774726.