A clustering-based method for single-channel fetal heart rate monitoring

Non-invasive fetal electrocardiography (ECG) is based on the acquisition of signals from abdominal surface electrodes. The composite abdominal signal consists of the maternal electrocardiogram along with the fetal electrocardiogram and other electrical interferences. These recordings allow for the acquisition of valuable and reliable information that helps ensure fetal well-being during pregnancy. This paper introduces a procedure for fetal heart rate extraction from a single-channel abdominal ECG signal. The procedure is composed of three main stages: a method based on wavelet for signal denoising, a new clustering-based methodology for detecting fetal QRS complexes, and a final stage to correct false positives and false negatives. The novelty of the procedure thus relies on using clustering techniques to classify singularities from the abdominal ECG into three types: maternal QRS complexes, fetal QRS complexes, and noise. The amplitude and time distance of all the local maxima followed by a local minimum were selected as features for the clustering classification. A wide set of real abdominal ECG recordings from two different databases, providing a large range of different characteristics, was used to illustrate the efficiency of the proposed method. The accuracy achieved shows that the proposed technique exhibits a competitve performance when compared to other recent works in the literature and a better performance over threshold-based techniques.


Introduction
The early detection of defects in the fetal heart is of paramount importance for the management of pregnancy and childbirth timing. In addition, it can help in the diagnosis of possible abnormalities in other organs. Among all fetal heart problems, heart rhythm abnormalities [1] occur in up to 2% of pregnancies and account for 10-20% of the referrals to fetal cardiologists [2]. Therefore, fetal heart rate (FHR) measurement is integral to fetal surveillance throughout pregnancy, as it is of significant clinical importance. Fetal electrocardiography [3,4] can be used for the detection of the FHR before birth and thus makes it possible to administer faster medical or surgical interventions once the baby is born, if required. Noninvasive fetal a1111111111 a1111111111 a1111111111 a1111111111 a1111111111

Threshold-based FQRS detection background
The main stages of the threshold-based, FQRS-detection method for FHR monitoring in [19] are wavelet-based preprocessing, maternal and fetal QRS detection, and false positive (FP) and false negative (FN) correction. The wavelet preprocessing method [14,19], simultaneously eliminates baseline wandering (BW) and noise using only one wavelet decomposition and reconstruction structure. The application of the wavelet preprocessing approach consists of wavelet decomposition down to L levels [14], with the approximation coefficients at level L replaced by an all-zero vector. Additionally, for each level from i = 1 to M (with M < L), the appropriate threshold limit and rule [14] are applied to the detail coefficientsṪhe wavelet reconstruction based on the zeroing approximations of level L, the modified details of levels 1 to M, and the original details of levels M + 1 to L are computed to obtain the BW-corrected and denoised AECG signal.
The maternal and fetal QRS detection is based on the localization of maternal and fetal QRS complexes by means of thresholds [19]. For maternal QRS detection, the maximum of the preprocessed signal is identified as the R-peak of greatest amplitude in the maternal QRS complexes, which is used as referecence for identifying maternal QRS complexes as a certain percentage of this maximum. For fetal QRS detection, a search for local maxima (R-peak candidates) followed by a local minimum (S-peak candidate) between each two maternal QRS complexes is carried out. Time and amplitude thresholds are also applied to this search, resulting in RS-peak candidates. Time thresholds are related to extremely low FHR, while the amplitude threshold is established within a range related to the maximum of MQRS complexes after BW and noise suppression. Candidate RS-peaks meeting these two thresholds will be stored as FQRS candidates.
The FHR monitoring finally includes a method for the detection of false negatives (FN, a non-detected FQRS complex) and false positives (FP, a false-detected FQRS complex) [19] [23]. For each FQRS candidate, several RR time distances are defined and checked, and FPs and FNs are detected according to heart rate limits [19].
Some of the steps required by the method proposed in [19] are based on the selection or calculation of thresholds. These thresholds themselves constitute a major disadvantage for these methods, as it is difficult to ensure that any selected or calculated threshold is the most appropriate for the signal under study. Thus, amplitude thresholds have to be either selected according to certain known parameters (such as pregnancy week, position of the fetus or electrode positions) or manually adapted according to the characteristics of each signal. This is usually the case when working with different public databases, as it has been found that it is difficult to establish a fixed relationship between different thresholds and signal parameters, even when the data are normalized, since thresholds also depend on unknown parameters (from the noise affecting the abdominal signal to biological characteristics). Moreover, fetal QRS complexes may be comparable in amplitude to maternal QRSs. This all is a significant barrier to automating threshold-based methods for real-time FHR monitoring. Therefore, new alternatives were evaluated for maternal and fetal QRS extraction in order to avoid these difficulties, with clustering techniques [21] being one of the best candidates. The new method uses a clustering procedure to classify certain information obtained from the denoised abdominal ECG signal into three clusters: maternal RS-peaks, fetal RS-peaks and other waves.

Fetal ECG datasets
PhysioNet [24] offers free and public web access to large collections of recorded physiologic signals (PhysioBank) and the included databases are made available under the ODC Public Domain Dedication and License v1.0 [25]. Concretely, two of these databases provided by Physionet [24] have been used for this work: • Abdominal and Direct Fetal Electrocardiogram Database [24] [26]: this database contains multichannel FECG recordings obtained from 5 different women in labor. Each recording comprises four 5-minute differential signals acquired from maternal abdomen and the reference direct fetal electrocardiogram registered from the fetal head. The recordings are sampled at 1 ksps with 16-bit resolution, and the signal bandwidth is 1-150 Hz. Moreover, the database includes a set of reference annotations indicating the fetal R-wave locations.
• Challenge 2013 Training Set A [24] [27]: these data consist of one-minute fetal ECG recordings, sampled at 1 ksps, each one including four noninvasive abdominal signals as well as the reference annotations marking fetal R-waves locations.

Clustering fundamentals
Clustering [21,22] is one of the most important unsupervised learning techniques for the classification of objects into clusters. A cluster is therefore a collection of similar objects that are dissimilar to the objects belonging to other clusters. One possible measure of relative dissimilarity is the distance D(x i , x j ) between two objects x i and x j . Therefore, given a clustering algorithm and according to the distance function, a group X of n objects, A distance function over a data group X is defined to satisfy the reflexivity, symmetry, positivity and Minkowski's inequality conditions [28]. The Minkowski distance comprises a family of metrics defined traditionally to measure distances. The Manhattan, Euclidean, and Chebyshev distances [29] are special cases of the Minkowski distance when p = 1, p = 2 and p ! 1, respectively.
A distinction among different types of clusterings is whether the set of clusters is nested or unnested. Thus, clustering may be classified as hierarchical clustering [30] or partitional clustering [31]. Partitional clustering is popular in various research fields [22] due to its capability to cluster large datasets, and this is the clustering type used in the present work. It divides the set of data objects into non-overlapping clusters, given certain criteria, with each cluster represented by its centroid. The k-means algorithm [32] is the most fundamental partitional clustering concept, the optimization criterion of which is the minimization of the Euclidean distance between elements and cluster. Inspired by k-means, several gradient algorithms for partitional clustering have been developed by researchers, for example, k-means++, which we selected as the clustering algorithm to classify the information extracted from AECG signals.

Clustering-based procedure for FHR extraction
A schematic of the new clustering-based procedure is shown in Fig 1. The procedure is mainly composed of three stages, corresponding to each frame in the figure: two of the previously developed stages, the wavelet-based preprocessing stage, the FP and FN correction stage, and the new clustering-based method for fetal QRS extraction. This new stage is fed with the denoised AECG signal, and after extracting the fetal QRS complexes, it sends them to the FP and FN correction stage. The proof of concept has been implemented using MATLAB and validated using real AECG recordings from the fetal ECG datasets mentioned above. The different steps of this clustering stage are illustrated in Fig 1 making use of the r04 Ab-2 recording of the Abdominal and Direct Fetal Electrocardiogram Database. These steps are described in detail below.
Step 1: Extraction of signal features. Distinctive features of the denoised AECG signal must be selected for the application of the clustering algorithm. This selection is crucial for the effectiveness of the clustering application. For this task, it is necessary to analyze the signal and to identify the points or zones of this signal that contain the information of interest. Our objective is to extract the FHR, which requires localizing the fetal QRS complexes. Analyzing the denoised AECG signals, it can be observed that for the identification of fetal QRS complexes, the most noticeable characteristics are the RS-peaks. From the QRS morphology, it can be observed that the RS-peak is a local maximum followed by a local minimum, as shown in Fig 1, at the extraction of signal features step. In this way, a search for local maxima followed by a local minimum in the preprocessed AECG signal is made, resulting in the max-min points. Hence, the objective of our clustering classification consists of using certain features of these max-min points to classify them adequately into three clusters. Analysing the 5-minute recordings of the training database and their main characteristics, the amplitude and time distances of these max-min points have been selected as candidate features, and two different scenarios have been considered: • Scenario 1: This scenario considers signal windows where the maternal RS amplitudes are larger than the fetal RS amplitudes. This scenario allows for the use of the amplitude distance between the max-min points as the data to be classified. Fig 2 shows a representation of the amplitude distance between the max-min points located in a set of 10,000 samples from the r08 Ab-1 denoised recording from the Abdominal and Direct Fetal Electrocardiogram Database [24], [26].  • Scenario 2: This other scenario considers signal windows where fetal RS amplitudes are similar to maternal RS amplitudes so maternal and fetal RS-peaks would likely be classified in the same cluster. It was observed from the AECG signals that the maternal RS time distance is sufficiently larger than the fetal RS time distance to allow for visual differentiation between the fetal and maternal RS-peaks. Accordingly, the amplitude distance would allow for the differentiation of maternal and fetal RS-peaks from noise and other wave information. Meanwhile, the RS time distance, or the RS number of samples, is an important parameter that allows us to distinguish between fetal and maternal QRS complexes.
After carrying out a detailed study, we found that very satisfactory classification results are achieved using the amplitude multiplied by the number of samples of max-min points as the data to be classified. Step 2: Signal feature selection. In order to automatically detect which of the scenarios detailed above corresponds to the signal or, more specifically, to the data window to be processed, we have used a procedure based on the distribution of the amplitudes of the detected max-min points: • Calculate the maximum value of the amplitudes of the max-min points, called MA.
• Divide the range from 0 to MA into 50 intervals, which was found to be an optimum partitioning over a wide range of training sets.
• Calculate the number of amplitudes that are within each interval, i.e., calculate the distribution of amplitudes.
• Normalize the distribution of amplitudes by dividing them by the total number of the maxmin points.
• Apply a smoothing filter to the normalized distribution of amplitudes. Fig 4a shows the filtered normalized distribution of amplitudes obtained for a 50,000-sample window of the denoised r01 Ab-1 AECG signal.
• Detect the local maxima and local minima of the filtered normalized distribution of amplitudes. In general, from these points, we have to identify three different cases: • Case 1: Detection of two local maxima from the first local minimum. Each of these maxima provides information about the amplitude zones corresponding to fetal RS-peaks and maternal RS-peaks. After the data training, it was concluded that if the distance between these maxima, called dmax-max, is 35% greater than the MA value, we are under Scenario 1. In this case, amplitudes are the selected data to be classified. • Case 2: Detection of two local maxima from the first local minimum, the distance between which is 35% less than the MA value. This situation can be related to Scenario 2, and thus, amplitudes multiplied by the number of samples of the max-min points are data to be classified. An example of this case is presented in Fig 4b. • Case 3: Detection of a single local maximum from the first local minimum. This situation generally corresponds to Scenario 2, as amplitudes for fetal and maternal RS-peaks are so similar that they are located in only one zone, with the located maximum being representative of this zone. For this case, amplitudes multiplied by the number of samples of the  For all other cases, the amplitude of the max-min points was selected as the data to be classified.
Step 3: Clustering classification. After having studied the features of the signal and having defined the data to be classified, it was necessary to choose the clustering algorithm to be used and its parameters. Taking into account the results obtained from a preliminary study [33], we have focused our interests on k-medoids++. We have used the kmedoids function from the Statistics and Machine Learning Toolbox in MATLAB. The main input data of this function are: • Data to be classified: these data are selected from the previous step.
• Number of clusters: this is equal to 3, as the data have to be classified into three clusters: maternal RS-peaks, fetal RS-peaks and other waves and noise.
• Distance measure: the dissimilarity between the data have to be measured using distance functions, which were introduced in the previous subsection. Taking into account that the data to be classified are unidimensional, for d = 1 Minkowski, Manhattan, Euclidean and Chebyshev distances are all defined as D(x i , x j ) = jx i − x j j. In clustering algorithms, it is common to use an alternative measure of the Euclidean distance, the squared Euclidean distance, 2 , which places progressively greater weight on objects that are farther apart. This squared Euclidean distance is not a metric but is frequently used in optimization problems in which distances only have to be compared. Thus, the squared Euclidean distance was applied for our classification.
• Number of times to repeat the clustering using new initial cluster medoid positions: this value was experimentally fixed to 20 using training sets.
• Method for choosing the initial cluster medoid positions: as the k-medoids++ clustering algorithm was chosen, this input was selected as 'plus' (++).
The kmedoids function returns a vector containing cluster indices that were used to separate these data into three clusters. The cluster corresponding to the fetal RS-peaks was identified as that with a median value situated between the median values of the other two clusters.
Step 4: Classification improvement. A final stage to improve classification results is carried out, mainly consisting of the imposition of some limits on the amplitude and time distance of the data classified as fetal RS-peaks. These limits are calculated using the local points detected from the filtered normalized distribution of amplitudes. Thus, the first graphic in Fig  5 shows the max-min points of each resulting cluster with different colors, with blue points corresponding to maternal RS-peaks and red points to fetal RS-peaks, while green points are related to other waves. The second graphic displays the denoised signal and the detected fetal RS-peaks (red points). The marked max-min point (black circle) was classified as a fetal RSpeak, but it is not fetal, as it corresponds to other waves. As can be observed in the graphic of the classification improvement step, this max-min point is eliminated from the fetal cluster, thus showing the effectiveness of this improvement step. Different recordings were used as a training set to establish the optimal value of the number of samples in this data window for the clustering classification. This study indicated that in general, from 10,000-to 60,000-sample windows, there was no noticeable difference between the effectiveness of the classification results. From this value range, we have selected 50,000-sample windows. It should be noted that Fig 5 uses a 10.000-sample window to clearly show the detected fetal R-peak. Fig 6 shows the results of k-medoids++ classification for a 50,000-sample window and the denoised signal, including the max-min points classified as fetal RS-peaks (red dots) and the annotations of the database (black circles). For this signal, the amplitude of the max-min points was the automatically selected feature for the clustering classification.   Finally, the data identified as fetal RS-peaks are ready to be processed by the FP and FN correction stage [19].

Results and discussions
The parameters selected to assess the performance of this clustering-based proposal are sensitivity, Se, positive diagnostic value, PPV, accuracy, Acc [23], and F 1 -measure [34]: where TD means true detected fetal QRS complexes, and FP and FN are false negatives and false positives, respectively, that are found after the FP and FN correction stage. To decide when an obtained FQRS complex corresponds to a true FQRS complex, a fixed criterion is set. Thus, we discard all candidates differing by more than 50 ms from the reference annotation and mark them as FPs. Thus, Se, PPV and Acc are calculated from the fetal QRS complexes extracted by the proposed method having the annotations by the specialists in the databases as references.
The Abdominal and Direct Fetal Electrocardiogram Database, which was used to train the clustering-based stage, was also used to train the complete method, since it includes R-wave markers that allow the calculation of accuracy parameters. When reviewing other works in the field [8][9][10]13], authors neglect recordings where the FECG is hardly detectable or make a selection of some recordings and channels for their studies. For this work, a medical specialist determined the clinical interest of the AECG signals for our study. The cardiology specialist thus excluded from the study some of the AECG abdominal signals where fetal heart beats were not detectable, and signals that were affected by severe artifacts, and even saturation, or were made of no clinical interest by severe noise. These excluded signals were r04 Ab-1, r07 Ab-1 and r10 Ab-3. At the same time, the rest of the database signals, all included in the study, were classified in two groups: first, those with no evident problems or artifacts (group 1), opposed to a second group of recordings presenting segments with big or repetitive artifacts or segments where it is extremely difficult to differentiate maternal from fetal PQRS complexes or to detect fetal complexes (group 2). On the other hand, the selected parameters for the wavelet preprocessing stage were wavelet function db6, M = 3, universal threshold, single rescaling and soft thresholding, with L = 7 [14,19]. Table 1 presents the results obtained for the selected recordings of this dataset, classified in the two groups discussed above. Additionally, Table 1 also includes three different statistical summarizations of results: total data for all the analysed recordings, data for only recordings in group 1 (not affected by severe artifacts), and data for the best performing channel in each recording (shown in bold characters in Table 1), since some authors select for each recording a representative channel for the statistical summarizations of results [9,10,13]. Thus, selecting the best performing channel in each recording as representative data, Se, PPV, Acc and F 1 are 98.40%, 98.86%, 97.30% and 98.63%, respectively, when the proposed clustering-based classification is applied. These performance data comprise a total of 3,182 FQRSs, of which 51 were not detected (FN 1.60%) and 36 were falsely detected as FQRS (FP 1.13%). When all the signals in Group 1 in Table 1 are considered, with a total of 6,990 FQRSs resulting in 175 FNs (2.50%) and 98 FPs (1.40%), Se, PPV, Acc and F 1 are 97.50%, 98.58%, 96.15% and 98.04%, respectively. These results validate the proposed procedure and its capabilities. Fig 8 shows an example of FHR monitoring of the r08 Ab-4 AECG signal. In the first subplot, it can be seen that the proposed method is able to detect changes in the FHR with high accuracy. The second subplot shows 20 seconds of the denoised signal, including the detected FQRS complexes (red dots) and fetal annotations (black circles).
The Challenge 2013 Training Set A was used as testing data. As for the trainig database, a cardiology specialist selected a set of recordings from this testing database. All parameters for the wavelet preprocessing stage were the same as those detailed for the Abdominal and Direct Fetal Electrocardiogram Database. The results obtained from the 64 recordings selected by the specialist are summarized in Table 2. Moreover, Table 2 also includes two different statistical summarizations of results: data for all the analysed recordings and data for the best performing channel in each recording (shown again in bold characters in Table 2). The Se, PPV, Acc and F 1 for the best performing channels are 97.93%, 99.11%, 97.08% and 98.52%, respectively. It is important to note that the processed signals include a wide range of different characteristics,  and some include artifacts and very noisy fragments that were not eliminated from the results in Table 2. Even so, the obtained accuracy corroborates the efficiency of the method. It is shown that apart from the wandering and noise, the FHR extraction method has a high accuracy. In the 7-9s interval, the classification improvement step and FP and FN correction step significantly improve the results from the clustering classification. This verifies the performance of our method in noisy scenarios of typical recordings. In addition, the 1-minute FHR monitoring of the signal a49 Ab-2 is presented in Fig 10b, where high efficiency can be also noted (100%). Moreover, the natural fluctuations in the FHR can be also noted. These results validate the proposal and provide evidence of its capabilities. The presented proposal is thus able to perform robust fetal QRS detections for a total of 55 minutes for the Abdominal and Direct Fetal Electrocardiogram Database and 64 min for the Challenge 2013 Training Set A. The obtained results can be compared to those obtained from some recent single-channel fetal ECG extraction methods, including sequential total variation denoising (STVD), extended Kalman filter (EKF), template subtraction principle component analysis (TSPCA) and total variation denoising (TVD) combined to TSPCA (TVD+TSPCA) as proposed in [35], template substraction principle component analysis (TSpca), least mean square (LMS), recursive least square (RLS) and state neural network (ESNa), as proposed in [34], template adaption (TA) and extended Kalman smoother (EKS), as proposed in [36], extended Kalman smoother (EKS) combined to differential evolution (DE) and adaptive neuro fuzzy inference system (ANFIS) (EKS+DE+ANFIS) and EKF combined to DE and ANFIS (EKF+DE+ANFIS) as proposed in [8], singular value decomposition (SDV) combined to smoothd windows (SW) (SDV+SW) as proposed in [9], and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) as proposed in [13]. Results are also compared to the threshold-based method (THR) proposed in [19]. This comparison is summarized in Table 3, where the recordings used for each method are indicated. Table 3 shows that the results obtained by the proposed FHR monitoring model exhibit improved success rates compared to those including a significant number and duration of recordings, such as STDV, EKF, TVD+TSPCA, TSPCA, TSLpca, LMS, RLS, ESNa, EKS+DE+ANFIS and EKF+DE+ANFIS methods. This good performance of the presented work compared to these approaches reaffirms its validity. The SDV+SW and CEEMDAN methods show a very high efficiency, but it must be noted that the results for the SDV+SW method are based on only 3 recordings, while results for the CEEMDAN method include only a few recordings of limited length, 30 and 40 seconds. It must be also noted that for our method, the testing database results are similar to those obtained for the training database, which shows the strength of our proposal.

Conclusions
In this paper, we propose a new method for fetal QRS extraction from single-channel abdominal ECGs of pregnant women. The proposal combines a previously developed technique for wavelet-based denoising with a new clustering-based procedure for the extraction of fetal QRS complexes, along with a final stage for FP and FN correction. This clustering procedure locates candidate RS-peaks as local maxima of the denoised signal followed by a minimum and extracts the amplitude and time distance features of these candidate peaks. These features are used for the classification of candidate peaks into three clusters: maternal RS-peaks, fetal RSpeaks and other waves. A classification improvement step based on the amplitude distribution is also applied to increase efficiency. Parameters for the classification steps were optimized through the analysis of the Abdominal and Direct Fetal ECG Database from PhysioNet. Further testing and validation of the methodology has been carried out using real-life benchmark clinical recordings in the Challenge 2013 Training Set A dataset from PhysioNet. The obtained results illustrate how the proposed method achieves high accuracy in fetal QRS extraction for both training and testing databases, comprising a total of 119 minutes of abdominal data. It must be noted the fact that, while method parameters where tuned for the training recordings, results for the testing database are also highly accurate. This improves the applicability of the proposed method when compared to the threshold-based technique proposed in [19], while also enhancing the average accuracy in the detection of fetal QRS complexes. Comparisons to other existing FHR extraction methods also corroborate the high efficiency of the proposal. Finally, the good performance of the proposed method using a single parameter set over two different databases may allow the automated use of this method, which may also be proved to be useful for FHR monitoring in real-life clinical conditions with noisy abdominal ECG recordings.