A novel diversity method for smartphone camera-based heart rhythm signals in the presence of motion and noise artifacts

The advent of smartphones has advanced the use of embedded sensors to acquire various physiological information. For example, smartphone camera sensors and accelerometers can provide heart rhythm signals to the subjects, while microphones can give respiratory signals. However, the acquired smartphone-based physiological signals are more vulnerable to motion and noise artifacts (MNAs) compared to using medical devices, since subjects need to hold the smartphone with proper contact to the smartphone camera and lens stably and tightly for a duration of time without any movement in the hand or finger. This results in more MNA than traditional methods, such as placing a finger inside a tightly enclosed pulse oximeter to get PPG signals, which provides stable contact between the sensor and the subject’s finger. Moreover, a smartphone lens does not block ambient light in an effective way, while pulse oximeters are designed to block the ambient light effectively. In this paper, we propose a novel diversity method for smartphone signals that reduces the effect of MNAs during heart rhythm signal detection by 1) acquiring two heterogeneous signals from a color intensity signal and a fingertip movement signal, and 2) selecting the less MNA-corrupted signal of the two signals. The proposed method has advantages in that 1) diversity gain can be obtained from the two heterogeneous signals when one signal is clean while the other signal is corrupted, and 2) acquisition of the two heterogeneous signals does not double the acquisition procedure but maintains a single acquisition procedure, since two heterogeneous signals can be obtained from a single smartphone camera recording. In our diversity method, we propose to choose the better signal based on the signal quality indices (SQIs), i.e., standard deviation of instantaneous heart rate (STD–HR), root mean square of the successive differences of peak-to-peak time intervals (RMSSD–T), and standard deviation of peak values (STD–PV). 
As a performance metric for evaluating the proposed diversity method, the ratio of usable period is considered. Experimental results show that our diversity method increases the usable period by 6.25% and 19.53% compared to using the color intensity signal only or the fingertip movement signal only, respectively.


Introduction
Heart rhythm has been used as a significant indicator in monitoring cardiovascular healthiness e.g. cardiac arrhythmias may indicate atrial fibrillation (AF) which is correlated to the risk of stroke and heart failure [1,2]. Slow or fast heart rate in addition to heart rhythm could also point out the heart healthiness. According to the World Health Organization (WHO), heart diseases, including heart attack, stroke, heart failure, and heart valve problems, are reported as the main causes of more than 30% of all deaths around the world [3]. Since heart diseases can be asymptomatic and intermittent, especially in early stages, detecting heart diseases has been a major challenge to clinicians. Therefore, a simple heart rhythm monitoring technique (that is readily available without requiring additional electrodes/sensors) is needed for outpatient use and daily activities [2,4]. As smartphones prevail around the world, and smartphone cardiovascular apps are developed and used to monitor users' health [1,5,6], the opportunity exists to provide the medical community with a quality smartphone cardiac monitoring technology. Video camera sensors embedded in smartphones enable acquiring heart rate (or pulse rate) from users' fingertips. For example, time series signals, called smartphone photoplethysmogram (PPG), are obtained from color intensity changes [7] of successive fingertip images taken by a smartphone video camera. These signals provide physiological information including oxygen saturation, heart rate, and respiratory rate [8][9][10]. Previous studies showed that the smartphone PPG signal could be used for AF determination and discriminating AF from premature atrial contractions (PACs), premature ventricular contractions (PVCs) and normal heart rhythm [1,5,6,11]. Moreover, the advent of highly sensitive image sensors in smartphone video cameras has enabled acquisition of fingertip movement caused by heart pumping [12].
The heart rate estimated from these smartphone signals is 90% accurate when they are acquired without motion. However, there are different factors that limit accurate measurements of heart rhythm and heart rate variation when using smartphones. Some of these factors are limited sampling rate of smartphones compared to clinical devices, heating problem of the flash light in long term measurement and the experimental artifacts induced during acquisition step [13][14][15]. Since the first two factors are related to the structure of the smartphones, filtering out the experimental artifacts [also called motion and noise artifact (MNA)] is relatively more essential to overcome than the other factors. Walking, running, hand movement, and tremor are some examples of experimental conditions that produce different experimental artifacts.
To overcome the MNA, two kinds of MNA detection/reduction approaches have been proposed: hardware-based and software-based. The hardware-based MNA detection approaches measure pure MNA signals from additional MNA-focused hardware, e.g. an accelerometer [16,17], and use them to remove/reduce MNAs. However, these approaches require additional signals together with the main physiological signals. Moreover, the hardware-based approaches may cause false positive MNA detections, in which the physiological signal is clean but the hardware estimates that the signal is corrupted by MNAs. Unlike the hardware-based approaches, the software-based approaches are based on signal processing techniques/algorithms. For example, blind source separation [18][19][20][21][22][23][24][25], time- or frequency-domain parameters [26,27], and adaptive filter techniques [28] have been introduced to detect MNAs. The concept of a signal quality index (SQI) is widely adopted in MNA detection methods using time- or frequency-domain parameters since the SQI can effectively quantify the amount of MNAs in physiological signals [17][28][29][30][31][32][33]. In addition, data fusion techniques have been introduced to reduce the effect of MNAs by exploiting the diversity from multiple sensors [34][35][36][37][38][39][40]. Data fusion has been adopted for respiratory rate estimation from noisy signals measured with photoplethysmograph (PPG), impedance pneumograph (IP), arterial blood pressure (ABP) and peripheral arterial tonometry waveform (PAT) [36,37]. Different modulation sources have been applied to extract the respiratory rate from a single lead ECG [39]. The heart rate, signal quality indices, and a data fusion approach have been adopted together to reduce the effect of false alarms in the intensive care unit (ICU) [38,40].
In this paper, we propose a novel diversity method which exploits the diversity gain to obtain reliable heart rate information, i.e. to increase the ratio of the clean usable segment to be used to calculate heart rate. We do this by selecting the better signal between the two signals (color intensity or fingertip movement) based on SQIs. As a result, the proposed method will provide more usable periods compared to the non-diversity method, e.g. the color intensity signal only. We consider two different types of heterogeneous smartphone signals obtained from a single smartphone camera recording: 1) color intensity signal [7], and 2) fingertip movement signal [12]. These two acquired signals are heterogeneous since they extract different information, i.e., the color intensity signal measures blood flow change on a fingertip while the fingertip movement signal measures the subtle movement of fingertip initiated by heart pumping. To exploit the diversity from these two heterogeneous smartphone signals, the proposed method 1) first divides the smartphone signals (color intensity and fingertip movement signals) into segments, and 2) then calculates the SQIs' values of each segment. 3) Then, for each time slot, the proposed method selects the better segment between the two segments (color intensity and fingertip movement segments).

Experimental procedure
The smartphone data and PPG data which are used in this paper are acquired under a protocol approved by the Institutional Review Board (IRB) (IRB#: IRB2016-764) at the Texas Tech University. We recruited 15 healthy subjects whose ages are in the range of 18 to 80. The recruited subjects were not diagnosed with cardiovascular problems. From the recruited subjects, smartphone signals were acquired using an iPhone X. Specifically, each subject was asked to sit on a chair in a room with ambient light and place his/her fingertip on a camera lens as shown in Fig 1A. When our developed smartphone app starts, the flashlight beside the lens is turned on automatically and the smartphone camera records images.
During the measurement procedure, our smartphone app displays the image of the fingertip taken by the smartphone camera as a red rectangle at the top of the screen, as shown in Fig 1B. The images acquired by the smartphone camera are the source images for further analysis, explained in detail in the subsection Signal Acquisition Step. As shown in Fig 1B, the smartphone app also displays the acquired PPG and instantaneous heart rate. The total duration of the measurement procedure is 2 minutes. During the measurement procedure, the smartphone's camera and lens are fully covered with the subject's fingertip (see Fig 1A), while a finger of the subject's other hand is placed inside the PPG clip sensor of the NeXus 10 mark-II (see Fig 1C) [41]. To induce MNA in the smartphone signals, subjects are asked to move only the hand which holds the smartphone, in a left-right and/or up-down direction. The total duration of the movements (up-down and left-right directions) is 30 seconds. During the movement phase, the smartphone's camera and lens remain fully covered with the fingertip, while the other hand with the NeXus PPG sensor is kept in a stable position.
Our proposed diversity method for smartphone-based heart rhythm signals consists of 1) signal acquisition, 2) SQIs calculation, and 3) signal selection steps as shown in Fig 2. The detailed description of these three steps is presented in the following subsections.

Signal acquisition step
From a single smartphone video recording (see subsection Experimental Procedure), two heterogeneous types of signals are acquired in the signal acquisition step: 1) the color intensity signal [7], and 2) the fingertip movement signal [12]. A 120-second recording at a sampling rate of 30 frames per second (fps) yields 3,600 frame images (30 fps x 120 s = 3,600). As an example, four of these images, the 201st (red rectangle), 211th (purple star), 267th (blue circle), and 296th (black triangle), are shown in Fig 3A. The green channel values of the images are used to calculate the color intensity signal, while the sizes of the regions of interest (ROIs) extracted from the images are used to calculate the fingertip movement signal. Both signals are one-dimensional time-series signals (see Fig 3D) obtained from successive two-dimensional images (see Fig 3A). Therefore, the four images shown in Fig 3A map to four points on each signal: the points on the color intensity signal are directly calculated from the four average intensity values of green color, and the points on the fingertip movement signal are calculated from the four ROI sizes (see Fig 3C). In this example, a smartphone recording consisting of 3,600 successive images results in a one-dimensional color intensity signal and a one-dimensional fingertip movement signal, each of which is composed of 3,600 successive points. The detailed procedure of calculating green channel values from the original images is described in subsection 'Color Intensity Signal', while the procedure of calculating sizes of ROIs is presented in subsection 'Fingertip Movement Signal'.
Color intensity signal. The color intensity signal is derived from source images in two steps: 1) green channel image extraction, and 2) average color intensity calculation. Fig 4 shows the procedure of getting a point on a color intensity signal from a source image. Each source image (see Fig 4A) is represented in the RGB888 image format, which consists of three color channels: red (R), green (G) and blue (B) (see Fig 4B). From each pixel in a source image, the green color intensity value among R, G, and B is extracted as shown in Fig 4C. Here, the green channel is chosen since the absorption of green light in oxyhemoglobin is the most sensitive among the three colors [42]. As a result, a green channel image consisting of the green values at each pixel of the source image is obtained, as shown in Fig 4D. The average value of the pixels of the green channel image (Fig 4D) is mapped into one point of the color intensity signal. This procedure is repeated for each successive source image.
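The two steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the app's implementation: the frame layout (a list of rows of `(R, G, B)` tuples) and the function names `green_channel_mean` and `color_intensity_signal` are assumptions made for the example.

```python
def green_channel_mean(frame):
    """Average green intensity of one RGB888 frame.

    `frame` is assumed to be a list of rows, each row a list of
    (R, G, B) tuples with 8-bit values (0-255).
    """
    total = 0
    count = 0
    for row in frame:
        for _r, g, _b in row:
            total += g      # keep only the green channel of each pixel
            count += 1
    if count == 0:
        raise ValueError("frame has no pixels")
    return total / count


def color_intensity_signal(frames):
    """Map successive frames to a one-dimensional signal, one point per frame."""
    return [green_channel_mean(frame) for frame in frames]


# Two tiny 2x2 frames stand in for successive camera images.
frame_a = [[(200, 100, 50), (200, 110, 50)], [(200, 90, 50), (200, 100, 50)]]
frame_b = [[(200, 120, 50), (200, 120, 50)], [(200, 120, 50), (200, 120, 50)]]
signal = color_intensity_signal([frame_a, frame_b])
```

With a 30-fps, 120-second recording, the same mapping would produce the 3,600-point signal described in the text.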
Fingertip movement signal. The fingertip movement signal is obtained from source images by the following steps: 1) bit rearrangement, 2) edge detection, 3) smoothing, 4) binarization, and 5) ROI calculation. Fig 5 shows the bit rearrangement step, which is applied to every pixel of a source image. Using a common method, a source image represented in the three-byte RGB888 image format (see Fig 5A) is reduced to the two-byte RGB565 image format (see Fig 5B), and the RGB565 image is rearranged as shown in Fig 5C. The bit rearrangement process is applied to the original image shown in Fig 6A. After bit rearrangement, in each color, the locations of the bits near the most significant bit (MSB) are exchanged with the locations of the bits near the least significant bit (LSB). This process enhances major changes coming from the MSB, making variations visually more apparent. After this bit rearrangement procedure (see Fig 6B), images are converted into grayscale images. On the grayscale images, the edge detection, smoothing, and binarization steps are performed sequentially. In the edge detection step, an edge is detected using the differential operator method (see Fig 6C), where edges are the set of points having a larger differential value than a pre-defined threshold. Using the distance transform and anisotropic diffusion [43,44], the edges are smoothed by removing the discontinuity around the edges in the smoothing step (see Fig 6D). Then, the smoothed edges are binarized in the binarization step (see Fig 6E).
The ROI is detected from the binarized image (see Fig 6F). The ROI calculation step is as follows: the curve closest to the center of the image (see red dot) is chosen as shown in Fig 7A and 7B, and the ROI is the area under this curve, i.e., the white area in Fig 7A and 7B. The size of the ROI is calculated in each image, and the calculated value is mapped into one point in the fingertip movement signal.

SQIs calculation step
Both the color intensity signal and the fingertip movement signal are preprocessed by a high pass filter with a cutoff frequency of 0.5 Hz to focus on MNAs in calculating SQIs. A smoothing algorithm is applied to the output of the high pass filter to facilitate the acquisition of the heart rate by getting rid of small fluctuations in the signal. The following SQIs are considered to quantify signal quality in this paper: 1) standard deviation of instantaneous heart rate (STD-HR), 2) root mean square of the successive differences of peak-to-peak time intervals (RMSSD-T), and 3) standard deviation of peak values (STD-PV) [11,33,45,46].
In the SQIs calculation step, the proposed method first divides the signal into multiple segments, calculates SQIs from each segment, and decides whether the segment is corrupted or not. Here, we set the segment length to 5 seconds, which is determined in a sub-optimal way by a grid search algorithm [47]. Denoting by $T_{i,k}$ and $N_i$ the time (in seconds) of the $k$th peak in the $i$th segment and the number of peaks in the $i$th segment, respectively, each SQI is calculated as follows:
Standard deviation of instantaneous heart rate (STD-HR). The standard deviation of instantaneous heart rate $\text{STD-HR}_i$ at the $i$th segment is calculated as:

$$\text{STD-HR}_i = \sqrt{\frac{1}{N_i - 1}\sum_{k=1}^{N_i - 1}\left(HR_{i,k} - \overline{HR}_i\right)^2}$$

where $\overline{HR}_i$ is the average value of heart rate at the $i$th segment and $HR_{i,k}$ is calculated as:

$$HR_{i,k} = \frac{60}{T_{i,k+1} - T_{i,k}}$$

Since heart rate remains stable when a subject is stationary, clean segments are expected to have small STD-HR values. On the other hand, MNA-corrupted segments are expected to have larger STD-HR values due to irregular peaks caused by MNA.
Root mean square of the successive differences of peak-to-peak time intervals (RMSSD-T). The root mean square of the successive differences of peak-to-peak time intervals $\text{RMSSD-T}_i$ at the $i$th segment is calculated as:

$$\text{RMSSD-T}_i = \sqrt{\frac{1}{N_i - 2}\sum_{k=1}^{N_i - 2}\left[\left(T_{i,k+2} - T_{i,k+1}\right) - \left(T_{i,k+1} - T_{i,k}\right)\right]^2}$$

Since the peaks are irregular in MNA-corrupted segments while they are regular in clean segments, the RMSSD-T values of MNA-corrupted segments are expected to be larger than those of clean segments.
Standard deviation of peak values (STD-PV). The standard deviation of peak values $\text{STD-PV}_i$ at the $i$th segment is calculated as:

$$\text{STD-PV}_i = \sqrt{\frac{1}{N_i}\sum_{k=1}^{N_i}\left(PV_{i,k} - \overline{PV}_i\right)^2}$$

where $PV_{i,k}$ is the $k$th peak value of the $i$th segment and $\overline{PV}_i$ is the average peak value of the $i$th segment. The STD-PV values of MNA-corrupted segments are expected to be larger than those of clean segments since the amplitudes of signals may change due to MNAs while the amplitudes remain stable for clean ones.
Calculation of the STD-HR, RMSSD-T, and STD-PV requires peak detection as shown in the above equations. Here, a simple peak detection algorithm is applied to the smartphone signals. Specifically, the simple peak detection finds the peaks in the following way: 1) find a series of local maximum points where a local maximum point is defined to be a point having the larger value than its two adjacent neighboring values. 2) Among a series of local maximum points, choose the prominent peaks as output by screening out the local maximum points which have smaller difference compared to the other adjacent local maximum points. The same peak detection algorithm is applied to color intensity and fingertip movement signals. Fig 8A shows an example of simulated sinusoidal signal without noise (30s-40s) and with noise (40s-50s) period. Here, the noise is added by the simulated additive white Gaussian noise (AWGN) with SNR of -5dB. Fig 8B, 8C and 8D show corresponding SQIs' values of the signal in Fig 8A. The STD-HR, RMSSD-T, and STD-PV of the corrupted parts are shown to be larger than those of the clean parts since there exist undesired peaks during the noisy period, which makes the SQIs increase compared to the clean period ( Fig 8A).
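As a rough illustration of how the three SQIs respond to noise, the sketch below computes them for one 5-second segment. It makes several assumptions that are not taken from the paper: only step 1 of the peak detection (plain local maxima) is implemented and the prominence screening of step 2 is omitted, each deviation is normalized by the number of values, and the names `local_maxima`, `std`, and `segment_sqis` are hypothetical. The test signal is a synthetic 72-bpm sinusoid with additive Gaussian noise, not recorded data.

```python
import math
import random

FS = 30  # frames per second, matching the 30-fps recording in the text


def local_maxima(x):
    """Step 1 only: indices whose value exceeds both adjacent neighbors."""
    return [k for k in range(1, len(x) - 1) if x[k - 1] < x[k] > x[k + 1]]


def std(values):
    """Standard deviation normalized by the number of values (an assumption)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def segment_sqis(x, fs=FS):
    """Return STD-HR (bpm), RMSSD-T (s), and STD-PV for one segment."""
    peaks = local_maxima(x)
    # Peak-to-peak time intervals in seconds.
    intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    heart_rates = [60.0 / dt for dt in intervals]
    # Successive differences of the peak-to-peak intervals.
    diffs = [b - a for a, b in zip(intervals, intervals[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs)) if diffs else 0.0
    return {
        "STD-HR": std(heart_rates),
        "RMSSD-T": rmssd,
        "STD-PV": std([x[k] for k in peaks]),
    }


# One 5-second segment (150 samples) of a 72-bpm sinusoid, clean and noisy.
clean = [math.sin(2 * math.pi * 1.2 * k / FS) for k in range(5 * FS)]
rng = random.Random(0)
noisy = [v + rng.gauss(0.0, 0.5) for v in clean]

clean_sqis = segment_sqis(clean)
noisy_sqis = segment_sqis(noisy)
```

On the clean segment all three values are close to zero, while the spurious peaks in the noisy segment raise each of them, which is the behavior the SQIs rely on.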

Proposed diversity method
In the MNA detection and diversity steps, the proposed method performs on each segment, where the segment size is 150 samples (5 seconds). Specifically, the proposed method 1) first detects MNA based on SQIs' values on each segment, 2) discards the segment if both the color intensity and fingertip movement signals are detected to be corrupted, and 3) selects the better signal based on the signal quality if either of two signals is detected to be clean. We first explain the MNA detection method, and then the signal selection procedure of our diversity method. Fig 9 shows the detailed procedure of our proposed diversity method.
MNA decision method. We adopt a concept of support vector machine (SVM) to find the decision boundary between clean and MNA-corrupted segments. Since SVM belongs to supervised machine learning techniques, the decision boundary between different categories is determined not in a predefined way but in an automatic way with a training set. Calculated STD-HR, RMSSD-T, and STD-PV values in each segment are used as input of SVM.
Major votes for enhancing MNA detection. We obtain two different SVM boundaries for the color intensity and the fingertip movement signals, separately. The decisions estimated by these SVM models are enhanced by a major vote method [33], i.e., the final MNA decision on a segment is determined by the MNA decisions of the neighboring segments as well as the MNA decision of the segment itself. For example, if the i th segment is classified as clean while its neighboring segments, i.e., the (i + 1) th and (i − 1) th segments, are classified as MNA-corrupted, then the i th segment is marked as MNA-corrupted. This major vote needs to be applied right after the MNA decision procedure because short, intermittent bursts of clean periods during an MNA-corrupted phase yield inaccurate heart rate information. This inaccurate heart rate information may raise the hysteresis, and it also causes unnecessary computational burden, e.g., calculating the heart rate from a short intermittent segment. On the other hand, a corrupted burst between clean segments would give rise to wrong information if the major vote relabeled it as clean. Hence, the major vote algorithm is applied only to clean bursts between corrupted segments.
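The one-sided rule described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: decisions are assumed to be booleans (`True` meaning MNA-corrupted), neighbors are read from the original SVM decisions so that relabeling does not cascade, and the function name `major_vote` is hypothetical.

```python
def major_vote(decisions):
    """Relabel isolated clean segments that sit between corrupted neighbors.

    `decisions` is a list of booleans, True meaning MNA-corrupted.
    Corrupted segments are never relabeled as clean, and the first and
    last segments are left unchanged because they lack one neighbor.
    """
    enhanced = list(decisions)
    for i in range(1, len(decisions) - 1):
        if not decisions[i] and decisions[i - 1] and decisions[i + 1]:
            enhanced[i] = True
    return enhanced


svm_decisions = [False, True, False, True, True, False, False]
final_decisions = major_vote(svm_decisions)
```

Here the clean segment at index 2 is surrounded by corrupted segments and becomes corrupted, while the clean segments at indices 5 and 6 keep their labels.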
Diversity method. The diversity method chooses between the color intensity and the fingertip movement segments based on these enhanced decision results. The detailed procedure on each i th segment is as follows. Here, the evaluation function of SQI parameters is denoted by f(·), which is the sum of the three SQI parameters in each segment. The SQI value of the i th segment of the color intensity signal is denoted by SQI i,color , and the SQI value of the i th segment of the fingertip movement signal is denoted by SQI i,movement .
1. If both signals are estimated to be corrupted, the segment is rejected.
2. If one signal is estimated to be clean while the other signal is estimated to be corrupted, then the clean signal is selected.
3. If both signals are estimated to be clean, then the signal which has the lower SQI value at the i th segment is selected. That is, if f(STD-HR i,color , RMSSD-T i,color , STD-PV i,color )<f (STD-HR i,movement , RMSSD-T i,movement , STD-PV i,movement ), then the color intensity signal is selected. Otherwise, the fingertip movement signal is selected.
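The three rules above can be written as a short selection function. The sketch below is an illustration under stated assumptions: the function and argument names are hypothetical, the SQI triples are made-up numbers, and f is taken to be the plain sum of the three SQI values as stated in the text.

```python
def evaluate(sqis):
    """Evaluation function f: the sum of the three SQI values of a segment."""
    return sum(sqis)


def select_signal(color_corrupted, color_sqis, movement_corrupted, movement_sqis):
    """Return "color", "movement", or None when the segment is rejected."""
    if color_corrupted and movement_corrupted:
        return None                      # rule 1: both corrupted
    if color_corrupted:
        return "movement"                # rule 2: only movement is clean
    if movement_corrupted:
        return "color"                   # rule 2: only color is clean
    # rule 3: both clean, pick the lower f; ties go to the movement signal
    if evaluate(color_sqis) < evaluate(movement_sqis):
        return "color"
    return "movement"


# Hypothetical (STD-HR, RMSSD-T, STD-PV) triples for one segment.
choice = select_signal(False, (1.2, 0.02, 0.5), False, (2.0, 0.03, 0.4))
```

Because the three SQIs have different units, a plain sum is dominated by whichever SQI has the largest numeric range; the sketch follows the text and does not rescale them.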

Results
We evaluate the performance of the proposed diversity method for smartphone signals in terms of MNA detection accuracy as well as the ratio of usable (or clean) period. We compare the proposed diversity method to the non-diversity method, i.e., only from the color intensity signal or only from the fingertip movement signal. The p-values between SQIs' values of clean and corrupted segments are presented in Table 1. The results show that for all three SQIs the p-values are less than 0.05 (p < 0.05). This indicates that the SQI parameters of both signals are significantly different between the clean and corrupted segments.

Performance evaluation of MNA detection and diversity effect
To validate the accuracy of the proposed SQIs in discriminating whether the segment is usable or not, we use the following annotation procedure. Annotations are performed segment-by-segment. Denoting by avgHR i,mea and avgHR i,ref the average heart rate values for the i th segment of the smartphone and NeXus 10 mark-II signals, respectively, the i th segment of the smartphone signal is annotated as corrupted if |avgHR i,mea − avgHR i,ref | ≥ TH diffHR . Otherwise, the i th segment of the smartphone signal is annotated as clean. Here, TH diffHR is defined as 8 beats per minute (bpm). This threshold is based on the sampling rates of the smartphone and the NeXus device, which are 30 Hz and 32 Hz, respectively. The maximum value of normal heart rate is taken to be 120 bpm, at which a one-sample error in a peak-to-peak interval changes the heart rate by about 8 bpm.
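The 8-bpm figure can be checked with a few lines of arithmetic. The sketch below assumes the error is a one-sample shift in a single peak-to-peak interval at the smartphone rate of 30 Hz; the names `annotate` and `TH_DIFF_HR` are hypothetical, and the greater-than-or-equal comparison reflects one reading of the annotation rule.

```python
FS_PHONE = 30    # smartphone sampling rate in Hz, from the text
HR_MAX = 120     # maximum normal heart rate in bpm, from the text
TH_DIFF_HR = 8   # annotation threshold in bpm, from the text

# At 120 bpm one beat spans 15 samples at 30 Hz.
samples_per_beat = 60 * FS_PHONE / HR_MAX

# Heart rate when the interval is measured one sample too short or too long.
hr_one_short = 60 * FS_PHONE / (samples_per_beat - 1)
hr_one_long = 60 * FS_PHONE / (samples_per_beat + 1)
error_short = hr_one_short - HR_MAX   # about 8.6 bpm
error_long = HR_MAX - hr_one_long     # 7.5 bpm


def annotate(avg_hr_mea, avg_hr_ref, threshold=TH_DIFF_HR):
    """Label a segment by comparing smartphone and reference heart rates."""
    return "corrupted" if abs(avg_hr_mea - avg_hr_ref) >= threshold else "clean"
```

The two errors bracket 8 bpm, which is consistent with the chosen threshold. At lower heart rates the same one-sample error produces a smaller change, so 8 bpm is the worst case within the normal range.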
We adopt a support vector machine (SVM) to get a decision boundary between the clean and MNA classes. For the fingertip movement signal, the total number of segments is 359, among which the numbers of clean and corrupted segments are 255 and 104, respectively. For the color intensity signal, on the other hand, the total number of segments is 359, consisting of 286 clean segments and 73 corrupted segments. The ratio between the clean and corrupted segments is approximately 70% to 30% for the fingertip movement signal, while the ratio is approximately 79% to 21% for the color intensity signal. 5-fold validation is adopted in the training and testing stages. During the training stage, SQIs' values of clean and corrupted segments are used as input training data and the corresponding annotations are used as labels for the input training data. During the test stage, the SQIs' values of the unknown segments are used as input test data, and the accuracy is calculated using the following equation by comparing the SVM's estimation on each segment to the corresponding annotation:

$$\text{Accuracy} = \frac{N_{tp} + N_{tn}}{N_{tp} + N_{tn} + N_{fp} + N_{fn}}$$

where $N_{tp}$, $N_{tn}$, $N_{fp}$, and $N_{fn}$ are the numbers of true positive, true negative, false positive and false negative segments, respectively. On the other hand, the proposed diversity method is evaluated in terms of the usable period ratio, which is defined as:

$$\text{Usable period ratio} = \frac{\text{duration of usable (clean) segments}}{\text{total duration of all segments}} \times 100\%$$

We evaluate the performance of the proposed diversity method by comparing its usable period ratio to that of the non-diversity methods.
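Since the accuracy counts the segments whose SVM decision matches the annotation (true positives plus true negatives) over all segments, it can be computed directly from two label lists. The sketch below is illustrative only: the function name `accuracy` and the example labels are hypothetical and do not come from the study data.

```python
def accuracy(annotations, decisions):
    """Fraction of segments whose decision matches its annotation.

    Matching segments are the true positives plus the true negatives,
    and the denominator is the total number of segments.
    """
    if len(annotations) != len(decisions):
        raise ValueError("annotations and decisions must have equal length")
    if not annotations:
        raise ValueError("at least one segment is required")
    matches = sum(1 for a, d in zip(annotations, decisions) if a == d)
    return matches / len(annotations)


# Hypothetical labels for ten segments of one test fold.
reference = ["clean"] * 7 + ["corrupted"] * 3
estimated = ["clean"] * 6 + ["corrupted"] * 4
fold_accuracy = accuracy(reference, estimated)
```

In 5-fold validation this value would be computed once per held-out fold and then averaged across the five folds.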
MNA detection. To evaluate the MNA detection performance, our collected smartphone signals are segmented and annotated by NeXus 10 mark-II signals. Four types of annotations are assigned to each segment: 1) Red (R, color intensity signal is clean but fingertip movement signal is corrupted), 2) Blue (B, fingertip movement signal is clean but color intensity signal is corrupted), 3) Green (G = R ∪ B, either of the signals is clean), and 4) Black (BL, both signals are corrupted). Here, G and BL are used for the reference while R, B, and BL are used for the decisions made by the SVM.
Specifically, it is True Positive if the reference is G (= R ∪ B) and the decision made by the SVM is R or B (= R ∪ B). On the other hand, it is True Negative if the annotation given to the time segment is BL and the output of our proposed method is found to be BL. It is False Positive if at least one of the signals was annotated as G (= R ∪ B) while the method estimates it to be BL. Finally, it is False Negative if both of the signals are annotated to be BL while the method estimates it to be either R or B. Fig 13A shows the color intensity and the fingertip movement signals acquired from a single smartphone video recording. Fig 13B shows the decision about clean/noisy parts of the signals made by the annotation procedure with the NeXus 10 mark-II signals, while Fig 13C shows the estimation results of our proposed MNA detection method. Comparing Fig 13B with Fig 13C, the proposed MNA detection method is shown to give highly accurate estimation. Table 2 shows the accuracy of our proposed method. The proposed method shows an MNA detection accuracy of 93.0% for the color intensity signal and 93.3% for the fingertip movement signal.
Diversity effect. The usable period ratio (or clean period ratio) of our proposed diversity method is compared to those of the color intensity and the fingertip movement signals in Table 3. The usable period ratio of the proposed diversity method is 85.23%, while those of the color intensity and fingertip movement signals are 80.22% and 71.30%, respectively. This result shows that our proposed diversity method increases the usable period, in relative terms, by 6.25% and 19.53% compared to the color intensity-only and the fingertip movement-only signals, respectively.
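The reported gains are relative increases over each single-signal ratio rather than differences in percentage points. A quick check under that reading, using the three ratios quoted above (the function name `relative_increase` is hypothetical):

```python
def relative_increase(new_ratio, base_ratio):
    """Relative increase of new_ratio over base_ratio, in percent."""
    return (new_ratio - base_ratio) / base_ratio * 100.0


DIVERSITY = 85.23   # usable period ratio of the diversity method (%)
COLOR = 80.22       # color intensity signal only (%)
MOVEMENT = 71.30    # fingertip movement signal only (%)

gain_vs_color = relative_increase(DIVERSITY, COLOR)
gain_vs_movement = relative_increase(DIVERSITY, MOVEMENT)
```

Both values agree with the reported 6.25% and 19.53% to within rounding of the published ratios. In absolute terms the gains are about 5.0 and 13.9 percentage points.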

Discussion
In this paper, we propose a diversity method for two heart rhythm signals, the color intensity and the fingertip movement signals obtained from a single smartphone camera recording, to reliably and continuously get heart rhythm information in the presence of MNAs. To achieve this, our proposed diversity method 1) acquires two different types of smartphone signals, 2) quantifies the respective amount of MNAs in the two heterogeneous signals based on the proposed SQIs' values on a segment basis, and finally 3) exploits diversity from the MNA detection results of the two signals on a segment basis.
One of the advantages of the proposed method is in the signal acquisition step. That is, the acquisition effort is not increased in getting two heterogeneous signals since both are obtained from a single smartphone recording. Hence, the method does not require an additional signal acquisition procedure. The other advantage of the proposed method comes from the diversity gain in the usable period ratio (or clean period ratio) compared to the conventional method, i.e. using the color intensity signal or the fingertip movement signal alone.
We have evaluated our proposed method by applying it to both MNA-free and MNA-corrupted smartphone signals acquired from 15 healthy subjects. The experimental results have shown that the proposed SQIs' values are significantly different between MNA-free and MNA-corrupted signals. Specifically, a paired t-test was performed to determine whether there is a significant difference (p < 0.05, i.e., at the 95% confidence level) between the SQIs' values obtained from MNA-free signals and MNA-corrupted signals. In particular, we adopted an SVM to set the boundary classifying MNA-clean and MNA-corrupted segments. As input of the SVM, three SQIs are considered: 1) standard deviation of instantaneous heart rate (STD-HR), 2) root mean square of the successive differences of peak-to-peak time intervals (RMSSD-T), and 3) standard deviation of peak values (STD-PV). We compared the MNA detection performance of our proposed method to other MNA detection techniques [48,49] which used the RMSSD-T parameter only or the STD-PV parameter only to detect MNA. Table 4 shows the MNA detection accuracies of the proposed method, the RMSSD-T only method [48], and the STD-PV only method [48,49]. As shown in Table 4, the accuracy of the proposed method is around 93% for both the color intensity and the fingertip movement signals, while the RMSSD-T only method in [48] gives 78% and 70% accuracies for the color intensity and the fingertip movement signals, respectively. The accuracy of the STD-PV only method [48,49] is 81.9% and 71.6% for the color intensity and the fingertip movement signals, respectively. As a result, our method performs better MNA detection than the RMSSD-T only or STD-PV only methods in [48,49]. The experimental results also have shown that our proposed diversity method with these MNA detection results provides 6.25% and 19.53% higher usable clean periods compared to the conventional color intensity-only and fingertip movement-only signals, respectively.
The proposed method in this paper is expected to be useful for getting continuous physiological information using different types of or multiple signals from a smartphone, including heart rhythm information, in the presence of motions or noise artifacts.
In training the SVM classifier, the ratio between the clean and corrupted segments was 70% to 30% for the fingertip movement signal and 79% to 21% for the color intensity signal. To study the effect of imbalanced data, we adopted the synthetic minority over-sampling technique (SMOTE) [50] to increase the number of corrupted samples (minority class) and make it the same as the number of clean segments. The SMOTE technique creates synthetic data by using the n nearest neighbors of each minority sample in the feature space. This technique maps sample data with the dimension of (S,f) to new data with the dimension of (S',f), where S is the original sample size, S' is the size of the oversampled data and f is the size of the feature vectors. First, one of the n nearest neighbors of a minority sample is selected randomly. The difference between the feature vector of the sample and that of the selected neighbor is derived. This difference is multiplied by a random value between 0 and 1 and added to the feature vector of the sample to create a synthetic sample [50]. The accuracy of the MNA detection method after adopting the SMOTE approach is 90.0% for color intensity and 92.5% for fingertip movement. These values are 93.0% for color intensity and 93.3% for fingertip movement without applying the SMOTE. The difference in accuracy with and without the SMOTE approach is therefore at most 3.0 percentage points. Fig 14 shows the SVM decision boundary for both fingertip movement and color intensity signals before and after adopting the SMOTE technique. Fig 14A is a two-dimensional representation of the SVM boundary and support vectors when the SMOTE technique is applied. On the other hand, a two-dimensional representation of the SVM boundary without the SMOTE is shown in Fig 14B. The SVM decision boundary for the color intensity signal after and before applying the SMOTE technique is shown in Fig 14C and 14D, respectively.
As shown in Fig 14A and 14C, by adopting the SMOTE technique the number of corrupted segments (green star) is increased compared to those in Fig 14B and 14D while the number of clean segments remains the same. However, after adopting the SVM, there is not too much difference between the selected samples as support vectors in both cases as shown in Fig 14. Moreover, the main boundary line positions have not changed after applying the SMOTE. The results indicate that SVM is robust to the imbalanced data in this example.
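For readers unfamiliar with SMOTE, the interpolation step can be sketched in plain Python. This is a simplified illustration of the technique in [50], not the implementation used in the study: the function name `smote`, the parameter names, and the SQI feature vectors are all hypothetical, and Euclidean distance is assumed for the neighbor search.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from a list of minority feature vectors.

    Each synthetic sample lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbors.
    """
    if len(minority) < 2:
        raise ValueError("at least two minority samples are required")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [p for p in minority if p is not x]
        # Sort the remaining samples by squared Euclidean distance to x.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # random value in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbor)))
    return synthetic


# Hypothetical (STD-HR, RMSSD-T, STD-PV) vectors of corrupted segments.
corrupted = [(1.0, 2.0, 0.5), (1.5, 2.5, 0.7), (2.0, 1.5, 0.9), (1.2, 2.2, 0.6)]
new_samples = smote(corrupted, n_new=3, k=2)
```

Because every synthetic point is an interpolation between two existing minority samples, oversampling fills in the region the corrupted class already occupies rather than extending it, which is consistent with the boundary positions staying largely unchanged.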
Moreover, we performed the diversity method on the results of the MNA detection method with the SMOTE approach. For the color intensity signal, the ratio of usable segments is 72.71% with the SMOTE technique and 80.22% without it. For the fingertip movement signal, the ratio of usable segments is 67.96% with the SMOTE technique and 71.30% without it. Therefore, the ratio of usable segments is higher for each type of signal without the SMOTE technique. With the diversity approach, the enhanced decision results enable us to select between the color intensity and the fingertip movement signals based on the quality of the signal (clean or corrupted). As a result, the ratio of usable clean segments is 81.89% after adopting the SMOTE technique while it is 85.23% without the SMOTE. After adopting the SMOTE, the diversity method provides 12.6% and 20.5% increases in the ratio of usable clean periods compared to the conventional color intensity-only and fingertip movement-only signals, respectively. Although these increments are higher than those obtained when the SMOTE is not applied (6.25% for the color intensity signal and 19.53% for the fingertip movement signal), the ratio of usable clean segments is lower after applying the SMOTE for each of the color intensity signal, the fingertip movement signal, and the diversity method.