Fig 1.
Calibration curve for the hot wire anemometer.
A fourth-order least-squares fit of the experimental data (shown as a maroon dotted line) serves as the calibration curve for the hot wire anemometer in use. The polynomial equation of the fourth-order fit is shown inside the plot.
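As an illustration of how such a calibration curve can be obtained, the following is a minimal sketch using NumPy. The data are synthetic placeholders, not the measurements plotted in Fig 1. The choice of voltage as the independent variable and velocity as the dependent variable is an assumption, as are all variable names and coefficient values.

```python
import numpy as np

# Synthetic placeholder data (NOT the measurements from Fig 1).
# Assumption: the anemometer output voltage (V) is mapped to velocity (m/s).
voltage = np.linspace(1.0, 2.0, 25)
true_coeffs = [2.0, -1.0, 0.5, 3.0, -4.0]  # hypothetical quartic, highest power first
velocity = np.polyval(true_coeffs, voltage)

# Fourth-order least-squares fit; coefficients are returned highest power first.
coeffs = np.polyfit(voltage, velocity, deg=4)

# Wrap the coefficients in a callable polynomial: the calibration curve.
calibrate = np.poly1d(coeffs)

# Convert a new voltage reading into a velocity estimate.
estimated_velocity = calibrate(1.5)
```

With real calibration data the fit does not reproduce the points exactly, so the residuals should be inspected before the curve is used.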
Fig 2.
Experimental setup and recorded time series.
(A) Depiction of the experimental setup for data collection. It consists of a disposable mouth-piece, a mouth-piece mount housing a hot wire anemometer, and a data acquisition system. (B) A typical human exhalation velocity signal measured using a standard hot wire anemometer. The time signals were sampled at 10 kHz for 1.5 seconds.
Fig 3.
Multifractal spectra for different segments of a time signal.
The multifractal spectra corresponding to the entire time signal (maroon) and to the time segments X, Y and Z (black, bounded by the gray band) marked in (A) are shown in (B). It is evident that few segments exhibit an inverted parabola shape and that spectrum B has a distortion.
Fig 4.
Plot of the spectrum of singularities f(α) against the singularity strength α, computed for an exhalation time series segment. The parameters β, ω and ϵ are the features that characterize a multifractal spectrum.
Fig 5.
Flow chart showing the algorithm pipeline, including time series normalization, filtering, feature extraction, feature reduction, and data splitting into training and testing sets. The time signal shown here is one of the segments of the original time series. Note that the blue bar for the training dataset and the green bar for the testing dataset are used consistently in the rest of this manuscript. The training data of all users were used to build nC2 binary classifier models, a process known as enrollment.
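The enrollment step, in which one binary classifier is built per unordered pair of users, can be sketched as follows. This is an illustrative outline only. The function `enroll`, the data layout, and the `train_binary_classifier` callable are hypothetical names introduced here and are not taken from the manuscript.

```python
from itertools import combinations
from math import comb


def enroll(training_data, train_binary_classifier):
    """Build one binary model per unordered user pair (nC2 models in total).

    training_data: dict mapping a user id to that user's training feature vectors.
    train_binary_classifier: hypothetical callable taking the feature vectors
        of two users and returning a fitted binary model.
    """
    library = {}
    for user_a, user_b in combinations(sorted(training_data), 2):
        library[(user_a, user_b)] = train_binary_classifier(
            training_data[user_a], training_data[user_b]
        )
    return library


# Usage with a stand-in trainer that only records the two sample counts.
toy_data = {"u1": [[0.1, 0.2]], "u2": [[0.3, 0.4]], "u3": [[0.5, 0.6]], "u4": [[0.7, 0.8]]}
model_library = enroll(toy_data, lambda a, b: (len(a), len(b)))
n_models = len(model_library)  # equals comb(4, 2)
```

Because the number of models grows as n(n-1)/2, adding a user to an existing library of n users requires n new pairwise models.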
Fig 6.
User confirmation algorithm based on hypothesis testing.
A flow chart of the user confirmation algorithm based on hypothesis testing. The user confirmation block is used in the user identification algorithm later in this manuscript. An example of the hypothesis test against a user-pair is illustrated inside the dotted box, which is linked to the user confirmation block by the red asterisk. Given a user i, the output of the user confirmation block was recast as an answer to the question “Are you indeed User i?” based on a threshold.
Fig 7.
User confirmation algorithm based on machine learning.
A flow chart of the user confirmation algorithm based on machine learning. The user confirmation block is used in the user identification algorithm later in this manuscript. Given a user i, the output of the user confirmation block was recast as an answer to the question “Are you indeed User i?” based on a threshold.
Fig 8.
A generic user identification algorithm.
Given a test user j, the algorithm performs n confirmation trials. One confirmation trial is equivalent to running the user confirmation block (either HT from Fig 6 or ML from Fig 7) for a trial user i. The identified user corresponds to the maximum prediction among the n confirmation tests. Note that when more than one confirmation trial yields the maximum prediction value, the algorithm does not identify a user.
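The decision rule described in this caption (take the maximum over the n confirmation trials, and identify no user on a tie) can be sketched as below. The function name `identify_user` and the dictionary of per-user confirmation scores are hypothetical. How each score is produced by the HT or ML confirmation block is not modeled here.

```python
def identify_user(confirmation_scores):
    """Return the user with the unique maximum confirmation score, else None.

    confirmation_scores: dict mapping each trial user i to the output of one
        confirmation trial (hypothetical layout, one entry per trial).
    """
    if not confirmation_scores:
        return None
    best = max(confirmation_scores.values())
    winners = [user for user, score in confirmation_scores.items() if score == best]
    # A tie for the maximum means no user is identified.
    return winners[0] if len(winners) == 1 else None


unique_case = identify_user({"u1": 0.40, "u2": 0.90, "u3": 0.70})  # "u2"
tied_case = identify_user({"u1": 0.90, "u2": 0.90, "u3": 0.70})    # None
```

The tie check uses exact equality, which suits scores drawn from a small discrete set; for continuous scores a tolerance would be a separate design choice.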
Fig 9.
Comparison of the confidence of confirmation ηi.
Histograms of the confidence of confirmation ηi compared between (A) a machine learning based approach (random forest classifiers) and (B) a hypothesis testing based classification approach, for one trial of n confirmation tests. In the example shown here, the predictions from the ML classifiers give ηi values distributed between ≈38% and 100%, whereas the predictions from the HT based classifiers produce ηi values only close to 0% and 100%.
Fig 10.
Comparison of the decision boundaries in (β, ω) plane.
Decision boundaries captured by (A) a random forest classifier and (B) a hypothesis testing based classifier for a randomly chosen user-pair. The scattered points are the training data points, colored red or blue according to their true class. The line separating the two contour regions is the decision boundary. The accuracy of each model on the test data is displayed at the top right corner of its plot. The RF classifier captures a more complex decision boundary than the HT based classifier.
Fig 11.
Dependence of user identification time on the size of model library.
Plot showing the linear relation between user identification time and the size of the model library. This applies to the ML based algorithms, which involve building binary classifier models (also known as enrollment in the context of biometrics). The error bars show the 95% confidence interval at every data point.