Abstract
During the real-time recognition of porcine abnormal sounds, the accuracy and stability of the recognition method are crucial to guarantee good performance. For this purpose, an improved Multiple-Support Vector Data Description (Multi-SVDD) is proposed in this paper. Firstly, an improved spectral subtraction combining Improved Minima Controlled Recursive Averaging (IMCRA) and Spectral Subtraction (SS) is applied to remove the noise from the collected sounds. Then, the Mel-Frequency Cepstral Coefficients (MFCC) and the first-order differential MFCC (ΔMFCC) are extracted as feature parameters. Finally, the Multi-SVDD is used to detect and recognize the porcine abnormal sounds. In order to improve the accuracy of Multi-SVDD and its tolerance to human errors in tagging data, the space density information of the training data is calculated as confidence weights to reduce the interference of outliers during Multi-SVDD training. The experimental results show that the accuracy, precision and recall of the proposed method reach 95.0%, 95.4% and 95.0% respectively, indicating a higher error-tolerance capability than classical SVDD.
Citation: Zhang S, Jia B, Gao Y (2025) A novel approach to porcine abnormal sounds recognition based on improved Multi-SVDD. PLoS One 20(9): e0332996. https://doi.org/10.1371/journal.pone.0332996
Editor: Dandan Peng, The Hong Kong Polytechnic University, CHINA
Received: July 1, 2025; Accepted: September 8, 2025; Published: September 29, 2025
Copyright: © 2025 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data used in this paper are all from public datasets: https://www.kaggle.com/datasets/titpigrecognition/porcine-scream-sounds-and-cough-sounds/data.
Funding: This research work was funded by the Fundamental Research Program of Shanxi Province (Grant No. 202303021222303), Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (Grant No. 2023L362) and Taiyuan Institute of Technology Science Research Initial Funding (Grant No. 2022LJ021).
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
The sound of animals is the most common bio-signal that can be collected easily from a distance, without causing any additional stress to the animals [1]. Therefore, sound analysis has huge potential in interpreting the behavior, health condition, and well-being of animals [2]. Recently, it has been shown that sound recognition plays an important role in speech recognition [3], emotion recognition [4] and bio-acoustical techniques [5], among others.
With the increasing use of wireless sensor network technology [6], sound analysis has been widely studied in both wild animals and farm animals. Sound analysis technology has shown good performance in recognizing the sounds of wild animals, such as birds [7], frogs [8] and anurans [9]. In order to meet the increasing global demand for livestock products, livestock management practices have shifted towards intensive breeding [10]. Precision livestock farming (PLF) is a trending area in livestock management, and sound analysis has huge potential for PLF in monitoring the health status of animals [11]. In recent years, sound analysis technology has been used extensively to monitor various kinds of farm animals, such as chickens [12], cattle [13], sheep [14] and pigs [15] in particular. Many of these research projects have focused on sound analysis of pigs. Chung [2] proposed a pig wasting disease detection and recognition system to detect and classify different kinds of cough sounds due to three types of pig wasting diseases. Cordeiro et al. [16] estimated the level of pain in piglets by using a Decision Tree (DT). Wang et al. [17] proposed a continuous cough automatic detection method to detect single coughs and continuous coughs in a complex piggery environment. A voice activity detection (VAD) method was proposed to automatically segment continuous sound, and a multi-classifier fusion strategy was investigated to promote recognition accuracy. Pan et al. [18] used deep neural network (DNN) and Hidden Markov Model (HMM) theory to recognize pig sound signals. The collected sounds were preprocessed by Kalman filtering and an improved endpoint detection algorithm based on empirical mode decomposition-Teager energy operator (EMD-TEO) cepstral distance, and 39-dimensional mel-frequency cepstral coefficients (MFCCs) were extracted as characteristic parameters.
Although previous papers have achieved high recognition accuracy, there are still some shortcomings. The acoustic environment in a real pigpen is significantly more complex. In addition to porcine abnormal sounds, there are many other kinds of sounds in the pen, with a serious imbalance in their numbers. Consequently, annotating these additional sound types is both time-consuming and challenging, making the accurate detection of porcine abnormal sounds from the entire set of collected sounds particularly difficult. Moreover, recognizing the collected sounds with a classification algorithm may classify the other kinds of sounds as porcine abnormal sounds. When training data are incorrectly tagged, the recognition accuracy is adversely influenced. The recognition method therefore requires high accuracy and stability. SVDD is a widely utilized One-Class Classification (OCC) method designed to classify positive cases without well-defined negative cases, and is widely applied in fields such as fault detection [19] and anomaly identification [20]. In this paper, a method based on an improved Multi-SVDD is proposed to improve the accuracy and stability of the real-time recognition of porcine abnormal sounds. The noise estimate extracted by traditional spectral subtraction may be deficient when denoising porcine sounds, so an improved spectral subtraction combining IMCRA and SS is presented to improve the denoising performance in preprocessing. After extracting the MFCC and ΔMFCC as feature parameters, the Multi-SVDD is used to recognize sounds such as porcine coughs and screams. In order to improve the tolerance of Multi-SVDD to human errors in tagging training data, the space density information of the training data is calculated as confidence weights to reduce the interference of outliers during Multi-SVDD training.
The experimental results show that the accuracy, precision and recall of the proposed method reach 95.0%, 95.4% and 95.0% respectively. When the training data has tag errors, the proposed method shows higher error-tolerance capability compared to classical SVDD. Meanwhile, the method can be extended to sound recognition of other animal species as well as anomaly detection across various domains, thereby demonstrating broad application potential.
The paper is organized as follows: Section 2 describes the experimental setup and the automatic recognition method based on the improved Multi-SVDD. Section 3 presents the experimental results and comparison between the traditional method and improved method. Section 4 concludes this paper.
2 Materials and methods
2.1 Materials
In this paper, the experimental data were collected from a large-scale pig farm located in Shanxi Province, China. The sounds were collected through an acoustic pickup device (ELITE model OS-100, made in China) at a sampling frequency of 8 kHz. For recognizing suspected abnormal pigs, coughs and screams were selected as abnormal sounds. A cough [21] is an early symptom of respiratory diseases, such as asthma and bronchitis. Screams [22] are a stress reaction of pigs when they are suddenly hurt. The waveforms and spectrograms of porcine coughs and screams are shown in Fig 1 and Fig 2.
In the spectrograms, color represents amplitude: brighter colors indicate higher amplitudes, while darker colors correspond to lower amplitudes. There is a visible difference between the waveforms and spectrograms of coughs and screams. Therefore, the coughs and screams of pigs can be recognized by a sound recognition method.
2.2 Methods
The proposed method consists of three stages: preprocessing of porcine sounds, feature extraction and abnormal sound recognition. (1) Sound preprocessing is composed of activity detection, noise removal, endpoint detection and windowing. (2) The MFCC and ΔMFCC are used for feature extraction. (3) An improved Multi-SVDD is proposed to recognize the porcine abnormal sounds in the third stage.
2.2.1 Sound preprocessing.
- (1). Activity Detection
The sounds collected in real time also include background noise with a low activity level, which affects the accuracy of the sound analysis. In order to exclude this irrelevant background noise from the collected sounds, the sound energy is calculated, which is defined as [23]:

E = \sum_{n=1}^{Len} x^{2}(n)

where E is the sound energy; x(n) is the collected sound and Len is the length of the collected sound.
By setting a threshold value, background sounds whose energy is lower than the threshold can be excluded from the detection and recognition process. To find an optimum threshold value, the average energies of the two kinds of abnormal sounds and the average energy of ambient noise without porcine sounds are calculated and shown in Fig 3.
It can be seen that the energies of porcine coughs and screams with a length of 1 s are above 200, while the energy of sound collected in a quiet environment is less than 40. Therefore, the energy threshold is set to 40 in this paper.
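As a sketch of this activity-detection step, the energy computation and thresholding above can be written as follows. Python is used here for illustration (the paper's experiments are implemented in Matlab); the threshold of 40 follows the text, while the function names are our own:

```python
import numpy as np

def sound_energy(x):
    """Energy of a collected sound x(n): E = sum of squared samples."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2))

def is_active(x, threshold=40.0):
    """Keep a 1 s clip only if its energy exceeds the threshold (40 in the paper)."""
    return sound_energy(x) > threshold
```

A quiet clip (near-zero samples) is rejected, while any clip with appreciable amplitude over its 8000 samples easily exceeds the threshold.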
- (2). Noise Removal
During the process of sound collection and transmission, porcine sounds may be contaminated with noise, which can influence the recognition result. The background noise in the pigpen is mainly the sound of air blowers, and interference noise is introduced during sound transmission from the pickup to the industrial computer. These noises are all additive or stationary, and spectral subtraction is applicable for eliminating stationary, additive noise [24]. The porcine sound with noise y(n) can be written as:

y(n) = x(n) + d(n), \quad 0 \le n \le N-1

where x(n) is the porcine sound without noise; d(n) is the noise and N is the length of the sound.
A Fast Fourier Transform (FFT) is carried out on the sound after framing, which segments the sound into frames, and the following equation is obtained:

Y(l,k) = X(l,k) + D(l,k)

where l is the index of the frame; k is the index of the frequency bin.
The power spectrum of both sides of the equation can be obtained as follows:

|Y(l,k)|^{2} = |X(l,k)|^{2} + |D(l,k)|^{2} + X(l,k)D^{*}(l,k) + X^{*}(l,k)D(l,k)

where |Y(l,k)|^{2} is the power spectrum of y(n); |X(l,k)|^{2} is the power spectrum of x(n); |D(l,k)|^{2} is the power spectrum of d(n); * denotes the complex conjugate.

Since x(n) and d(n) are independent of each other, the cross terms vanish in expectation, and the power spectrum can be expressed as follows:

|Y(l,k)|^{2} = |X(l,k)|^{2} + |D(l,k)|^{2}

Therefore, the denoised sound can be calculated as:

|\hat{X}(l,k)|^{2} = |Y(l,k)|^{2} - |\hat{D}(l,k)|^{2}

where |\hat{D}(l,k)|^{2} is the estimated power spectrum of the noise.
During denoising with traditional SS, if the estimated power spectrum of the noise differs from the actual noise, 'music noise' may be generated. In order to suppress the 'music noise', the denoised sound can be calculated as follows [25]:

|\hat{X}(l,k)|^{2} =
\begin{cases}
|Y(l,k)|^{2} - a\,|\hat{D}(l,k)|^{2}, & |Y(l,k)|^{2} - a\,|\hat{D}(l,k)|^{2} > b\,|\hat{D}(l,k)|^{2} \\
b\,|\hat{D}(l,k)|^{2}, & \text{otherwise}
\end{cases}

where a is the reduction (over-subtraction) factor and b is the gain compensation (spectral floor) factor.
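The over-subtraction rule above can be sketched per frame as follows. The reduction factor `a` and gain compensation factor `b` used here are illustrative values, since the paper does not report its settings:

```python
import numpy as np

def spectral_subtract(power_y, power_noise, a=4.0, b=0.01):
    """Over-subtraction spectral subtraction on one frame's power spectrum.

    power_y: |Y(l,k)|^2 of the noisy frame; power_noise: estimated |D(l,k)|^2.
    a is the reduction (over-subtraction) factor and b the gain compensation
    (spectral floor) factor; both values are illustrative, not the paper's.
    """
    power_y = np.asarray(power_y, dtype=float)
    power_noise = np.asarray(power_noise, dtype=float)
    cleaned = power_y - a * power_noise
    floor = b * power_noise
    # Flooring the residual suppresses the isolated peaks heard as 'music noise'.
    return np.where(cleaned > floor, cleaned, floor)
```

Bins where the subtraction would go negative are clamped to a small noise floor instead of zero, which is what suppresses the musical artifacts.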
The estimated noise is usually acquired by extracting 'non-sound' frames. Since the duration of each porcine sound collected in real time is only 1 s, it is hard to estimate the noise accurately this way. In this study, Improved Minima Controlled Recursive Averaging (IMCRA) is applied to estimate the noise. The denoising process is shown in Fig 4.
IMCRA is a noise estimation method which tracks the noise region by the estimated sound presence probability. The noise is estimated by recursively averaging past spectral power values of the noisy measurement [26]. Under porcine sound presence uncertainty, the recursive averaging is calculated with the conditional sound presence probability, which is shown to be [27]:

\hat{\lambda}(l+1,k) = \tilde{\alpha}_{d}(l,k)\,\hat{\lambda}(l,k) + \left[1 - \tilde{\alpha}_{d}(l,k)\right]|Y(l,k)|^{2}

where \tilde{\alpha}_{d}(l,k) is a time-varying, frequency-dependent smoothing parameter. It can be obtained by

\tilde{\alpha}_{d}(l,k) = \alpha_{d} + (1 - \alpha_{d})\,p(l,k)

where \alpha_{d} is a smoothing parameter; p(l,k) is the conditional sound presence probability. The noise estimation |\hat{D}(l,k)|^{2} is given by

|\hat{D}(l,k)|^{2} = \beta\,\hat{\lambda}(l,k)

where \beta is a bias compensation factor; in this paper, \beta = 1.47, as recommended for IMCRA [27].
The sound presence probability p(l,k) is estimated by two iterations of smoothing and minimum tracking. The IMCRA shows great performance in estimating noise [28]. Therefore, the power spectrum of noise is estimated by IMCRA to improve the de-noising performance of SS.
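A minimal sketch of the IMCRA-style recursive update is given below, assuming the smoothing and bias-compensation values shown (illustrative settings, not necessarily the paper's). `p` is the per-bin conditional sound presence probability; its estimation by two iterations of smoothing and minimum tracking is omitted here:

```python
import numpy as np

def update_noise_psd(noise_psd, power_y, p, alpha_d=0.85, beta=1.47):
    """One IMCRA-style recursive update of the noise power spectrum.

    noise_psd: previous estimate; power_y: |Y(l,k)|^2 of the current frame;
    p: conditional sound-presence probability per bin. alpha_d and beta are
    illustrative values for the smoothing and bias-compensation factors.
    """
    noise_psd = np.asarray(noise_psd, dtype=float)
    power_y = np.asarray(power_y, dtype=float)
    p = np.asarray(p, dtype=float)
    # Time-varying smoothing: where sound is likely present (p -> 1), the
    # noise estimate is barely updated; where absent (p -> 0), it tracks |Y|^2.
    alpha_tilde = alpha_d + (1.0 - alpha_d) * p
    smoothed = alpha_tilde * noise_psd + (1.0 - alpha_tilde) * power_y
    return beta * smoothed
```

With `p = 1` the noise estimate is frozen (only bias-compensated), while with `p = 0` it moves towards the observed frame power, which is the desired behavior for short 1 s clips where dedicated non-sound frames are scarce.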
- (3). Endpoint Detection
In order to find the valid part of the collected sound and reduce the interference of invalid parts, double-threshold endpoint detection is used to detect the start point and the end point. The double-threshold method determines the start and end points of the collected sounds through the short-time average energy and the short-time average zero-crossing rate. The short-time average energy is defined as:

E_{i} = \sum_{n=0}^{L-1} y_{i}^{2}(n), \quad 1 \le i \le fn

where L is the frame length; fn is the number of frames; y_{i}(n) is the ith frame of the collected sound, which is expressed as:

y_{i}(n) = w(n)\,y\big((i-1)\cdot inc + n\big), \quad 0 \le n \le L-1

where inc is the length of the frame shift; w(n) is the window function, for which a Hamming window is selected.

The short-time average zero-crossing rate is defined as:

Z_{i} = \frac{1}{2}\sum_{n=1}^{L-1} \big|\,\mathrm{sgn}[y_{i}(n)] - \mathrm{sgn}[y_{i}(n-1)]\,\big|
Taking a porcine cough as an example, the starting and ending points are detected by the double-threshold endpoint detection method. The detection result is shown in Fig 5.
In Fig 5, the solid line on the left marks the starting point and the dotted line on the right marks the ending point of the porcine cough. It can be seen that the starting and ending points of the porcine cough are detected accurately by the double-threshold endpoint detection method.
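The double-threshold idea can be sketched as follows. This simplified version thresholds frame energies only (the zero-crossing-rate check is omitted for brevity), and the relative thresholds `high` and `low` are illustrative; the frame length and shift are also assumptions:

```python
import numpy as np

def frame_signal(x, L=256, inc=128):
    """Split x (len(x) >= L) into Hamming-windowed frames of length L, shift inc."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + max(0, (len(x) - L) // inc)
    win = np.hamming(L)
    return np.stack([x[i * inc:i * inc + L] * win for i in range(n_frames)])

def endpoints(x, high=0.5, low=0.1, L=256, inc=128):
    """Double-threshold endpoint detection on frame energies.

    Frames above the high threshold form the sound core; the boundaries are
    then extended outward while the energy stays above the low threshold.
    Returns (start_frame, end_frame), or None if no valid part is found.
    """
    frames = frame_signal(x, L, inc)
    energy = np.sum(frames ** 2, axis=1)
    e_max = energy.max()
    if e_max == 0:
        return None
    above = np.where(energy > high * e_max)[0]
    if len(above) == 0:
        return None
    start, end = above[0], above[-1]
    while start > 0 and energy[start - 1] > low * e_max:
        start -= 1
    while end < len(energy) - 1 and energy[end + 1] > low * e_max:
        end += 1
    return start, end
```

On a clip that is silent except for a burst in the middle, the returned frame indices bracket the burst, including the partially covered edge frames captured by the lower threshold.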
- (4). Windowing
After endpoint detection, a Hamming window is applied to segment the sound into overlapping frames of fixed length [8]. Because the Hamming window has a narrow main lobe and suppresses the influence of the side lobes, the continuity of the porcine sound can be maintained between frames. The Hamming window is defined as:

w(n) = 0.54 - 0.46\cos\left(\frac{2\pi n}{L-1}\right), \quad 0 \le n \le L-1

where L is the length of the frame.
2.2.2 Feature extraction.
Mel-Frequency Cepstral Coefficients (MFCC), proposed by Davis and Mermelstein, are designed to mimic the human auditory response based on the relationship between actual and perceived frequencies [10]. Therefore, they can be a better representation of sound [29]. The relationship between the Mel-frequency and frequency is given as:

f_{mel} = 2595\log_{10}\left(1 + \frac{f}{700}\right)

where f_{mel} is the Mel-frequency of porcine sounds; f is the frequency of porcine sounds.
The extraction steps of MFCC of porcine sound are as follows:
Step 1: Preprocess the collected sound, including activity detection, noise removal, endpoint detection and windowing.
Step 2: An FFT is performed on the preprocessed porcine sound, which is expressed as:

X(i,k) = \sum_{m=0}^{N-1} x_{i}(m)\,e^{-j2\pi km/N}, \quad 0 \le k \le N-1

where x_{i}(m) is the ith frame of the collected sound; X(i,k) is the spectrum of the ith frame; k is the spectral line number; N is the FFT length.

The power spectrum E(i,k) is calculated from the spectrum of the collected sound, which is defined as:

E(i,k) = |X(i,k)|^{2}
Step 3: The power spectrum E(i,k) of the collected sound is filtered through a set of Mel filters, and the energy of the power spectrum of the porcine sound signal in the Mel filter bank is obtained as:

S(i,m) = \sum_{k=0}^{N-1} E(i,k)\,H_{m}(k)

where S(i,m) is the energy of the Mel filters; m is the serial number of the Mel filter, m = 0,1,…,M−1; M is the number of Mel filters; H_{m}(k) is the transfer function of the mth Mel filter, which is given by:

H_{m}(k) =
\begin{cases}
0, & k < f(m-1) \\
\dfrac{k - f(m-1)}{f(m) - f(m-1)}, & f(m-1) \le k \le f(m) \\
\dfrac{f(m+1) - k}{f(m+1) - f(m)}, & f(m) < k \le f(m+1) \\
0, & k > f(m+1)
\end{cases}

where f(m) is the center frequency (in frequency bins) of the mth Mel filter.
Step 4: A logarithmic operation is performed on the energy of the Mel filters S(i,m):

S'(i,m) = \ln S(i,m)

Step 5: The discrete cosine transform is performed on the logarithmic energy of the collected sound, which can be described as:

mfcc(i,n) = \sum_{m=0}^{M-1} S'(i,m)\cos\left(\frac{\pi n(m + 0.5)}{M}\right)
where mfcc(i, n) is the MFCC of collected sound; i is the serial number of frames; n is the serial number of MFCC.
MFCC only reflect the static characteristics of porcine sound, whereas the first-order difference of MFCC (ΔMFCC) reflects the dynamic characteristics. In this paper, 12-dimensional MFCC and 12-dimensional ΔMFCC are extracted as porcine sound feature parameters.
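The five extraction steps can be sketched as below. The filter count, FFT length and numerical details are illustrative rather than the paper's exact settings, and `scipy` supplies the DCT:

```python
import numpy as np
from scipy.fftpack import dct

def hz_to_mel(f):
    """f_mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular Mel filters H_m(k) on the first n_fft//2 + 1 frequency bins."""
    mel_points = np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / fs).astype(int)
    H = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        lo, c, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, c):
            H[m - 1, k] = (k - lo) / max(c - lo, 1)
        for k in range(c, hi):
            H[m - 1, k] = (hi - k) / max(hi - c, 1)
    return H

def mfcc_frame(frame, fs=8000, n_filters=26, n_coeffs=12):
    """MFCC of one windowed frame: FFT -> power -> Mel filtering -> log -> DCT."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame)) ** 2
    S = mel_filterbank(n_filters, n_fft, fs) @ power
    log_S = np.log(S + 1e-12)  # small offset avoids log(0) for empty filters
    return dct(log_S, type=2, norm='ortho')[1:n_coeffs + 1]

def delta(coeffs):
    """First-order difference of MFCC across frames (ΔMFCC)."""
    return np.diff(coeffs, axis=0)
```

Applied to each windowed frame of a 1 s clip sampled at 8 kHz, this yields the 12 MFCC per frame; stacking the frames and differencing gives the 12 ΔMFCC, for the 24-dimensional feature vector used later.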
2.2.3 The Recognition of Porcine Abnormal Sounds.
After extracting the feature parameters based on MFCC, the collected sounds are evaluated to see whether they belong to the two kinds of abnormal sounds. The collected sounds are then classified into two types, cough and scream, using Support Vector Data Description (SVDD). SVDD is a one-class classifier; its basic idea is to construct the smallest sphere which contains all possible training data [30].
In order to detect and classify porcine cough and scream, Multiple-Support Vector Data Description (Multi-SVDD) is proposed in this paper, which is constructed based on two SVDDs. The structure of Multi-SVDD is shown in Fig 6.
In Fig 6, SVDD1 is the recognition model of porcine cough and SVDD2 is the recognition model of porcine scream; r1 and o1 are the radius and center of the first hypersphere, and r2 and o2 are the radius and center of the second hypersphere. Since each hypersphere is constructed from the training data, it is easily influenced by tag errors in the training data. In order to improve the tolerance of Multi-SVDD to human errors in tagging training data, the space density information of the training data is calculated as confidence weights to reduce the interference of outliers during Multi-SVDD training. The main steps of abnormal sound recognition by the improved Multi-SVDD are as follows:
Step 1: The feature parameters of the porcine abnormal sounds are defined as X = \{x_{1}, x_{2}, \cdots, x_{q}\}, where x_{i} \in \mathbb{R}^{d} (i = 1,2,\cdots,q); q is the number of training data; d is the dimension of the feature parameters. In this paper, d = 24.
Step 2: Using subtractive clustering [31], the densities of a group of training data can be defined as:

P_{i} = \sum_{j=1}^{q} \exp\left(-\frac{\|x_{i} - x_{j}\|^{2}}{(r_{a}/2)^{2}}\right)

where P_{i} is the density of x_{i}; r_{a} is the neighboring radius; q is the number of training data.

The density center P_{max} is given by the following:

P_{max} = \max_{1 \le i \le q} P_{i}

Therefore, the confidences of the training data are defined as:

s_{i} = \frac{P_{i}}{P_{max}}
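The density-based confidence computation of Step 2 can be sketched as follows; the neighboring radius `ra` is an illustrative value, and normalizing each density by the maximum density is one plausible reading of the confidence definition:

```python
import numpy as np

def confidences(X, ra=1.0):
    """Density-based confidence weights for training data (subtractive clustering).

    P_i = sum_j exp(-||x_i - x_j||^2 / (ra/2)^2); the confidence of x_i is
    taken here as P_i / max_j P_j, so isolated outliers get small weights.
    ra is the neighboring radius (illustrative value, not the paper's).
    """
    X = np.asarray(X, dtype=float)
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    P = np.sum(np.exp(-d2 / (ra / 2.0) ** 2), axis=1)
    return P / P.max()
```

A mistagged sample far from the bulk of its class receives a small confidence, so the penalty it contributes to the SVDD objective (and hence its pull on the hypersphere) is reduced.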
Step 3: Let \varphi(\cdot) denote the nonlinear mapping function. The process of nonlinear mapping can be expressed as \varphi: \mathbb{R}^{d} \rightarrow F_{m}, where F_{m} is a high-dimensional feature space. The smallest sphere containing all possible training data should be constructed in F_{m}. The confidences of the training data are introduced during the training process of SVDD. The optimization problem can be described as:

\min_{r,\,o,\,\xi}\; r^{2} + C\sum_{i=1}^{q} s_{i}\xi_{i}

\text{s.t.}\quad \|\varphi(x_{i}) - o\|^{2} \le r^{2} + \xi_{i}, \quad \xi_{i} \ge 0, \quad i = 1,2,\cdots,q

where r is the radius of the hypersphere; o is the center of the hypersphere; \xi_{i} is a slack variable; C is the penalty factor; s_{i} is the confidence of x_{i}.
Step 4: In order to solve the constrained problem above, Lagrange multipliers are introduced and the formula can be reformulated as:

L = r^{2} + C\sum_{i=1}^{q} s_{i}\xi_{i} - \sum_{i=1}^{q}\alpha_{i}\left(r^{2} + \xi_{i} - \|\varphi(x_{i}) - o\|^{2}\right) - \sum_{i=1}^{q}\beta_{i}\xi_{i}

where \alpha_{i} and \beta_{i} are the Lagrange coefficients, with \alpha_{i} \ge 0 and \beta_{i} \ge 0, i = 1,2,…,q. Setting the partial derivatives of L with respect to r, o and \xi_{i} to zero yields \sum_{i=1}^{q}\alpha_{i} = 1, o = \sum_{i=1}^{q}\alpha_{i}\varphi(x_{i}) and 0 \le \alpha_{i} \le s_{i}C.
After solving the equation, the radius can be represented as follows:

r^{2} = \|\varphi(x_{k}) - o\|^{2}

where x_{k} is a support vector.

A kernel function is introduced to replace the inner product, which is defined as K(x_{i}, x_{j}) = \varphi(x_{i})\cdot\varphi(x_{j}). The formula for the radius can then be written as:

r^{2} = K(x_{k},x_{k}) - 2\sum_{i=1}^{q}\alpha_{i}K(x_{i},x_{k}) + \sum_{i=1}^{q}\sum_{j=1}^{q}\alpha_{i}\alpha_{j}K(x_{i},x_{j})
In this paper, the radial basis function is selected as the kernel function, which is defined as:

K(x_{i},x_{j}) = \exp\left(-\frac{\|x_{i} - x_{j}\|^{2}}{\sigma^{2}}\right)

where \sigma is the kernel parameter.
Step 5: For a new test data point z, the decision function for porcine abnormal sounds can be constructed as follows:

f(z) = K(z,z) - 2\sum_{i=1}^{q}\alpha_{i}K(x_{i},z) + \sum_{i=1}^{q}\sum_{j=1}^{q}\alpha_{i}\alpha_{j}K(x_{i},x_{j}) \le r^{2}

If the inequality holds, z lies inside the hypersphere and is accepted as the corresponding kind of porcine abnormal sound.
The decision results of improved Multi-SVDD are the combination of decision functions of SVDD1 and SVDD2. The recognition strategies of improved Multi-SVDD are shown in Table 1.
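The kernel-distance test of Step 5 and the Multi-SVDD combination can be sketched as below, assuming trained support vectors, Lagrange coefficients and radii are already available. The tie-breaking rule (nearest hypersphere center when a sample falls inside both) is our assumption, since Table 1 is not reproduced here:

```python
import numpy as np

def rbf(a, b, sigma=1.0):
    """RBF kernel K(a, b) = exp(-||a - b||^2 / sigma^2)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.exp(-np.sum((a - b) ** 2) / sigma ** 2)

def svdd_distance2(z, sv, alpha, sigma=1.0):
    """Squared kernel distance ||phi(z) - o||^2 from z to the hypersphere
    center o = sum_i alpha_i phi(x_i); sv are the support vectors and alpha
    the (normalized) Lagrange coefficients. z is accepted when <= r^2."""
    k_zz = rbf(z, z, sigma)
    k_zx = sum(a * rbf(z, x, sigma) for a, x in zip(alpha, sv))
    k_xx = sum(ai * aj * rbf(xi, xj, sigma)
               for ai, xi in zip(alpha, sv) for aj, xj in zip(alpha, sv))
    return k_zz - 2.0 * k_zx + k_xx

def multi_svdd_decide(z, models):
    """Multi-SVDD: each model is (sv, alpha, r2, label). Return the label of
    the hypersphere containing z (nearest center on ties), else 'other'."""
    best = None
    for sv, alpha, r2, label in models:
        d2 = svdd_distance2(z, sv, alpha)
        if d2 <= r2 and (best is None or d2 < best[0]):
            best = (d2, label)
    return best[1] if best else 'other'
```

A sample inside neither hypersphere is rejected as 'other', which is how the one-class formulation avoids forcing every collected sound into the cough/scream classes.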
3 Results and discussion
In order to validate the effectiveness of the proposed method, sounds of 1 s duration are used in the experiments. After pre-processing, the results of feature extraction and recognition are analyzed in this section. All of the experiments are implemented in Matlab on a machine with an NVIDIA GeForce RTX 3070 GPU (8 GB), a 2.1 GHz Intel Core i7-12700F CPU and 32 GB of RAM.
3.1 Results of feature extraction
The feature parameters represent the information contained in a sample of porcine sound. Therefore, the same type of porcine sounds have the same feature parameters, while the sounds from different categories are different. After pre-processing, the MFCC and ΔMFCC extracted from porcine cough and porcine scream are shown in Fig 7 and Fig 8 respectively.
The dimensions of MFCC and ΔMFCC are both 12. As shown in Fig 7, the MFCC and ΔMFCC parameters extracted from different frames are basically the same for porcine cough. A similar result is observed for porcine scream, as illustrated in Fig 8. However, a distinct difference in MFCC and ΔMFCC between cough and scream is observed. Therefore, MFCC and ΔMFCC are effective in differentiating and representing porcine cough and scream.
The common frequency domain features include MFCC, Linear Prediction Cepstral Coefficient (LPCC), Cochlear Filter Cepstral Coefficients (CFCC) among others. In order to validate the advantages of MFCC and ΔMFCC, recognition accuracy is used to compare the results of different feature parameters. The recognition model is Multi-SVDD and the test data are 200 (100 porcine cough sounds and 100 porcine scream sounds). The recognition results are shown in Table 2.
Table 2 shows that the recognition accuracy of MFCC alone is 91.00%, while MFCC+ΔMFCC reaches 94.00%, the highest among the compared feature parameters. Therefore, MFCC+ΔMFCC surpasses the other feature parameters for the recognition of porcine abnormal sounds and is selected as the feature parameter of porcine abnormal sounds in this paper.
3.2 Recognition results of porcine abnormal sounds
In order to evaluate the recognition results quantitatively, accuracy, precision and recall [32] are used as performance measurements:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

Precision = \frac{TP}{TP + FP}

Recall = \frac{TP}{TP + FN}

where TP (true positive) is the number of target sounds that are correctly recognized; TN (true negative) is the number of other kinds of sounds that are correctly recognized; FP (false positive) is the number of other kinds of sounds recognized as target sounds; FN (false negative) is the number of target sounds recognized as other kinds of sounds.
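These three measurements can be computed directly from the confusion counts:

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, precision and recall from the confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall
```

For example, 95 of 100 target sounds recognized and 190 of 200 other sounds rejected gives an accuracy of 0.95 and a recall of 0.95 (the counts here are illustrative).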
In order to analyze the recognition performance of the improved SVDD when the training data are incorrectly tagged, a contrast experiment is conducted in this study. In the contrast experiment, 200 porcine scream sounds, comprising more than 6000 frames after windowing, are selected as training data; 100 porcine scream sounds and 200 other kinds of sounds (human voices and other kinds of porcine sounds) are used as test data. Taking the porcine scream as an example, the recognition results of SVDD and the improved SVDD with different numbers of tag errors in the training data are shown in Table 3.
Table 3 shows that the accuracies of SVDD and the improved SVDD are the same when the number of tag errors is 0. As the number of tag errors grows, the advantage of the improved SVDD becomes more obvious. The improved SVDD has a higher error-tolerance capability than SVDD, which becomes increasingly likely to misrecognize other kinds of sounds as porcine screams as tag errors accumulate.
In order to intuitively compare the recognition results between SVDD and improved SVDD with different numbers of tag errors of training data, the histograms are shown in Fig 9.
The improved Multi-SVDD consists of two improved SVDDs. In order to test the performance of the improved Multi-SVDD, a comparison experiment is conducted. In the experiment, 200 porcine cough sounds and 200 porcine scream sounds are used as training data; 200 porcine abnormal sounds (100 of each kind) and 100 other kinds of sounds are used as test data.
Before the training of the improved Multi-SVDD, the penalty factor C and the kernel parameter σ need to be specified. Particle Swarm Optimization (PSO) is used to optimize these two parameters. In this paper, the swarm size is 60, the maximum number of iterations is 200, the acceleration coefficients are 1.5 and 2.0, and the inertia factor is 1.0. The negative average recognition accuracy for recognizing porcine abnormal sounds, determined through ten-fold cross-validation, is used as the fitness function. The fitness function of each particle is defined as:

fitness = \frac{1}{k}\sum_{m=1}^{k} e_{m}

where fitness is the fitness function; k is the number of folds in cross-validation, which is 10 in this paper; e_{m} is the negative recognition accuracy of the mth fold, which is defined as follows:

e_{m} = -\frac{N_{m}}{N}

where N_{m} is the number of correctly classified samples in the mth fold; N is the total number of training samples.
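Under one common reading of a negative-accuracy fitness (each fold contributing e_m = −N_m/N), the computation reduces to:

```python
def fitness(fold_correct, n_total):
    """PSO fitness over k cross-validation folds, assuming e_m = -N_m / N.

    fold_correct: list of N_m (correctly classified samples per fold);
    n_total: N, the total number of training samples. Lower is better,
    so PSO minimizing this fitness maximizes the average accuracy.
    """
    k = len(fold_correct)
    return sum(-nm / n_total for nm in fold_correct) / k
```

Minimizing this value over the (C, σ) search space is equivalent to maximizing the average cross-validated recognition accuracy.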
The optimized penalty factor of SVDD1 is 0.9 and that of SVDD2 is 0.25; the optimized kernel parameter of SVDD1 is 1.00 and that of SVDD2 is 7.55. Each sound is separated into many frames after pre-processing, so the recognition result of a sound is a list containing the improved Multi-SVDD's recognition results for all of its frames. The majority voting algorithm [33] is employed to take the most frequent element as the final result. Meanwhile, in order to illustrate the effectiveness of the proposed method, it is compared with other classification methods [17,23,29]. In the contrast experiment, the sound preprocessing and feature extraction of these methods are the same as in this paper. The recognition results are shown in Table 4. In [17], the reference feature parameters of porcine abnormal sounds are determined by Fuzzy C-means (FCM) and the test data are classified by comparison with the reference feature parameters. In [23], the abnormal sounds of pigs are detected by SVDD, and the abnormal sounds are then classified into porcine cough and scream by a Back-propagation Neural Network (BPNN). In [29], the classification method for porcine abnormal sounds is a Support Vector Machine (SVM).
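The majority-voting step over per-frame decisions described above can be sketched as:

```python
from collections import Counter

def majority_vote(frame_labels):
    """Final label of one sound clip: the most frequent per-frame decision."""
    return Counter(frame_labels).most_common(1)[0][0]
```

A clip whose frames are mostly recognized as 'cough' is labeled 'cough' even if a few frames are misclassified, which makes the clip-level decision robust to isolated frame errors.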
It can be seen that the improved Multi-SVDD recognizes porcine abnormal sounds more accurately than the other methods. The reference feature parameters determined by FCM cannot fully characterize the porcine abnormal sounds, which leads to a large detection error. SVDD+BPNN and SVDD+SVM are more likely to misidentify other kinds of sounds as porcine abnormal sounds, and the errors of abnormal-sound detection and abnormal-sound classification are superimposed in these two methods. Porcine cough sounds are easily misrecognized as other kinds of sounds by the plain Multi-SVDD, so its recall for porcine cough is not high enough. These problems are alleviated by the improved Multi-SVDD to a certain extent: its accuracy, average precision and average recall are higher than those of the other three methods. The average recognition time for each sound by the improved Multi-SVDD is 0.0084 s, which is lower than that of SVDD+BPNN and SVDD+SVM, and is basically consistent with that of the plain Multi-SVDD. FCM only requires comparison with reference feature parameters for classification, resulting in a relatively short recognition time. The recognition time of the improved Multi-SVDD is capable of meeting the requirements for real-time performance.
4 Conclusions
In order to recognize porcine abnormal sounds in real-time accurately and stably, an improved Multi-SVDD is proposed in this paper. Based on the experimental results, the following conclusions can be summarized:
- (1). The collected sounds were preprocessed through activity detection, noise removal, endpoint detection and windowing. The noise estimate obtained by traditional SS may be deficient when denoising porcine sounds, so IMCRA is applied to estimate the noise. The improved spectral subtraction combining IMCRA and SS improves the denoising performance during pre-processing.
- (2). After preprocessing, MFCC and first-order differential MFCC were extracted as feature parameters. Compared with other feature parameters, MFCC+ΔMFCC is shown to be better for the recognition of abnormal sounds.
- (3). In order to recognize porcine cough and scream, Multi-SVDD is proposed in this study. In order to improve the tolerance of Multi-SVDD to human errors in tagging training data, the space density information of the training data was calculated as confidence weights to reduce the interference of outliers during Multi-SVDD training. The proposed improved Multi-SVDD shows good performance in terms of accuracy and stability.
In conclusion, the new method proposed in this paper can recognize porcine coughs and screams accurately and stably, and is effective at recognizing specific kinds of sounds among collected unknown sounds. Therefore, this method can also be utilized for the recognition of specific kinds of sounds in other animal species. Moreover, by extracting the corresponding feature parameters, it demonstrates application potential in other anomaly detection scenarios. A drawback of the present work is that mixtures of different types of porcine sounds may cause recognition errors; additionally, when porcine abnormal sounds are detected, the specific location of the abnormal pigs cannot be determined. Future work will examine the influence of the training sample size on the model's recognition accuracy, investigate blind source separation methods to separate mixed sounds, and explore the accurate localization of sound sources following porcine abnormal sound recognition.
References
- 1. Aerts JM, Jans P, Halloy D, Gustin P, Berckmans D. Labeling of cough from pigs for on-line disease monitoring by sound analysis. Trans. ASAE. 2005; 48: 351-354.
- 2. Chung Y, Oh S, Lee J, Park D, Chang H-H, Kim S. Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors (Basel). 2013;13(10):12929–42. pmid:24072029
- 3. Wang D, Wang X, Lv S. An overview of end-to-end automatic speech recognition. Symmetry. 2019;11(8):1018.
- 4. Mustaqeem, Kwon S. A CNN-Assisted Enhanced Audio Signal Processing for Speech Emotion Recognition. Sensors (Basel). 2019;20(1):183. pmid:31905692
- 5. Forti LR, Lingnau R, Encarnação LC, Bertoluci J, Toledo LF. Can treefrog phylogeographical clades and species’ phylogenetic topologies be recovered by bioacoustical analyses?. PLoS One. 2017;12(2):e0169911. pmid:28235089
- 6. Handcock RN, Swain DL, Bishop-Hurley GJ, Patison KP, Wark T, Valencia P, et al. Monitoring Animal Behaviour and Environmental Interactions Using Wireless Sensor Networks, GPS Collars and Satellite Remote Sensing. Sensors (Basel). 2009;9(5):3586–603. pmid:22412327
- 7. Frommolt K-H, Tauchert K-H. Applying bioacoustic methods for long-term monitoring of a nocturnal wetland bird. Ecological Informatics. 2014;21:4–12.
- 8. Xie J, Towsey M, Zhang J, Roe P. Acoustic classification of Australian frogs based on enhanced features and machine learning algorithms. Applied Acoustics. 2016;113:193–201.
- 9. Alonso JB, Cabrera J, Shyamnani R, Travieso CM, Bolaños F, García A, et al. Automatic anuran identification using noise removal and audio activity detection. Expert Systems with Applications. 2017;72:83–92.
- 10. Bishop JC, Falzon G, Trotter M, Kwan P, Meek PD. Livestock vocalisation classification in farm soundscapes. Computers and Electronics in Agriculture. 2019;162:531–42.
- 11. Silva M, Exadaktylos V, Ferrari S, Guarino M, Aerts J-M, Berckmans D. The influence of respiratory disease on the energy envelope dynamics of pig cough sounds. Computers and Electronics in Agriculture. 2009;69(1):80–5.
- 12. Aydin A, Berckmans D. Using sound technology to automatically detect the short-term feeding behaviours of broiler chickens. Computers and Electronics in Agriculture. 2016;121:25–31.
- 13. Deniz NN, Chelotti JO, Galli JR, Planisich AM, Larripa MJ, Leonardo Rufiner H, et al. Embedded system for real-time monitoring of foraging behavior of grazing cattle using acoustic signals. Computers and Electronics in Agriculture. 2017;138:167–74.
- 14. Milone DH, Rufiner HL, Galli JR, Laca EA, Cangiano CA. Computational method for segmentation and classification of ingestive sounds in sheep. Computers and Electronics in Agriculture. 2009;65(2):228–37.
- 15. Moshou D, Chedad A, Hirtum AV, Baerdemaeker JD, Berckmans D, Ramon H. An intelligent alarm for early detection of swine epidemics based on neural networks. Trans. ASAE. 2001; 44: 457-457.
- 16. Cordeiro AF da S, Nääs I de A, Baracho M dos S, Jacob FG, Moura DJ de. The use of vocalization signals to estimate the level of pain in piglets. Eng Agríc. 2018;38(4):486–90.
- 17. Wang X, Yin Y, Dai X, Shen W, Kou S, Dai B. Automatic detection of continuous pig cough in a complex piggery environment. Biosystems Engineering. 2024;238:78–88.
- 18. Pan W, Li H, Zhou X, Jiao J, Zhu C, Zhang Q. Research on pig sound recognition based on deep neural network and hidden markov models. Sensors (Basel). 2024;24(4):1269. pmid:38400427
- 19. Hu X, Tang T, Tan L, Zhang H. Fault detection for point machines: a review, challenges, and perspectives. Actuators. 2023;12(10):391.
- 20. Chen F, Zhao Z, He X, Hu X, Chen J, Liu P, et al. Quantification of abnormal characteristics and flow-patterns identification in pumped storage system. Nonlinear Dyn. 2024;112(23):20813–48.
- 21. Ferrari S, Silva M, Guarino M, Aerts JM, Berckmans D. Cough sound analysis to identify respiratory infection in pigs. Comput Electron Agric. 2008;64(2):318–25.
- 22. Vandermeulen J, Bahr C, Tullo E, Fontana I, Ott S, Kashiha M, et al. Discerning pig screams in production environments. PLoS One. 2015;10(4):e0123111. pmid:25923725
- 23. Zhang S, Tian J, Banerjee A, Li J. Automatic recognition of porcine abnormalities based on a sound detection and recognition system. Trans ASABE. 2019;62(6):1755–65.
- 24. Siam AI, El-khobby HA, Elnaby MMA, Abdelkader HS, El-Samie FEA. A novel speech enhancement method using fourier series decomposition and spectral subtraction for robust speaker identification. Wirel Pers Commun. 2019;108(2):1055–68.
- 25. Pardede H, Ramli K, Suryanto Y, Hayati N, Presekal A. Speech enhancement for secure communication using coupled spectral subtraction and wiener filter. Electronics. 2019;8(8):897.
- 26. Lu C-T, Lei C-L, Shen J-H, Wang L-L, Tseng K-F. Estimation of noise magnitude for speech denoising using minima-controlled-recursive-averaging algorithm adapted by harmonic properties. Appl Sci. 2016;7(1):9.
- 27. Yuan W, Xia B. A speech enhancement approach based on noise classification. Appl Acoust. 2015;96:11–9.
- 28. Hirsch HG, Ehrlicher C. Noise estimation techniques for robust speech recognition. In: Proceedings of the 1995 International Conference on Acoustics, Speech, and Signal Processing, Detroit, MI, USA, 1995. 153–6.
- 29. Wang X, Zhao X, He Y, Wang K. Cough sound analysis to assess air quality in commercial weaner barns. Comput Electron Agric. 2019;160:8–13.
- 30. Sun R, Tsung F. A kernel-distance-based multivariate control chart using support vector methods. Int J Prod Res. 2003;41(13):2975–89.
- 31. Nayak PC, Sudheer KP. Fuzzy model identification based on cluster estimation for reservoir inflow forecasting. Hydrol Process. 2007;22(6):827–41.
- 32. Fu L, Duan J, Zou X, Lin G, Song S, Ji B, et al. Banana detection based on color and texture features in the natural environment. Comput Electron Agric. 2019;167:105057.
- 33. Kurtulmus F, Lee WS, Vardar A. Green citrus detection using ‘eigenfruit’, color and circular Gabor texture features under natural outdoor conditions. Comput Electron Agric. 2011;78(2):140–9.