
Feature fusion method for pulmonary tuberculosis patient detection based on cough sound

  • Wenlong Xu ,

    Contributed equally to this work with: Wenlong Xu, Xiaofan Bao

    Roles Conceptualization, Formal analysis, Project administration, Writing – review & editing

    wenlongxu@cjlu.edu.cn (WX); 761201154@qq.com (XL)

    Affiliation College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China

  • Xiaofan Bao ,

    Contributed equally to this work with: Wenlong Xu, Xiaofan Bao

    Roles Data curation, Writing – original draft, Writing – review & editing

    Affiliation College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China

  • Xiaomin Lou ,

    Roles Formal analysis, Investigation

    wenlongxu@cjlu.edu.cn (WX); 761201154@qq.com (XL)

    Affiliation Hangzhou Red Cross Hospital, Hangzhou, Zhejiang, China

  • Xiaofang Liu,

    Roles Writing – review & editing

    Affiliation College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China

  • Yuanyuan Chen,

    Roles Formal analysis, Funding acquisition, Investigation

    Affiliation Hangzhou Red Cross Hospital, Hangzhou, Zhejiang, China

  • Xiaoqiang Zhao,

    Roles Data curation, Formal analysis

    Affiliation Hangzhou Red Cross Hospital, Hangzhou, Zhejiang, China

  • Chenlu Zhang,

    Roles Data curation, Formal analysis

    Affiliation Hangzhou Red Cross Hospital, Hangzhou, Zhejiang, China

  • Chen Pan,

    Roles Data curation, Methodology, Software

    Affiliation College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China

  • Wenlong Liu,

    Roles Data curation, Resources

    Affiliation College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China

  • Feng Liu

    Roles Conceptualization, Methodology, Supervision

    Affiliation School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, Queensland, Australia

Abstract

Since the COVID-19 pandemic, cough sounds have been widely used for screening purposes, and intelligent analysis techniques have proven effective in detecting respiratory diseases. In 2021, there were up to 10 million TB-infected patients worldwide, with an annual growth rate of 4.5%, most of them from economically underdeveloped regions and countries. The PPD test, a common screening method in the community, has a sensitivity as low as 77%. Although IGRA and Xpert MTB/RIF offer high specificity and sensitivity, their cost makes them less accessible. In this study, we propose a feature fusion model-based cough sound classification method for primary TB screening in communities. Data were collected in hospitals using smartphones, comprising 230 cough sounds from 70 patients with TB and 226 cough sounds from 74 healthy subjects. We employed Bi-LSTM and Bi-GRU recurrent neural networks to analyze five traditional feature sets: the Mel frequency cepstrum coefficients (MFCC), zero-crossing rate (ZCR), short-time energy, root mean square, and chroma_cens. Incorporating features extracted from the speech spectrogram by 2D convolution into the Bi-LSTM model further enhanced the classification results. With traditional features alone, the best TB patient detection result was achieved by the Bi-LSTM model, with 93.99% accuracy, 93.93% specificity, and 92.39% sensitivity. When combined with the speech spectrogram, classification reached 96.33% accuracy, 94.99% specificity, and 98.13% sensitivity. Our findings underscore that traditional features and deep features are strongly complementary when fused in the Bi-LSTM model, which outperforms existing PPD detection methods in both efficiency and accuracy.

Introduction

The World Health Organization reports that tuberculosis (TB) ranks as the second most fatal infectious disease after coronavirus disease 2019 (COVID-19) and is the 13th leading cause of death worldwide. TB is also the "number one killer" of people living with HIV [1]. Most TB cases are recorded in developing and undeveloped countries, where adequate medical facilities for tuberculosis screening and treatment are often unaffordable [2]. Delayed diagnosis, or failure to receive diagnosis and treatment, exacerbates the spread of tuberculosis and imposes a heavier social burden. In South Africa, citizens experience an average 33-day gap between the first TB symptoms and the first TB test; for poorer citizens, this interval increases to 90 days, and one untreated infectious TB patient can infect 10–15 people per year [3]. Early diagnosis is vital for the treatment of TB and the control of transmission. Xpert MTB/RIF, the Interferon Gamma Release Assay (IGRA), and Purified Protein Derivative (PPD) are currently the most widely used screening methods for TB. Xpert MTB/RIF and IGRA have high accuracy; however, they are complex to operate, expensive, and require special laboratory facilities [4–7]. Thus, these methods are unsuitable for large-scale screening. PPD is usually used for community screening due to its simplicity and low cost. Unfortunately, the average sensitivity and specificity of PPD are only 77% and 79%, respectively. A more critical flaw is that the test results are affected by prior BCG vaccination, which can lead to higher total screening costs [8–10]. An accurate, simple, and economical screening method is therefore crucial for the timely diagnosis and control of pulmonary tuberculosis.

Research on the diagnosis of respiratory diseases based on audio signals, such as breath sounds, voice, and cough sounds, has received significant attention, especially since the outbreak of COVID-19. Owing to its non-contact collection characteristics, research on the analysis of cough sounds has grown most rapidly. Auscultation of respiratory sounds is a traditional clinical diagnostic method: sound can reflect changes in lung tissues, organs, and bronchial secretions, and carries signals related to lung and respiratory abnormalities. Deep learning has been widely used in medical classification tasks, particularly the classification of tumors [11,12]. The diagnosis of various types and severities of respiratory diseases can be facilitated by analyzing autonomous or stimulated human sounds, such as breathing and coughing [13]. Breath sound and voice analysis have been used in diagnosing Chronic Obstructive Pulmonary Disease (COPD) [14], COVID-19 [15], and chronic lung virus infections [16]. In addition, cough sounds have been used to discriminate COPD and asthma [17–19], including in children [20,21], and to diagnose TB [2] and COVID-19 [22–24].

The present research

The present study analyzed cough data from hospitals to: i) verify whether pulmonary tuberculosis screening can be achieved based on traditional signal features of cough sounds, ii) explore which model achieves better detection and classification performance, iii) verify whether the traditional features of cough sounds are complementary to the speech spectrogram, and iv) explore whether AI models based on feature fusion can achieve more effective pulmonary tuberculosis screening than current PPD methods.

Related work

Screening for lung disease based on acoustic signals

Recently, acoustic signals have been widely used for the screening and evaluation of lung diseases. Compared with traditional radiography, CT, spirometry, and bronchoscopy, acoustic screening is simpler and less expensive to perform and reduces exposure, thereby helping to prevent the spread of infectious diseases. Grant [15] proposed a method to detect COVID-19 using speech and breath sounds, extracting Mel Frequency Cepstrum Coefficient (MFCC) and RASTA-PLP features from 81 COVID-19 patients and 1118 subjects without COVID-19, and classifying them with random forests and deep neural networks. The best results were an AUC of 0.7938 for speech detection and 0.7575 for breath sound detection. Khan [16] used breath sounds to classify chronic lung viruses by extracting five non-linear dynamic system features and inputting them into a K-nearest neighbor classifier, obtaining 99.4% accuracy, 99.99% sensitivity, 97.82% specificity, and 99% precision after 5-fold cross-validation.

Screening for lung disease based on cough sound

Cough is the most common symptom of many respiratory diseases. Combined with artificial intelligence technology, especially deep learning, there is increasing research on classifying lung diseases based on cough sounds. Features of cough sound signals such as non-Gaussianity, zero-crossing rate (ZCR), Mel cepstra, MFCC, zero-crossing irregularity, and symptoms are widely used for the screening of lung diseases. Different artificial intelligence models, such as logistic regression (LR) classifiers and support vector machines (SVM), are selected according to the number of samples and features. Long short-term memory networks (LSTM) and bidirectional long short-term memory networks (Bi-LSTM) are typically used to classify audio signals, while GoogleNet, ResNet18, ResNet50, and ResNet101 are commonly used in image and spectrogram-based classification. Infante [17] extracted the zero-crossing irregularity and rate-of-decay features of cough sounds and then used an LR model to classify patients with different respiratory diseases, obtaining an AUC of 0.94. Liao [13] extracted MFCC features and used an SVM to achieve an accuracy of up to 86.04% with a standard deviation of 4.7% for lung disease classification. Balamurali [20] extracted MFCCs and used Bi-LSTM to identify asthma, upper respiratory tract infections, and lower respiratory tract infections. Loey [23] transformed cough sounds into spectrograms and used GoogleNet, ResNet18, ResNet50, ResNet101, MobileNetv2, and NasNetmobile models to screen for COVID-19, finally obtaining 94.44% sensitivity and 95.37% specificity with ResNet18. Abeyratne [25] extracted non-Gaussianity and Mel cepstra and obtained a sensitivity of 94% and specificity of 75% using LR.

Tuberculosis screening based on cough sounds

Tuberculosis is an infectious disease. Botha [3] classified 17 patients with TB and 21 healthy subjects using LR on MFCC features extracted from cough sounds, with an accuracy of 78% and an AUC of 0.94. Pahar [26] classified 16 TB patients and 35 healthy subjects with 95% specificity and 93% sensitivity. Pahar [27] differentiated 47 patients with TB, 229 patients with COVID-19, and 1498 healthy subjects using CNN, LSTM, and ResNet50 models. The best result was achieved with the ResNet50 model: 92.59% accuracy in differentiating TB from COVID-19 and 86.31% accuracy in the three-class task. Frost [2] used a modified Bi-LSTM to classify cough sounds from 28 patients with TB and 46 healthy subjects using mel-spectrograms, linear filter-bank energies, and MFCCs. Although these studies have demonstrated and encouraged the possibility of using cough sounds for the diagnosis of pulmonary tuberculosis, to the best of our knowledge, the literature so far often has two shortcomings: (1) the amount of tuberculosis patient data used was less than 50 cases, and (2) mainly traditional feature-based analysis or machine learning methods were used.

Methods

The experimental pipelines before and after feature fusion are shown in Figs 1 and 2. The experiment in Fig 1 uses traditional feature extraction methods for classification. The experiment in Fig 2 combines the deep features obtained by training on spectrograms with those obtained by training on traditional features; this combined approach uses a classification layer for the final classification.

Dataset

The dataset was collected using a smartphone at the Hangzhou Red Cross Hospital, Hangzhou, Zhejiang, China. The study was approved by the Ethics Committee of Hangzhou Red Cross Hospital (approval no. [2023] (025), 2023.2.20). The cough sounds, sex, age, and weight of each subject were collected. The collected data were de-identified, and patients were distinguished by anonymized ID numbers. During collection, the microphone of the mobile device was placed approximately 30 to 40 cm from the mouth and angled upwards at approximately 45°. Data collection was conducted in a quiet environment under the guidance of an onsite doctor. Each participant was required to cough three or more times, with sufficient time between consecutive coughs to allow a full breath before each cough. A total of 144 cough sound files were collected from 70 patients with confirmed TB and 74 healthy individuals. The composition of the data objects is presented in Table 1. The average duration of each cough sound recording was 20 seconds. Audacity software was used to extract 456 cough sound segments from the 144 recordings, with each segment lasting 0.35 seconds. The process is shown in Fig 3, and the composition of the cough sound segments is listed in Table 2. The ratio of the training set to the test set was 8:2.
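The paper segments recordings manually in Audacity; as a rough illustration of the same step, the sketch below (a hypothetical amplitude-threshold onset detector, not the authors' procedure) cuts fixed 0.35 s segments out of a longer recording:

```python
import numpy as np

def split_into_segments(signal, sr, seg_dur=0.35, amp_thresh=0.05):
    """Cut fixed-length segments starting at the first sample whose amplitude
    exceeds a threshold (illustrative stand-in for manual segmentation)."""
    seg_len = int(seg_dur * sr)
    segments, i = [], 0
    while i + seg_len <= len(signal):
        if abs(signal[i]) > amp_thresh:        # onset found: cut one segment
            segments.append(signal[i:i + seg_len])
            i += seg_len                       # jump past this cough
        else:
            i += 1
    return segments

# 20 s synthetic recording at 16 kHz with three "coughs" (noise bursts)
sr = 16000
rng = np.random.default_rng(0)
rec = np.zeros(20 * sr)
for start in (2 * sr, 8 * sr, 14 * sr):
    rec[start:start + int(0.35 * sr)] = rng.normal(0, 0.5, int(0.35 * sr))
segs = split_into_segments(rec, sr)
print(len(segs), len(segs[0]))  # 3 segments of 0.35 s → 5600 samples each
```

With a 16 kHz sampling rate each 0.35 s segment is 5600 samples; the real sampling rate of the smartphone recordings is not stated in the text.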

Feature extraction

In the literature, MFCC and ZCR have been widely used for cough sound classification. The experiment divided each cough sound segment into 31 frames and extracted 44 MFCCs and one ZCR value per frame. Then, the Bi-LSTM model was applied for training and classification. The results showed that MFCC and ZCR features alone could not achieve good classification. Subsequently, the short-time energy (TSE), RMS, and chroma_cens features were added individually to the experiment (Table 3). The speech spectrogram of each cough sound segment was also generated and classified using a convolutional neural network [28]. Finally, we proposed a method that fuses the features extracted by both approaches [29]: they were trained with recurrent and convolutional neural networks, respectively, and the deep features obtained after training were fused.
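Three of the five traditional features (ZCR, short-time energy, RMS) can be computed per frame with plain NumPy. A minimal sketch, assuming the 31 frames per 0.35 s segment described above (MFCC and chroma_cens require filterbanks and are omitted here):

```python
import numpy as np

def frame_features(segment, n_frames=31):
    """Split one cough segment into frames and compute three per-frame
    features: zero-crossing rate, short-time energy, and RMS."""
    frames = np.array_split(segment, n_frames)
    feats = []
    for f in frames:
        zcr = np.mean(np.abs(np.diff(np.sign(f))) / 2)  # zero-crossing rate
        energy = np.sum(f ** 2)                          # short-time energy
        rms = np.sqrt(np.mean(f ** 2))                   # root mean square
        feats.append((zcr, energy, rms))
    return np.array(feats)                               # shape (n_frames, 3)

seg = np.sin(2 * np.pi * 440 * np.arange(5600) / 16000)  # 0.35 s test tone
F = frame_features(seg)
print(F.shape)  # (31, 3)
```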

MFCC extraction.

With the steps of pre-emphasis, frame blocking, windowing, FFT, spectral line energy calculation, and DCT, 44 MFCC features were calculated for each cough sound frame, the final DCT taking the standard form:

(1) M_i = sqrt(2/J) · Σ_{j=1}^{J} log|X_j| · cos(π · i · (j − 0.5) / J)

where i = 1, 2, …, 44, M_i denotes the i-th MFCC coefficient, j = 1, 2, …, J indexes the frequencies at equal intervals of each frame signal, and |X_j| denotes the amplitude at frequency j.
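A simplified sketch of this final DCT step: it takes the log amplitude spectrum of one frame straight to 44 coefficients, skipping the mel filterbank and pre-emphasis stages of a full MFCC pipeline, so it is structural illustration only:

```python
import numpy as np
from scipy.fft import rfft, dct

def mfcc_like(frame, n_coeffs=44):
    """MFCC-style coefficients: windowed FFT, log magnitude, DCT-II.
    (A full MFCC pipeline inserts a mel filterbank before the log.)"""
    spectrum = np.abs(rfft(frame * np.hanning(len(frame))))
    log_amp = np.log(spectrum + 1e-10)   # small offset avoids log(0)
    return dct(log_amp, type=2, norm='ortho')[:n_coeffs]

frame = np.random.default_rng(1).normal(size=180)  # one of 31 frames of a segment
coeffs = mfcc_like(frame)
print(coeffs.shape)  # (44,)
```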

ZCR extraction.

We extracted the ZCR of each cough sound frame by counting the number of times the sampled values cross zero, as shown in Eqs (2) and (3):

(2) sign(x(n)) = 1 if x(n) ≥ 0; −1 if x(n) < 0

(3) Z = (1/2) Σ_{n=1}^{N−1} |sign(x(n)) − sign(x(n−1))|

where x(n) denotes the sample value of the nth sample point, sign(·) denotes the sign function applied to x(n), and Z denotes the zero-crossing count of the frame; the ZCR of the entire audio signal is calculated accordingly.

Chroma_cens extraction.

Each cough sound frame was Fourier transformed to obtain the frequency-domain signal, which was then divided into 12 equal-width frequency bands. For each band, the sum of the energies within the band was calculated, and the logarithm of the energy was taken to obtain the chroma index. Finally, a sliding average was applied to each chroma index to obtain the chroma_cens feature. The formula is shown in Eq (4): (4) where t denotes the time frame, p denotes the chroma note, B(p) is the frequency band corresponding to chroma note p, Ei(t) is the amplitude of frequency band i in time frame t, H is a band-pass filter, cn denotes the reference chroma note, and w is a Hann window function.

Conversion method of spectrogram.

A Hanning window with a width of 256 [30] was applied to the cough sound segment, and each windowed signal was fast Fourier transformed. The modulus of the Fourier coefficients was then computed and its logarithm taken. Finally, the logarithmic values of all windows were stacked to obtain the speech spectrogram. The corresponding formulas are given in Eqs (5) and (6):

(5) X_i(f) = Σ_{n=0}^{N−1} x_i(n) · w(n) · e^{−j2πfn/N}

(6) S_i(f) = log|X_i(f)|

where f is the frequency, N is the number of FFT points, x_i(n) is the ith windowed signal, w(n) is the window function, j is the imaginary unit, and |X_i(f)| is the amplitude spectrum.
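The window-FFT-log-stack procedure of Eqs (5) and (6) can be sketched directly in NumPy. The hop size of 128 (50% overlap) is an assumption; the text only specifies the 256-point Hanning window:

```python
import numpy as np

def log_spectrogram(signal, win_len=256, hop=128):
    """Log-magnitude spectrogram: slide a Hanning window, FFT each window,
    take the log of the modulus, and stack the columns."""
    win = np.hanning(win_len)
    cols = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = signal[start:start + win_len] * win
        mag = np.abs(np.fft.rfft(seg))          # modulus of Fourier coefficients
        cols.append(np.log(mag + 1e-10))        # logarithm, Eq (6)
    return np.stack(cols, axis=1)               # shape (freq_bins, time_frames)

sig = np.sin(2 * np.pi * 1000 * np.arange(5600) / 16000)  # 1 kHz tone at 16 kHz
S = log_spectrogram(sig)
print(S.shape)  # (129, 42): 129 frequency bins, 42 time frames
```

The 1 kHz tone lands exactly in bin 16 (1000 · 256 / 16000), so each column of `S` peaks there.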

Evaluation indicators

The classification results were compared using three indicators, accuracy, specificity, and sensitivity, and the ROC curve was used to measure model performance. The formulas are as follows:

(7) Accuracy = (TP + TN) / (TP + TN + FP + FN)

(8) Specificity = TN / (TN + FP)

(9) Sensitivity = TP / (TP + FN)

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.
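The three indicators follow directly from the confusion matrix; a minimal helper, with made-up counts purely for illustration:

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, specificity, and sensitivity as in Eqs (7)-(9)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    return accuracy, specificity, sensitivity

# Illustrative confusion counts (hypothetical, not the paper's results)
acc, spec, sens = metrics(tp=45, tn=43, fp=3, fn=2)
print(round(acc, 4), round(spec, 4), round(sens, 4))  # 0.9462 0.9348 0.9574
```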

Cough sound classification model

Bidirectional long short-term memory (Bi-LSTM), LSTM, bidirectional gated recurrent (Bi-GRU), and gated recurrent (GRU) networks are commonly used time-series models in cough sound classification research and were selected for this study. As we previously verified that the best classification results were achieved with the five features used simultaneously, the classification performance of these models was evaluated under that setting. The scheme is shown in Fig 4, and the specific structure is shown in Table 4. In addition, the two-dimensional convolutional neural network (Conv2D) is a commonly used image classification model. As the data size in this study was limited, a relatively basic Conv2D model was used. The model diagram is shown in Fig 5, and the specific structure is listed in Table 5.
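The "bidirectional" idea behind Bi-LSTM and Bi-GRU can be illustrated with a toy NumPy recurrent cell (a plain tanh RNN stand-in, not the gated cells actually used): the frame sequence is processed forward and backward, and the two hidden-state sequences are concatenated per frame so every output sees both past and future context:

```python
import numpy as np

def simple_rnn(X, Wx, Wh, b):
    """Vanilla tanh RNN over a (T, d) feature sequence; returns (T, h) states."""
    h = np.zeros(Wh.shape[0])
    out = []
    for x in X:
        h = np.tanh(Wx @ x + Wh @ h + b)
        out.append(h)
    return np.array(out)

def bidirectional(X, params_f, params_b):
    """Forward pass plus reversed pass, concatenated per frame."""
    fwd = simple_rnn(X, *params_f)
    bwd = simple_rnn(X[::-1], *params_b)[::-1]
    return np.concatenate([fwd, bwd], axis=1)  # (T, 2h)

rng = np.random.default_rng(2)
T, d, h = 31, 45, 16   # 31 frames; 44 MFCCs + 1 ZCR; hidden size 16 (assumed)
X = rng.normal(size=(T, d))
pf = (rng.normal(size=(h, d)) * 0.1, rng.normal(size=(h, h)) * 0.1, np.zeros(h))
pb = (rng.normal(size=(h, d)) * 0.1, rng.normal(size=(h, h)) * 0.1, np.zeros(h))
H = bidirectional(X, pf, pb)
print(H.shape)  # (31, 32)
```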

Tuberculosis detection model based on feature fusion

To improve classification performance, we propose a feature fusion method that fuses the deep features obtained from the five traditional features trained by a recurrent neural network with the features extracted from the speech spectrogram trained by the convolutional neural network. A fully connected layer was then used for classification. The model diagram is shown in Fig 6, and its specific structure is shown in Table 6. Finally, classification performance was evaluated by comparing specificity, sensitivity, and accuracy, and the model was assessed using ROC curves.
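The fusion step itself amounts to concatenating the two branches' feature vectors and applying a fully connected softmax layer. A NumPy sketch with random stand-in weights; the branch dimensions 32 and 64 are illustrative assumptions, not the paper's layer sizes:

```python
import numpy as np

def fuse_and_classify(rnn_feat, cnn_feat, W, b):
    """Concatenate the deep features from the two branches and apply a
    fully connected softmax layer (weights here are untrained stand-ins)."""
    z = np.concatenate([rnn_feat, cnn_feat])
    logits = W @ z + b
    e = np.exp(logits - logits.max())          # numerically stable softmax
    return e / e.sum()                         # [P(healthy), P(TB)]

rng = np.random.default_rng(3)
rnn_feat = rng.normal(size=32)                 # Bi-LSTM branch output
cnn_feat = rng.normal(size=64)                 # Conv2D branch output
W, b = rng.normal(size=(2, 96)) * 0.1, np.zeros(2)
probs = fuse_and_classify(rnn_feat, cnn_feat, W, b)
print(probs.shape, round(float(probs.sum()), 6))  # (2,) 1.0
```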

Results

Four models, Bi-LSTM, Bi-GRU, LSTM, and GRU, were trained with the five traditional features, MFCC, ZCR, TSE, RMS, and chroma_cens, of each cough sound segment. The classification results, averaged over 10 random partition validations, are shown in Table 7. Bi-LSTM performed better than the other three models in terms of accuracy and specificity, whereas Bi-GRU achieved the best sensitivity. Owing to the relatively small data size, a simpler two-dimensional convolution was used to classify the speech spectrogram. The Bi-LSTM and Conv2D models were selected for training and classification after feature fusion. The classification results are listed in Table 8. Fusing the features extracted by both approaches yielded 96.33% accuracy, 94.99% specificity, and 98.96% sensitivity; compared with using either feature set alone, every metric improved. The ROC curves of each model are shown in Fig 7, where the AUC of the feature fusion model was 0.95, and Bi-LSTM reached the best value of 0.92 among the single-feature-set models. The experimental results show that including speech spectrogram features improves classification performance.

Fig 7.

ROC curves of each model: (a) Bi-GRU, (b) Bi-LSTM, (c) LSTM, (d) GRU, (e) Conv2D, and (f) the feature fusion model.

https://doi.org/10.1371/journal.pone.0302651.g007

Table 7. Classification results of audio features by 4 models.

https://doi.org/10.1371/journal.pone.0302651.t007

Table 8. Comparison of classification results after adding the Spectrogram.

https://doi.org/10.1371/journal.pone.0302651.t008

Discussion

A comparative study was conducted to explore artificial intelligence (AI) classification for detecting TB through cough sounds. The present study, with a larger dataset than most works published in the literature, examined both traditional and deep-learning features individually and in combination, and evaluated several AI classification models for comparison. The study has limitations, mainly related to data size: compared with datasets in the image domain of deep learning, this dataset may be insufficient for deep learning to achieve maximal accuracy and robustness. The data in this study were collected with environmental noise controlled, and practical application may be affected by environmental noise. In addition, there could be variation when applying this method to different populations across various regions, so further investigation into these potential differences should be conducted. Moreover, the effect of averaging approximately three coughs from the same subject on the results should be evaluated. Despite these challenges, the results indicate that cough-based tuberculosis screening could become a viable community screening tool in the future, offering high accuracy and low cost for primary screening.

Conclusion

This study presents a novel TB screening technique for community settings. It employs a feature fusion-based approach to differentiate the cough sounds of patients with TB from those of subjects without respiratory diseases. When evaluated on accuracy, specificity, and sensitivity, the proposed method outperforms the currently used PPD screening method. Considering factors such as cost and application scenario, the new method has great potential for use. In conclusion, this technique is suitable for primary community TB screening, offering significant cost benefits and convenience compared with existing TB screening methods.

References

  1. Global Tuberculosis Report 2022. Available online at https://www.who.int/publications/i/item/9789240061729.
  2. Frost G., Theron G., Niesler T., TB or not TB? Acoustic cough analysis for tuberculosis classification. 2022.
  3. Botha G.H.R., Theron G., Warren R.M., Klopper M., Dheda K., van Helden P.D., et al., Detection of tuberculosis by automatic cough sound analysis. Physiol Meas, 2018. 39(4): p. 045005. pmid:29543189
  4. Truden S., Sodja E., Zolnir-Dovc M., Drug-Resistant Tuberculosis on the Balkan Peninsula: Determination of Drug Resistance Mechanisms with Xpert MTB/XDR and Whole-Genome Sequencing Analysis. Microbiology Spectrum. pmid:36877052
  5. Zimba O., Tamuhla T., Basotli J., Letsibogo G., Pals S., Mathebula U., et al., The effect of sputum quality and volume on the yield of bacteriologically-confirmed TB by Xpert MTB/RIF and smear. Pan Afr Med J, 2019. 33: p. 110. pmid:31489088
  6. Yang Y., Wang H.J., Hu W.L., Bai G.N., Hua C.Z., Diagnostic Value of Interferon-Gamma Release Assays for Tuberculosis in the Immunocompromised Population. Diagnostics, 2022. 12(2). pmid:35204544
  7. Peña M.C., Latent tuberculosis: current diagnosis and treatment [Tuberculosis latente: diagnóstico y tratamiento actual]. Revista chilena de enfermedades respiratorias, 2022. 38(2): p. 123–130.
  8. Guo X.A., Du W.X., Li J.L., Dong J.X., Shen X.B., Su C., et al., A Comparative Study on the Mechanism of Delayed-Type Hypersensitivity Mediated by the Recombinant Mycobacterium tuberculosis Fusion Protein ESAT6-CFP10 and Purified Protein Derivative. International Journal of Molecular Sciences, 2023. 24(23). pmid:38068935
  9. Liu Z., Diao S., Zeng L.A., Liu D., Jiao X.F., Chen Z., et al., Recombinant Mycobacterium tuberculosis fusion protein for diagnosis of Mycobacterium tuberculosis infection: a short-term economic evaluation. Frontiers in Public Health, 2023. 11. pmid:37206861
  10. Kabeer B.S.A., Raman B., Thomas A., Perumal V., Raja A., Role of QuantiFERON-TB Gold, Interferon Gamma Inducible Protein-10 and Tuberculin Skin Test in Active Tuberculosis Diagnosis. PLOS ONE, 2010. 5(2).
  11. Yousef R., Khan S., Gupta G., Siddiqui T., Albahlal B.M., Alajlan S.A., et al., U-Net-Based Models towards Optimal MR Brain Image Segmentation. Diagnostics, 2023. 13(9): p. 1624. pmid:37175015
  12. Haq M.A., Khan I., Ahmed A., Eldin S.M., Alshehri A., Ghamry N.A., DCNNBT: A novel deep convolution neural network-based brain tumor classification model. Fractals, 2023: p. 2340102.
  13. Liao S., Song C., Wang X., Wang Y., A classification framework for identifying bronchitis and pneumonia in children based on a small-scale cough sounds dataset. PLOS ONE, 2022. 17: p. e0275479. pmid:36301797
  14. Radogna A.V., Fiore N., Tumolo M.R., De Luca V., De Paolis L.T., Guarino R., et al., Exhaled breath monitoring during home ventilo-therapy in COPD patients by a new distributed tele-medicine system. Journal of Ambient Intelligence and Humanized Computing, 2021. 12: p. 4419–4427.
  15. Grant D., McLane I., West J., Rapid and Scalable COVID-19 Screening using Speech, Breath, and Cough Recordings. 2021: p. 1–6.
  16. Khan M.U., Farman A., Rehman A.U., Israr N., Ali M.Z.H., Gulshan Z.A., Automated System Design for Classification of Chronic Lung Viruses using Non-Linear Dynamic System Features and K-Nearest Neighbour. 2021: p. 1–8.
  17. Infante C., Chamberlain D.B., Kodgule R., Fletcher R.R., Classification of voluntary coughs applied to the screening of respiratory disease. 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017.
  18. Infante C., Chamberlain D., Fletcher R., Thorat Y., Kodgule R., Use of cough sounds for diagnosis and screening of pulmonary disease. 2017 IEEE Global Humanitarian Technology Conference (GHTC), 2017.
  19. Knocikova J., Korpas J., Vrabec M., Javorka M., Wavelet analysis of voluntary cough sound in patients with respiratory diseases. Journal of Physiology and Pharmacology, 2008. 59: p. 331–340. pmid:19218657
  20. Balamurali B.T., Hee H.I., Kapoor S., Teoh O.H., Teng S.S., Lee K.P., et al., Deep Neural Network-Based Respiratory Pathology Classification Using Cough Sounds. Sensors, 2021. 21(16). pmid:34450996
  21. Swarnkar V., Abeyratne U., Tan J., Ng T.W., Brisbane J.M., Choveaux J., et al., Stratifying asthma severity in children using cough sound analytic technology. Journal of Asthma, 2021. 58(2): p. 160–169. pmid:31638844
  22. Pahar M., Klopper M., Warren R., Niesler T., COVID-19 cough classification using machine learning and global smartphone recordings. Comput Biol Med, 2021. 135: p. 104572. pmid:34182331
  23. Loey M., Mirjalili S., COVID-19 cough sound symptoms classification from scalogram image representation using deep learning models. Comput Biol Med, 2021. 139: p. 105020.
  24. Andreu-Perez J., Perez-Espinosa H., Timonet E., Kiani M., Giron-Perez M.I., Benitez-Trinidad A.B., et al., A Generic Deep Learning Based Cough Analysis System From Clinically Validated Samples for Point-of-Need Covid-19 Test and Severity Levels. IEEE, 2022. pmid:35936760
  25. Abeyratne U.R., Swarnkar V., Setyati A., Triasih R., Cough sound analysis can rapidly diagnose childhood pneumonia. Ann Biomed Eng, 2013. 41(11): p. 2448–62. pmid:23743558
  26. Pahar M., Klopper M., Reeve B., Warren R., Theron G., Niesler T., Automatic cough classification for tuberculosis screening in a real-world environment. Physiological Measurement, 2021. 42: p. 105014.
  27. Pahar M., Klopper M., Reeve B., Warren R., Theron G., Diacon A., et al., Automatic Tuberculosis and COVID-19 cough classification using deep learning. 2022: p. 1–9.
  28. Mulimani M., Koolagudi S.G., Acoustic Event Classification Using Spectrogram Features, in Proceedings of TENCON 2018 - 2018 IEEE Region 10 Conference. 2018. p. 1460–1464.
  29. Sudha D., Ramakrishna M., Comparative Study of Features Fusion Techniques, in 2017 International Conference on Recent Advances in Electronics and Communication Technology (ICRAECT). 2017. p. 235–239.
  30. Murphy D.T., Ioup E., Hoque M.T., Abdelguerfi M., Residual Learning for Marine Mammal Classification. IEEE Access, 2022. 10: p. 118409–118418.