The authors would like to thank Cambridge University for sharing the data collected via the COVID-19 app, and the Indian Institute of Science Bangalore for making the Coswara dataset openly available, which supported our model validation.
The authors declare that they have no conflicts of interest.
Ethical approval
This study was approved by the Ethical Review Committee Inner City Faculties (ERCIC) under reference number ERCIC_528_31_01_2024.
Consent to participate
Informed consent was obtained by the providers of the datasets used in this study.
Consent to publication
Not applicable.
Availability of data and materials
The Cambridge dataset is available under a data-sharing agreement with Cambridge University. The Coswara dataset is public and can be accessed online at https://github.com/iiscleap/Coswara-Data.
Funding
The work was supported by NWO Aspasia grant [91716421]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.