Original Article
Deepfake Voice Detection Techniques for Cybercrime Prevention and Secure Digital Communication
Gowthaman M1
Gowri Shankari S2
Ishwariya N3
Janamithran K4
Sowndarya V5
1, 2, 3, 4 B.E Computer Science and Engineering (Cyber Security), United Institute of Technology, Coimbatore, Tamilnadu, India.
5 Assistant Professor, Department of Computer Science and Engineering (Cyber Security), United Institute of Technology, Coimbatore, Tamilnadu, India.
Published Online: May-August 2026
Pages: 145-150
DOI: https://www.doi.org/10.59256/indjcst.20260502016
References
1. M. Todisco, H. Delgado, and N. Evans, "A New Feature for Automatic Speaker Verification Anti-Spoofing: Constant Q Cepstral
Coefficients," in Proc. IEEE Odyssey, 2017, pp. 283–290.
2. ASVspoof Consortium, "ASVspoof 2019: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan,"
2019.
3. X. Wang, J. Yamagishi, M. Todisco, H. Delgado, and N. Evans, "ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the
Wild," in Proc. IEEE ASRU, 2021, pp. 1–8.
4. T. Kinnunen et al., "The ASVspoof 2017 Challenge: Assessing the Limits of Replay Spoofing Attack Detection," in Proc. Interspeech, 2017,
pp. 2–6.
5. D. Snyder, G. Chen, and D. Povey, "MUSAN: A Music, Speech, and Noise Corpus," arXiv: 1510.08484, 2015.
6. V. Panayotov, G. Chen, D. Povey, and S. Khudanpur, "LibriSpeech: An ASR Corpus Based on Public Domain Audio Books," in Proc. IEEE
ICASSP, 2015, pp. 5206–5210.
7. Y. Jia et al., "Transfer Learning from Speaker Verification to Multispeaker Text-to-Speech Synthesis," in Proc. NeurIPS, 2018, pp. 4485–
4495.
8. A. Oord et al., "WaveNet: A Generative Model for Raw Audio," arXiv: 1609.03499, 2016.
9. J. Donahue et al., "Adversarial Audio Synthesis," in Proc. ICLR, 2019.
10. K. Kumar et al., "MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis," in Proc. NeurIPS, 2019.
11. S. Pascual, A. Bonafonte, and J. Serra, "SEGAN: Speech Enhancement Generative Adversarial Network," in Proc. Interspeech, 2017, pp.
3642–3646.
12. A. Lavrentyeva et al., "STC Anti-Spoofing Systems for the ASVspoof 2019 Challenge," in Proc. Interspeech, 2019, pp. 1033–1037.
13. H. Tak et al., "End-to-End Anti-Spoofing with RawNet2," in Proc. IEEE ICASSP, 2021, pp. 6369–6373.
14. Y. Zhang, F. Jiang, and Z. Duan, "One-Class Learning Towards Synthetic Voice Spoofing Detection," IEEE Signal Processing Letters, vol.
28, 2021, pp. 937–941.
15. X. Liu, X. Wu, and H. Meng, "A Light CNN for Deepfake Speech Detection," in Proc. Interspeech, 2020, pp. 971–975.
16. K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in Proc. IEEE CVPR, 2016, pp. 770–778.
17. S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory," Neural Computation, vol. 9, no. 8, 1997, pp. 1735–1780.
18. T. Chen and C. Guestrin, "XGBoost: A Scalable Tree Boosting System," in Proc. ACM KDD, 2016, pp. 785–794.
19. L. Breiman, "Random Forests," Machine Learning, vol. 45, no. 1, 2001, pp. 5–32.
20. C. Cortes and V. Vapnik, "Support-Vector Networks," Machine Learning, vol. 20, no. 3, 1995, pp. 273–297.
21. A. Vaswani et al., "Attention Is All You Need," in Proc. NeurIPS, 2017, pp. 5998–6008.
22. J. Villalba et al., "State-of-the-Art Speaker Recognition with Neural Network Embeddings," in Proc. IEEE ICASSP, 2020, pp. 7184–7188.
23. Z. Wu et al., "Spoofing and Countermeasures for Speaker Verification: A Survey," Speech Communication, vol. 66, 2015, pp. 130–153.
24. H. Delgado et al., "Further Investigations on Deepfake Speech Detection," in Proc. Interspeech, 2020, pp. 2987–2991.
25. J. Patino et al., "Deep Learning-Based Countermeasures for Anti-Spoofing," IEEE Trans. Inform. Forensics Security, vol. 17, 2022, pp. 280–295.