Spectrogram-Based Classification of Spoken Foul Language Using Deep CNN

Abdulaziz Saleh Ba Wazir, Hezerul Abdul Karim, Mohd Haris Lye Abdullah, Sarina Mansor, Nouar Aldahoul, Mohammad Faizal Ahmad Fauzi, John See

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)


Excessive profanity in audio and video content has been shown to shape a person's character and behavior. Detection and censorship are currently performed manually, which is time-consuming and prone to missed detections of foul language. This paper proposes an intelligent model for foul language censorship through automated and robust detection using deep convolutional neural networks (CNNs). A dataset of foul language was collected and processed to compute audio spectrogram images, which serve as the input for evaluating foul language classification. The proposed model was first tested on a 2-class (Foul vs Normal) classification problem; the foul class was then further decomposed into a 10-class classification problem for exact detection of profanity. Experimental results show the viability of the proposed system, demonstrating high performance in curse word classification with 1.24-2.71 Error Rate (ER) for 2-class and 5.49-8.30 F1-score. The proposed ResNet50 architecture outperforms the other models in terms of accuracy, sensitivity, specificity, and F1-score.
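The evaluation measures named in the abstract (accuracy, sensitivity, specificity, F1-score, and error rate) can all be derived from a confusion matrix for the 2-class (Foul vs Normal) case. The following is a minimal sketch, not the authors' evaluation code: the function name `binary_metrics`, the label strings, and the toy predictions are hypothetical, and error rate is assumed here to mean 1 minus accuracy, which the abstract does not define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    f1: float
    error_rate: float


def binary_metrics(y_true, y_pred, positive="foul"):
    """Compute 2-class metrics, treating `positive` as the foul class.

    Hypothetical helper for illustration; labels are compared by equality.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, len(pairs))
    return BinaryMetrics(
        accuracy=accuracy,
        sensitivity=ratio(tp, tp + fn),      # recall on the foul class
        specificity=ratio(tn, tn + fp),      # recall on the normal class
        f1=ratio(2 * tp, 2 * tp + fp + fn),  # harmonic mean of precision and recall
        error_rate=1.0 - accuracy,           # assumption: ER = 1 - accuracy
    )


# Toy labels, invented purely to exercise the function.
y_true = ["foul", "foul", "foul", "normal", "normal", "normal", "normal", "normal"]
y_pred = ["foul", "foul", "normal", "normal", "normal", "normal", "normal", "foul"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For the 10-class case, the same quantities would typically be computed per class (one class against the rest) and then averaged; which averaging scheme the paper uses is not stated in the abstract.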

Original language: English
Title of host publication: 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)
ISBN (Electronic): 9781728193205
Publication status: Published - 16 Dec 2020
Event: 22nd IEEE International Workshop on Multimedia Signal Processing 2020 - Virtual, Tampere, Finland
Duration: 21 Sept 2020 to 24 Sept 2020


Conference: 22nd IEEE International Workshop on Multimedia Signal Processing 2020
Abbreviated title: MMSP 2020
City: Virtual, Tampere


Keywords

  • Censorship
  • CNN
  • Foul language
  • Spectrogram
  • Speech detection

ASJC Scopus subject areas

  • Signal Processing
  • Media Technology

