Using Neural Nets for Audio Event Detection

Audio event detection, as the name suggests, is the task of detecting one or more audio events in an audio clip of a certain duration. In this post, we limit our discussion to a single audio event per clip, with a fixed clip duration of 4 seconds.

For example, here is an audio clip containing the audio event "traffic":

The Idea

The literature has a well-established pipeline for the task:

The state-of-the-art configuration before the neural-network era looked like this:

Feature transformation in the above configuration has been particularly problematic, for reasons like this one:

That’s where 1-D Convolutional Neural Networks could be a game changer:

1-D kernels could bring out the most important frames, which would then survive the max-pool reduction layer:
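To make the idea concrete, here is a minimal sketch of a single 1-D kernel sliding over time, followed by a max-pool over all positions. Everything in it is an assumption for illustration: the toy frames, the hand-picked kernel weights (a real CNN learns them), the ReLU, and the choice of pooling over the whole time axis. It is not the model from the thesis.

```python
def conv1d_valid(frames, kernel):
    """Slide one 1-D kernel along the time axis (stride 1, no padding).

    frames: list of T feature vectors, each of length D
    kernel: list of K weight vectors, each of length D
    Returns T - K + 1 ReLU activations, one per kernel position.
    """
    T, K = len(frames), len(kernel)
    activations = []
    for t in range(T - K + 1):
        acc = 0.0
        for k in range(K):
            acc += sum(w * x for w, x in zip(kernel[k], frames[t + k]))
        activations.append(max(acc, 0.0))  # ReLU
    return activations


def max_pool_over_time(activations):
    """Keep only the strongest response, wherever in the clip it occurred."""
    return max(activations)


# Toy clip: 6 frames of 2 features; the "event" sits in frames 3 and 4.
frames = [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
          [1.0, 0.9], [0.9, 1.0], [0.0, 0.1]]
kernel = [[1.0, 1.0], [1.0, 1.0]]  # hypothetical weights, width K = 2

activations = conv1d_valid(frames, kernel)
strongest_position = activations.index(max(activations))
pooled = max_pool_over_time(activations)
print(strongest_position, round(pooled, 3))
```

The pooled value comes from the position where the kernel overlaps the event frames, so the quiet frames around it do not dilute the feature that reaches the classifier.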


In case you were wondering, fully connected nets would suffer from the same feature-transformation limitation as the traditional methods; in testing, they performed much worse than a CNN model.

Dataset

UrbanSound8K is a dataset of monophonic sounds, i.e. each audio clip contains just one labelled audio event. It was used for the audio event detection task, providing 8732 data samples across 10 output classes (audio events). Those are:

  • Air Conditioner Sound
  • Car Horn
  • Children Playing
  • Dog Bark
  • Drilling
  • Engine Idling
  • Gun Shot
  • Jackhammer
  • Siren
  • Street Music
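As a small sketch of how these labels can be recovered in code: as far as I know, the UrbanSound8K distribution names its files `[fsID]-[classID]-[occurrenceID]-[sliceID].wav`, with class IDs 0 to 9 in the order listed above. Treat that convention, the snake_case names, and the example file name below as assumptions to check against your copy of the dataset.

```python
# Class IDs 0..9, in the same order as the list above (assumed mapping).
CLASS_NAMES = [
    "air_conditioner", "car_horn", "children_playing", "dog_bark", "drilling",
    "engine_idling", "gun_shot", "jackhammer", "siren", "street_music",
]


def class_id_from_filename(filename):
    """Extract the class ID from a name like '<fsID>-<classID>-<occ>-<slice>.wav'."""
    stem = filename.rsplit(".", 1)[0]
    return int(stem.split("-")[1])


example_file = "100032-3-0-0.wav"  # illustrative file name
class_id = class_id_from_filename(example_file)
print(class_id, CLASS_NAMES[class_id])
```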

Feature Engineering

Mel-Frequency Cepstral Coefficients (MFCCs) are widely used features for audio-related tasks.

Calculating MFCCs involves a number of steps, as shown below:
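For readers who prefer code, here is a dependency-free sketch of one common textbook MFCC recipe: framing, Hamming window, power spectrum, triangular mel filterbank, log, and DCT. The frame length, hop size, filter count and number of coefficients are assumed values, pre-emphasis is left out for brevity, and the naive DFT is only suitable for tiny inputs. It is not necessarily the exact configuration used in this project; in practice a library routine such as `librosa.feature.mfcc` would be the usual choice.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window a frame, then return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centres evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising edge
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def dct2(values, n_out):
    """Unnormalised DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n)
                for i, v in enumerate(values))
            for c in range(n_out)]


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        # Small constant avoids log(0) for filters that capture no energy.
        log_energies = [math.log(sum(f * p for f, p in zip(filt, spec)) + 1e-10)
                        for filt in bank]
        features.append(dct2(log_energies, n_coeffs))
    return features


# Toy input: a 440 Hz tone, 1024 samples at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1024)]
features = mfcc(tone, sample_rate)
print(len(features), len(features[0]))  # 7 frames, 13 coefficients each
```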

Machine Learning Model

Now that we have the data and the transformed features, we can finally talk about the whole machine learning pipeline:

Results

Not so surprisingly, the CNN outperforms the DNN, which outperforms kNN, which in turn outperforms the Random Choice Classifier:
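For context on the two simplest entries in that ranking, here is a minimal sketch of a kNN classifier and a uniform random-choice baseline. The toy feature vectors, labels, `k=3` and squared Euclidean distance are assumptions for illustration, not the project's setup, and the printed number is not a result from the thesis. The one thing that holds in general is that uniform random guessing over 10 classes is right about 10% of the time, whatever the class balance.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(x, query)), y)
                   for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def random_choice_predict(n_classes, rng):
    """Ignore the input entirely and guess a class uniformly at random."""
    return rng.randrange(n_classes)


# Toy 2-D "features" for two classes (made-up data).
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["siren"] * 3 + ["dog_bark"] * 3
knn_guess = knn_predict(train_x, train_y, [5.2, 5.1])

# Random-choice accuracy on 10,000 synthetic labels over 10 classes.
rng = random.Random(0)
labels = [i % 10 for i in range(10000)]
hits = sum(random_choice_predict(10, rng) == y for y in labels)
random_accuracy = hits / len(labels)
print(knn_guess, round(random_accuracy, 2))
```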

Conclusion

CNNs are impactful for good reason. For this particular task, we saw NNs, especially CNNs, outperforming classical ML model(s). This project was part of my Master's thesis :)

Written on May 13, 2018