Silence Trimmer — Your first speech/audio processing exercise in Python
One great thing about Python is that there are so many ways to do one thing, which can also be a curse.
For example, suppose you have to perform a series of simple operations on an audio dataset: trim the silence from each clip, calculate the average duration, and then make every clip the same length as that average by trimming the longer ones and padding the shorter ones. Many good libraries serve some or all of these purposes, and for a beginner it can be confusing to decide which one to use for what.
I will show one of the many ways to trim silence from an audio file, which makes a good starting point for your audio journey. Once you have played with it, you can gradually try the more advanced features.
The Code
I always like to give the full code first and then explain it. Clone this small project and build it by following the instructions in README.md, or simply follow me step by step.
Run this file. But before that, put this wav file in the same directory. You also need to install librosa and soundfile:
pip install librosa
pip install soundfile
It will trim the silence and save the audio as speech-trimmed.wav. Note that trimming here means removing the silence at the beginning and end, not the silence in the middle.
Explanation
If you play speech.wav, you will notice a little silence at the beginning and end.
Now see line#4 in the code
This line loads wave data.
The sample rate is the number of samples per second. That is,
sample_rate = number of samples / duration
So,
sample_rate = len(waveform) / librosa.get_duration(y=waveform, sr=sample_rate)
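The relation between sample count, sample rate, and duration can be checked without any audio library. The sketch below builds a synthetic sine wave as a plain Python list; the 16000 Hz rate, the 2 second length, and the 440 Hz tone are assumed values for illustration and have nothing to do with speech.wav.

```python
import math

sample_rate = 16000   # samples per second (assumed value)
duration = 2.0        # seconds (assumed value)

# Build a 440 Hz sine wave as a plain list of samples.
num_samples = int(sample_rate * duration)
waveform = [math.sin(2 * math.pi * 440 * n / sample_rate)
            for n in range(num_samples)]

# The same relation, rearranged: duration = number of samples / sample_rate
computed_duration = len(waveform) / sample_rate
print(len(waveform), computed_duration)  # 32000 2.0
```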
This line simply trims silence from the waveform data. The top_db parameter is important here: any part of the signal that is more than top_db decibels below the reference level (by default, the loudest part of the signal) is considered silence and is trimmed. So the lower the value, the more aggressive the trimming and, most likely, the shorter the resulting audio. The function returns a tuple of the trimmed data and the [start, end] sample indices of the non-silent region in the original waveform. We just need the data, hence the [0] at the end.
Finally,
This line saves the audio.
What to do now
How about these exercises?
- For different files, set different top_db values and notice the difference in the results
- Plot the waveform using matplotlib before and after trimming
- Play the audio before and after trimming to verify. Simpleaudio is one of the many packages for this purpose.