I want to understand what happens to large amplitude values of a .wav file when I load it using librosa.
I was trying to understand the amplitude values of .wav files by viewing the waveform with librosa. Now I want to see how scaling these amplitude values affects the sound, so I multiplied the values by a scaling factor. However, when I played the result using IPython.display.Audio, I could not hear any effect on the sound:
from IPython.display import Audio, display

# signal is the original sample array, sr is its sampling rate
scaled_signal = signal * 10
# play the scaled signal
print('Play the scaled sample:')
display(Audio(data=scaled_signal, rate=sr))
So I saved the file to my PC, and there I could hear the difference. The amplitude was indeed scaled. Then I decided to reload this file using librosa. Surprisingly, when I played this file again in my Jupyter notebook, I was now able to hear the effect of scaling:
import librosa
import soundfile

# save the scaled signal to disk
soundfile.write('scaled_signal.wav', scaled_signal, sr)
# load the scaled signal again
scaled_signal, sr = librosa.load('scaled_signal.wav', sr=sr)
print('The scaled sample loaded again')
display(Audio(data=scaled_signal, rate=sr))
However, when I plotted the waveform (see below), I could see that its shape had changed. What happened, and why? It looks as if an upper bound was applied to the magnitude of the amplitudes.
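import matplotlib.pyplot as plt
from librosa.display import waveshow

# original on the left, reloaded scaled signal on the right
fig, axs = plt.subplots(1, 2)
fig.set_figwidth(18)
waveshow(signal, sr=sr, ax=axs[0])
waveshow(scaled_signal, sr=sr, ax=axs[1])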
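librosa.load() does not apply any data-dependent normalization or scaling. It only maps integer PCM formats (int16/int32) to floating point in the range -1.0 to 1.0. The flattened shape most likely comes from the write step instead: as far as I know, soundfile writes .wav files as 16-bit PCM by default, and that format cannot hold values outside this range, so the scaled samples were clipped before librosa ever read them.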
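A minimal sketch of why a clipped file keeps its flattened shape after reloading. It uses only the standard library and hypothetical helpers (to_pcm16, from_pcm16) that imitate a 16-bit PCM round trip; it is an assumption about what the write and load steps do, not the actual soundfile or librosa code.

```python
import math

def to_pcm16(samples):
    """Hypothetical writer: clip floats to [-1.0, 1.0], then quantize to 16 bits."""
    out = []
    for x in samples:
        clipped = max(-1.0, min(1.0, x))
        out.append(int(round(clipped * 32767)))
    return out

def from_pcm16(ints):
    """Hypothetical loader: fixed, data-independent mapping back to floats."""
    return [i / 32768.0 for i in ints]

sr = 8000
# a short 440 Hz tone with peak amplitude 0.5
signal = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr // 100)]
scaled = [10 * x for x in signal]  # peak is now well above 1.0

reloaded = from_pcm16(to_pcm16(scaled))
peak_before = max(abs(x) for x in scaled)
peak_after = max(abs(x) for x in reloaded)
n_clipped = sum(1 for x in scaled if abs(x) > 1.0)

print(f"peak before writing:  {peak_before:.2f}")
print(f"peak after reloading: {peak_after:.2f}")
print(f"samples clipped: {n_clipped} of {len(scaled)}")
```

Every sample that exceeded 1.0 in magnitude comes back pinned near the limit, which is the flat-topped shape visible in the second plot.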
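From the documentation for IPython.display.Audio, which you are using to play back the audio:
"If the array option is used the waveform will be normalized."
That is why multiplying by 10 made no audible difference in the notebook: the player rescales the array before playback, so a constant gain is cancelled out. The reloaded file sounds different because clipping changed the shape of the waveform, not just its scale.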
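A small sketch of why normalization hides a constant gain. It assumes the player divides by the peak absolute value, which is how I read the sentence quoted above; peak_normalize is a hypothetical stand-in, not IPython's own code.

```python
import math

def peak_normalize(samples):
    """Hypothetical stand-in for the player's normalization: divide by the peak."""
    peak = max(abs(x) for x in samples)
    if peak == 0:
        return list(samples)  # silence stays silence
    return [x / peak for x in samples]

signal = [0.0, 0.1, -0.25, 0.5, -0.05]
scaled = [10 * x for x in signal]

a = peak_normalize(signal)
b = peak_normalize(scaled)

# after normalization the two arrays match up to floating-point rounding
same = all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(a, b))
print("identical after normalization:", same)
```

If I remember correctly, recent IPython versions accept Audio(..., normalize=False) to turn this off, but the data then has to stay within -1.0 to 1.0, so check the documentation for your installed version.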