Tags: python, audio, noise, noise-generator

Mix second audio clip at specific SNR to original audio file in Python


I have two audio files that I would like to mix in Python.

I'll refer to the original audio as "Audio A" and the audio to mix in as "Audio B". I am able to add white noise at a specific SNR to the Audio A signal as follows:

import librosa
import numpy as np

audio, sr = librosa.load(file_name, sr=None, res_type='kaiser_fast')
power = audio ** 2  # Calculate power
signalpower_db = 10 * np.log10(power)  # convert power to dB
snr_dB = 0  # add SNR of specified dB
signal_average_power = np.mean(power)  # Calculate signal power
signal_averagepower_dB = 10 * np.log10(signal_average_power)  # convert signal power to dB
noise_dB = signal_averagepower_dB - snr_dB  # Calculate noise power in dB
noise_watts = 10 ** (noise_dB / 10)  # Convert noise from dB to watts
# Generate sample of white noise
mean_noise = 0
noise = np.random.normal(mean_noise, np.sqrt(noise_watts), len(audio))

noise_signal = (audio + noise) / 1.3  # To prevent clipping of signal

In the code above, I used an SNR of 0 dB.

How do I now use "Audio B" as the noise source, instead of white noise, and obtain an SNR of 0 dB? That is, how do I replace the np.random.normal noise with Audio B as the noise injected into the original signal "Audio A" at an SNR of 0 dB?

Any help and guidance is sincerely appreciated!


Solution

  • You have to make sure that the two audio signals have the same duration; then you can calculate the gain that gives the desired SNR. If you simply add the scaled noise, you change the energy of the input signal, so I adjust both the signal energy and the noise energy so that the noisy signal has the same energy as the clean signal (assuming that the signal and the noise are uncorrelated).

    import numpy as np

    def mix_audio(signal, noise, snr):
        # if the audio is longer than the noise
        # play the noise in repeat for the duration of the audio;
        # if the audio is shorter than the noise,
        # the same indexing truncates the noise to the audio length
        noise = noise[np.arange(len(signal)) % len(noise)]

        # convert to float: this is important if loading resulted in
        # uint8 or uint16 types, because it would cause overflow
        # when squaring and calculating mean
        noise = noise.astype(np.float32)
        signal = signal.astype(np.float32)

        # get the initial energy for reference
        signal_energy = np.mean(signal**2)
        noise_energy = np.mean(noise**2)
        # noise-to-signal power ratio for the given SNR (in dB)
        ratio = 10.0 ** (-snr / 10)
        # calculates the gain to be applied to the noise
        # to achieve the given SNR
        g = np.sqrt(ratio * signal_energy / noise_energy)

        # Assumes signal and noise to be decorrelated
        # and calculate (a, b) such that energy of
        # a*signal + b*noise matches the energy of the input signal.
        # Since g**2 * noise_energy == ratio * signal_energy, this needs
        # a**2 * (1 + ratio) == 1, with b = g * a to keep the SNR.
        a = np.sqrt(1 / (1 + ratio))
        b = g * a
        print(g, a, b)
        # mix the signals
        return a * signal + b * noise
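
    The gain and the energy-preserving scale factors can be derived as follows. This is a sketch in my own notation, not part of the original answer: $E_s$ and $E_n$ stand for the mean squared values of the signal and of the length-matched noise, and the cross term between them is assumed to average to zero.

    ```latex
    \begin{align}
    \mathrm{SNR}_{\mathrm{dB}} &= 10 \log_{10} \frac{a^2 E_s}{b^2 E_n},
      \qquad g = \frac{b}{a}
      \;\Rightarrow\;
      g = \sqrt{10^{-\mathrm{SNR}_{\mathrm{dB}}/10} \, \frac{E_s}{E_n}} \\
    a^2 E_s + b^2 E_n &= E_s
      \;\Rightarrow\;
      a^2 E_s \left(1 + 10^{-\mathrm{SNR}_{\mathrm{dB}}/10}\right) = E_s \\
    a &= \frac{1}{\sqrt{1 + 10^{-\mathrm{SNR}_{\mathrm{dB}}/10}}},
      \qquad b = g \, a
    \end{align}
    ```

    The SNR depends only on the ratio $b/a = g$, so any common scale factor keeps it intact. The simpler-looking choice $a^2 = 1/(1+g^2)$ preserves the energy only when $E_s = E_n$.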
    

    Here is an example of how you can use the function:

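    import matplotlib.pyplot as plt
    import numpy as np

    # binary test signal taking the values -0.5 and +0.5
    signal = np.random.randint(0, 2, 10**7) - 0.5
    # use some non-standard noise distribution
    noise = np.sin(np.random.randn(6*10**7))
    # mix at an SNR of 10 dB and look at the resulting distribution
    noisy = mix_audio(signal, noise, 10)
    plt.hist(noisy, bins=300)
    plt.show()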
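
As a sanity check on the approach above, the following minimal sketch measures the SNR that is actually achieved and compares the energy of the mix with the energy of the clean input. The helper `mix_at_snr` and the variable names `clean` and `interference` are hypothetical names chosen for this illustration; the helper is a compact restatement of the mixing step that also returns the two scaled components, and it assumes the signal and the noise are uncorrelated.

```python
import numpy as np


def mix_at_snr(signal, noise, snr_db):
    """Hypothetical helper: mix noise into signal at snr_db (in dB)."""
    signal = np.asarray(signal, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Loop or truncate the noise so it matches the signal length.
    noise = noise[np.arange(len(signal)) % len(noise)]
    # Noise-to-signal power ratio that corresponds to the target SNR.
    ratio = 10.0 ** (-snr_db / 10.0)
    # Gain on the noise relative to the signal.
    g = np.sqrt(ratio * np.mean(signal**2) / np.mean(noise**2))
    # Common scale so the mix keeps the energy of the clean signal
    # (assumption: signal and noise are uncorrelated).
    a = 1.0 / np.sqrt(1.0 + ratio)
    scaled_signal = a * signal
    scaled_noise = a * g * noise
    return scaled_signal + scaled_noise, scaled_signal, scaled_noise


rng = np.random.default_rng(0)
clean = rng.integers(0, 2, 10**6) - 0.5               # stand-in for Audio A
interference = np.sin(rng.standard_normal(3 * 10**5))  # stand-in for Audio B
target_snr_db = 10.0

mixed, s_part, n_part = mix_at_snr(clean, interference, target_snr_db)

# SNR measured from the two scaled components of the mix.
measured_snr_db = 10 * np.log10(np.mean(s_part**2) / np.mean(n_part**2))
# Energy of the mix relative to the clean input (close to 1 if uncorrelated).
energy_ratio = np.mean(mixed**2) / np.mean(clean**2)
print(f"measured SNR: {measured_snr_db:.2f} dB, energy ratio: {energy_ratio:.3f}")
```

With real recordings, `clean` and `interference` would be the arrays returned by `librosa.load`, as in the question; both files should be loaded at the same sampling rate before mixing.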