I have the task of emulating 8k landline/cellular/VoIP speech audio given a 16k microphone recording of that speech. What are the main stages of such an emulation? I've found this torchaudio tutorial on this kind of augmentation, and it has the most detailed instructions I've seen on how to do it.
Finally, I see the following 16k mic -> 8k tel conversion pipeline:
What should be added? Equalization, some special filters, packet loss concealment emulation? Maybe there are existing MATLAB scripts or libraries for such augmentation?
Assuming you have a WAV file, you can band-pass filter it to the telephone band:
from scipy.signal import lfilter, butter
from scipy.io.wavfile import read, write
from numpy import array, clip, int16

def butter_params(low_freq, high_freq, fs, order=5):
    # Normalize the cutoff frequencies by the Nyquist frequency
    nyq = 0.5 * fs
    low = low_freq / nyq
    high = high_freq / nyq
    b, a = butter(order, [low, high], btype='band')
    return b, a

def butter_bandpass_filter(data, low_freq, high_freq, fs, order=5):
    b, a = butter_params(low_freq, high_freq, fs, order=order)
    # Filter along the time axis so that stereo (N, 2) data also works
    y = lfilter(b, a, data, axis=0)
    return y

def apply_telephony_effect(f1, f2):
    # Assumes a 16-bit PCM WAV file as input
    fs, audio = read(f1)
    low_freq = 300.0
    high_freq = 3000.0
    filtered_signal = butter_bandpass_filter(audio, low_freq, high_freq, fs, order=6)
    # Clip before casting so out-of-range samples saturate instead of wrapping around
    filtered_signal = clip(filtered_signal, -32768, 32767)
    write(f2, fs, array(filtered_signal, dtype=int16))

You can then create another file with the effect applied:
apply_telephony_effect('input.wav', 'output.wav')
The output will sound like a telephone call, although the file keeps its original sample rate.
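Since the question asks for 8k output, here is a minimal sketch of how the same idea could be extended with downsampling and a codec-like step. It is an assumption on my part rather than a standard recipe: `emulate_narrowband` is a hypothetical helper name, the input is assumed to be mono float audio scaled to [-1, 1], and the mu-law step uses the continuous companding formula with 256 levels, so it only approximates G.711 and is not bit-exact.

```python
import numpy as np
from scipy.signal import butter, sosfilt, resample_poly

def emulate_narrowband(audio, fs, target_fs=8000,
                       low_freq=300.0, high_freq=3000.0, mu=255):
    """Hypothetical helper: mono float audio in [-1, 1] -> telephone-like audio."""
    x = np.asarray(audio, dtype=np.float64)

    # 1. Downsample (e.g. 16k -> 8k). resample_poly applies its own
    #    anti-aliasing FIR filter before decimating.
    g = np.gcd(int(target_fs), int(fs))
    x = resample_poly(x, int(target_fs) // g, int(fs) // g)

    # 2. Band-limit to the telephone band. Second-order sections are
    #    numerically more robust than (b, a) coefficients.
    sos = butter(4, [low_freq, high_freq], btype='band',
                 fs=target_fs, output='sos')
    x = sosfilt(sos, x)
    x = np.clip(x, -1.0, 1.0)

    # 3. mu-law companding: compress, quantize to mu + 1 levels, expand.
    compressed = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    quantized = np.round((compressed + 1.0) / 2.0 * mu) / mu * 2.0 - 1.0
    expanded = np.sign(quantized) * np.expm1(np.abs(quantized) * np.log1p(mu)) / mu
    return expanded, target_fs
```

For a 16-bit WAV read with `scipy.io.wavfile.read`, you would divide the samples by 32768.0 before calling this and scale back (with clipping) before writing. Real channels add more effects (codec-specific artifacts, packet loss, noise), which this sketch does not model.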