Let me begin by saying that I am a total newbie to WebRTC, so if I say anything half-witted, please bear with me.
I am writing an app that compares echo cancellation performance between Speex and WebRTC AEC3. [WebRTC AEC3 code base (newest branch): https://webrtc.googlesource.com/src/+/branch-heads/72]
The app reads WAV files and feeds the samples to the AEC module, and a WAV writer saves the echo-cancelled output.
I have two inputs:
1) Speaker input, also called the rendered or far-end signal
2) Mic input, also called the captured or near-end signal
And one output:
1) MicOutput, which is the result of echo cancellation
The Speex module behaves as expected. Please have a look at the following file; it does a good job of cancelling the rendered signal from the captured signal.
However, when I pass the same files through WebRTC AEC3, I get a completely flat (silent) output. Below is the result of AEC3. It seems AEC3 is cancelling out the original mic signal as well.
I am using the following parameters (extracted from the WAV file reader):
- Sample rate: 8000
- Channels: 1
- Bits/sample: 16
- Number of samples: 270399
- Samples fed to AEC at a time: (10 * SampleRate) / 1000 = 80
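For reference, the 10 ms block size above follows directly from the sample rate. A minimal sketch of that computation (the function name `SamplesPer10ms` is my own, not a WebRTC API):

```cpp
#include <cassert>

// Samples per 10 ms block = sampleRate * 10 / 1000 = sampleRate / 100.
// WebRTC's audio pipeline processes audio in 10 ms chunks.
constexpr int SamplesPer10ms(int sampleRateHz) {
    return sampleRateHz / 100;
}
// e.g. SamplesPer10ms(8000) yields the 80 samples per block used here.
```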
This is the initialization:
m_streamConfig.set_sample_rate_hz(sampleRate);
m_streamConfig.set_num_channels(CHANNEL_COUNT);
// Create a temporary buffer to convert our RTOP input audio data into the webRTC required AudioBuffer.
m_tempBuffer[0] = static_cast<float*> (malloc(sizeof(float) * m_samplesPerBlock));
// Create AEC3.
m_echoCanceller3.reset(new EchoCanceller3(m_echoCanceller3Config, sampleRate, true)); //use high pass filter is true
// Create noise suppression.
m_noiseSuppression.reset(new NoiseSuppressionImpl(&m_criticalSection));
m_noiseSuppression->Initialize(CHANNEL_COUNT, sampleRate);
And this is how I am calling the APIs:
auto renderAudioBuffer = CreateAudioBuffer(spkSamples);
auto capturedAudioBuffer = CreateAudioBuffer(micSamples);
// Analyze capture buffer
m_echoCanceller3->AnalyzeCapture(capturedAudioBuffer.get());
// Analyze render buffer
m_echoCanceller3->AnalyzeRender(renderAudioBuffer.get());
// Cancel echo
m_echoCanceller3->ProcessCapture(capturedAudioBuffer.get(), false);
// Assuming the analog level has not changed.
// If we want to detect changes, we need to use the gain controller and
// remember the previously rendered audio's analog level.
// Copy the Captured audio out
capturedAudioBuffer->CopyTo(m_streamConfig, m_tempBuffer);
arrayCopy_32f(m_tempBuffer[0], micOut, m_samplesPerBlock);
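For context, the calls above run once per 10 ms block, so the WAV samples have to be fed in a loop. A self-contained sketch of that loop, assuming a hypothetical `processBlock` callback standing in for the AnalyzeRender/AnalyzeCapture/ProcessCapture sequence (zero-padding the final partial block; dropping it would be equally valid for an offline test tool):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Feeds `mic` to `processBlock` in chunks of `samplesPerBlock` samples.
// Returns the number of blocks fed (including a zero-padded final block
// when the sample count is not a multiple of the block size).
size_t FeedInBlocks(const std::vector<float>& mic,
                    size_t samplesPerBlock,
                    void (*processBlock)(const float*, size_t)) {
    size_t blocks = mic.size() / samplesPerBlock;
    for (size_t b = 0; b < blocks; ++b)
        processBlock(mic.data() + b * samplesPerBlock, samplesPerBlock);

    size_t remainder = mic.size() - blocks * samplesPerBlock;
    if (remainder > 0) {
        // Copy the leftover samples into a zero-padded final block.
        std::vector<float> padded(samplesPerBlock, 0.0f);
        std::copy(mic.end() - remainder, mic.end(), padded.begin());
        processBlock(padded.data(), samplesPerBlock);
        ++blocks;
    }
    return blocks;
}
```

With the numbers above (270399 samples, 80 samples per block) this yields 3379 full blocks plus one padded block of 79 real samples.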
Regarding the other parameters (delay, echo model, reverb, noise floor, etc.), I am using all default values.
Can anyone tell me what I am doing wrong? Or how I can make it better by adjusting the appropriate parameters?
Update (02/22/2019): I figured out why the echo output was muted. It seems WebRTC AEC3 cannot process 8 kHz and 16 kHz sample rates, although the source code suggests it supports four sample rates: 8 kHz, 16 kHz, 32 kHz, and 48 kHz. I got an echo-cancelled output after feeding it 32 kHz and 48 kHz samples. However, I do not see any echo cancellation: it just spits out exactly the same samples that were fed in as the near-end/mic/captured input. So I am probably still missing key parameter settings. Still looking for help.
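Since my build only produced output at 32 kHz and 48 kHz, one workaround for 8 kHz WAV files is to resample before feeding AEC3. WebRTC itself ships proper resamplers (e.g. PushResampler), which is what a real app should use; purely as a self-contained illustration, here is a naive 4x linear-interpolation upsampler (8 kHz to 32 kHz). `UpsampleLinear4x` is my own hypothetical helper, not a WebRTC API:

```cpp
#include <cstddef>
#include <vector>

// Naive 4x upsampler by linear interpolation between neighboring samples.
// Good enough for a quick offline experiment; not a substitute for a
// proper polyphase resampler (audible aliasing on real audio).
std::vector<float> UpsampleLinear4x(const std::vector<float>& in) {
    std::vector<float> out;
    if (in.empty()) return out;
    out.reserve(in.size() * 4);
    for (size_t i = 0; i + 1 < in.size(); ++i) {
        for (int k = 0; k < 4; ++k) {
            float t = k / 4.0f;
            // Interpolate between in[i] and in[i + 1].
            out.push_back(in[i] * (1.0f - t) + in[i + 1] * t);
        }
    }
    // Hold the last input sample for the final 4 output positions.
    for (int k = 0; k < 4; ++k) out.push_back(in.back());
    return out;
}
```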
The most important thing is the "delay" parameter; you can find its definition in audio_processing.h:
Sets the |delay| in ms between ProcessReverseStream() receiving a far-end frame and ProcessStream() receiving a near-end frame containing the corresponding echo. On the client-side this can be expressed as delay = (t_render - t_analyze) + (t_process - t_capture) where,
- t_analyze is the time a frame is passed to ProcessReverseStream() and t_render is the time the first sample of the same frame is rendered by the audio hardware.
- t_capture is the time the first sample of a frame is captured by the audio hardware and t_process is the time the same frame is passed to ProcessStream().
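The delay formula quoted above can be written out as a trivial helper; the timestamp values in the comment are made-up example numbers, not measurements from any device:

```cpp
#include <cassert>

// delay = (t_render - t_analyze) + (t_process - t_capture), all in ms,
// per the definition in audio_processing.h quoted above.
int ComputeDelayMs(int t_analyze_ms, int t_render_ms,
                   int t_capture_ms, int t_process_ms) {
    return (t_render_ms - t_analyze_ms) + (t_process_ms - t_capture_ms);
}
// Example: a frame analyzed at t=0 and rendered at t=40, with the echo
// captured at t=100 and processed at t=130, gives a delay of 70 ms.
```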
2. The EchoCanceller3 delay:
SetAudioBufferDelay(int delay);