Tags: matlab, audio, signal-processing, noise, pitch-shifting

MATLAB: processing an audio signal in very small frames makes the audio disappear completely


I am writing a pitch adaptation function in MATLAB. It takes an audio signal and a pitchCoefficients vector, where each element determines how much to pitch shift its respective frame.

The audio signal is sliced into equal-length frames, one per pitch coefficient. With only 2 pitch coefficients, the audio is divided into 2 halves: the first half is pitch shifted by the first coefficient and the second half by the second coefficient. So if my coefficients are [1,2], the first half of the audio will sound the same as the original and the second half will be pitched twice as high.

This is the code for my function:

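function [audioModified] = modifyPitch(audio, pitchCoefficients)
% Pitch-shift each equal-length frame of AUDIO by the matching coefficient.

nwindows = length(pitchCoefficients);
windowSize = floor(length(audio)/nwindows);  % any remainder samples are dropped
audioModified = [];
for i=1:nwindows
    % Frame boundaries (first channel only)
    start = (i-1)*windowSize + 1;
    finish = i*windowSize;
    originalWindow = audio(start:finish, 1);
    % Time-scale the frame with the phase vocoder, then resample by the
    % same ratio so the duration is restored and the pitch is shifted
    pitchCoeff = 1/pitchCoefficients(i);
    timeScaledWindow = pvoc(originalWindow, pitchCoeff);
    [P,Q] = rat(pitchCoeff);
    pitchModifiedWindow = resample(timeScaledWindow, P, Q);
    % Frames are joined end to end, with no overlap or smoothing
    audioModified = [audioModified; pitchModifiedWindow];
end

end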

However, the final audio (the concatenation of all the frames) has an artifact: each frame starts with a 'tic' sound. I'm assuming this happens because of the way I concatenate the frames. If the frames are too small, the effect becomes so pronounced that the original audio can no longer be heard.

How should I go about mitigating or removing this problem? Is there a way to smooth the audio out the same way you can blur an image to get rid of noise?

Additional info: I use this phase vocoder (pvoc) to do the time scaling.


Solution

  • Try overlapping and cross-fading longer frames instead of using frames that are too small. The cross fade will help reduce the discontinuity between adjacent (re)synthesized frames.
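
A minimal sketch of the overlap and cross-fade idea follows. It is not the asker's code or part of the pvoc package: the function name overlapAddFrames, its overlap parameter, and the processFrame callback are all hypothetical names introduced for illustration. It assumes a mono signal, a linear cross-fade, and an overlap no longer than the frame advance; the per-frame processing (for example the pvoc and resample steps from the question) is passed in as a function handle.

```matlab
function out = overlapAddFrames(audio, nFrames, overlap, processFrame)
% Hypothetical helper (illustrative only): split AUDIO into NFRAMES frames
% that overlap by OVERLAP samples, process each frame, and linearly
% cross-fade neighbouring frames when reassembling them.
% processFrame(frame, i) should return roughly length(frame) samples.

audio = audio(:);                        % force a column vector (mono assumed)
hop = floor(length(audio) / nFrames);    % advance between frame starts
assert(overlap >= 0 && overlap <= hop, 'overlap must be between 0 and the hop size');
frameLen = hop + overlap;                % each frame extends into the next one
audio = [audio; zeros(overlap, 1)];      % pad so the last frame is full length

% Complementary linear ramps: fadeIn + fadeOut == 1 at every sample
fadeIn = (0.5:overlap-0.5).' / overlap;
fadeOut = 1 - fadeIn;

out = zeros(hop*nFrames + overlap, 1);
for i = 1:nFrames
    start = (i-1)*hop + 1;
    idx = start:start+frameLen-1;
    frame = processFrame(audio(idx), i);
    frame = frame(:);

    % Processing may change the length by a few samples; pad or trim
    if length(frame) < frameLen
        frame = [frame; zeros(frameLen - length(frame), 1)];
    else
        frame = frame(1:frameLen);
    end

    % Fade in at the head (except first frame), out at the tail (except last)
    w = ones(frameLen, 1);
    if i > 1
        w(1:overlap) = fadeIn;
    end
    if i < nFrames
        w(end-overlap+1:end) = fadeOut;
    end

    out(idx) = out(idx) + w .* frame;    % overlap-add into the output
end
out = out(1:hop*nFrames);                % drop the padded tail
end
```

With an identity callback such as @(frame, i) frame, the output reproduces the input, which is a quick way to check that the fades sum to one. For pitch shifting, processFrame would wrap the pvoc and resample calls with the coefficient for frame i. A linear fade is only one choice; an equal-power or Hann-shaped fade may sound smoother when adjacent frames are poorly correlated.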