java audio signals frequency modulation

Audio signal modulation to produce voice changing effects


As a study exercise, I am trying to implement a Java class that applies voice changes driven by a second audio effect file.

For example, say I have an audio sample of my voice saying "hello world" and another audio sample of some "breathing noise". I would like to modulate the voice with the noise to achieve something like the "Darth Vader" effect.

After some searching, I found that this might be achieved with frequency modulation, so my first question is: is frequency modulation the right approach for my problem? (I don't want to reproduce the Darth Vader voice specifically; I want to make the voice sound as if it were spoken through a generic noise effect.)

Assuming that frequency modulation is the right approach, I tried to figure out how to implement it in Java and ended up with something like this:

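public void modulate(File voice, File effect, File output) {
   // Read both files as time-domain amplitude samples
   AmplitudeData ampVoice = readAudioFile(voice);
   AmplitudeData ampEffect = readAudioFile(effect);
   // Transform both signals to the frequency domain
   FFT fftVoice = FFT(ampVoice);
   FFT fftEffect = FFT(ampEffect);
   // Combine them (the step I am asking about)
   FFT fftModulated = FM(fftVoice, fftEffect);
   // Back to the time domain, then write the result
   AmplitudeData ampModulated = IFFT(fftModulated);
   writeAmplitudeToFile(ampModulated, output, "WAV");
}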

I basically know how to apply the FFT and IFFT, though I am still looking for stable, efficient open-source code that may be better than mine. So just assume I can read an audio file (e.g. an MP3) into an amplitude representation, compute its FFT, and compute the inverse FFT.

Regarding the FM step (I am not an expert in signal processing), I found samples that use a sine function, which is pretty basic, but no example that uses a different carrier (i.e. my noise effect).
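For context, the sine-based samples usually look something like the sketch below: a sine carrier whose instantaneous frequency is pushed up and down by the modulator, computed directly on time-domain samples by accumulating phase (no FFT involved). This is only a minimal illustration under assumptions: the class name `FmSketch`, the method name, and the parameters `carrierHz`, `deviationHz` and `sampleRate` are hypothetical, and the modulator is assumed to be normalized to [-1.0, 1.0]. The carrier here is still a sine, not the noise effect.

```java
final class FmSketch {
    /**
     * Frequency-modulates a sine carrier with an arbitrary modulator signal.
     *
     * modulator   samples assumed to lie in [-1.0, 1.0]
     * carrierHz   frequency of the unmodulated sine carrier
     * deviationHz peak frequency shift applied when the modulator is +/-1.0
     * sampleRate  samples per second
     */
    static double[] frequencyModulate(double[] modulator, double carrierHz,
                                      double deviationHz, double sampleRate) {
        double[] out = new double[modulator.length];
        double phase = 0.0; // running phase of the carrier, in radians
        for (int n = 0; n < modulator.length; n++) {
            out[n] = Math.sin(phase);
            // The modulator changes the frequency, not the amplitude.
            double instantaneousHz = carrierHz + deviationHz * modulator[n];
            phase += 2.0 * Math.PI * instantaneousHz / sampleRate;
        }
        return out;
    }
}
```

With an all-zero modulator this reduces to a plain sine at `carrierHz`, which is a quick way to sanity-check the loop.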

From reading some papers I understood that signal masking is not what I am looking for. For example, to change the voice to a robotic sound or to a Darth Vader effect, I could just apply some shifting in the FFT, or some pitch changes. In this case, though, I want the voice to sound as if it were spoken by another sound (e.g. imagine a chainsaw or a burning fire saying something that resembles "hello world").

So my question is what is the best and most efficient way to implement the FM function in my code? Would it work for my purpose?


Solution

  • A simplified solution seems to be a simple ring modulation of the carrier by a modulator signal.

    The main idea is the same as the "tremolo" effect: multiply the carrier's sample array by the varying modulator, sample by sample:

    h[i] = c[i] * m[i]

    where h is the final result, c the carrier, m the modulator, and i the index of each digital sample.

    In this version, the signals must be of the same length.

    The result may be affected by distortion, but it should be OK for my purpose. If nobody else knows a better solution, I think this will be the correct answer.
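The sample-by-sample multiplication described above could be sketched in Java roughly as follows. This is a minimal illustration, not a finished implementation: the class name `RingModulator` and its method are hypothetical, and it assumes both signals have already been decoded into `double` arrays of normalized samples in [-1.0, 1.0] at the same sample rate.

```java
final class RingModulator {
    /**
     * Ring modulation: result[i] = carrier[i] * modulator[i].
     * Assumes normalized samples in [-1.0, 1.0], so the product also stays
     * within [-1.0, 1.0] and needs no clipping.
     */
    static double[] modulate(double[] carrier, double[] modulator) {
        // As noted above, this version requires signals of the same length.
        if (carrier.length != modulator.length) {
            throw new IllegalArgumentException("signals must have the same length");
        }
        double[] result = new double[carrier.length];
        for (int i = 0; i < carrier.length; i++) {
            result[i] = carrier[i] * modulator[i];
        }
        return result;
    }

    public static void main(String[] args) {
        double[] voice = {0.5, -1.0, 0.25, 0.0};  // made-up sample values
        double[] noise = {1.0, 0.5, -0.5, 0.8};   // made-up sample values
        // prints [0.5, -0.5, -0.125, 0.0]
        System.out.println(java.util.Arrays.toString(modulate(voice, noise)));
    }
}
```

Because this works directly on time-domain samples, the FFT and IFFT steps are not needed for this variant.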