I am trying to parallelize the FFT transforms of an acoustic fingerprinting library known as Chromaprint. It works by "splitting the original audio into many overlapping frames and applying the Fourier transform on them." Chromaprint uses a frame size of 4096 with a 2/3 overlap. For instance, the first frame consists of elements [0...4095], and the second frame is then something like [1366...5461].
With cufftPlanMany, I know that you can specify a batch of transforms of size 4096, which will process [0...4095], [4096...8191], etc. Is there some way to make the batched transforms overlap, or should I consider another approach that doesn't use batched execution?
If you use Advanced Data Layout, the idist parameter should allow you to set any arbitrary offset between the starting points of two successive transform input sets.
For the 1D case, the input elements are selected according to the following expression, based on the parameters you pass:
input[ b * idist + x * istride]
(where b is the index of the batch currently being processed, i.e. b = 0, 1, 2, ..., batch - 1, and x is the element index within that transform, i.e. x = 0, 1, ..., n - 1)
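As a rough illustration of how that indexing gives overlapping frames, here is a minimal CPU-only sketch in C. The frame_layout type and both helper functions are hypothetical names made up for this example, and the hop of 4096 / 3 = 1365 samples is an assumption derived from the 2/3 overlap in the question; substitute whatever offset Chromaprint actually uses. The trailing comment shows roughly how the same values might be handed to cufftPlanMany for a real-to-complex transform; it is not compiled or tested here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: layout of a batch of overlapping 1D frames. */
typedef struct {
    int n;        /* frame (transform) length, e.g. 4096 */
    int istride;  /* distance between consecutive samples within a frame */
    int idist;    /* distance between the starts of consecutive frames (the hop) */
    int batch;    /* number of complete frames that fit in the signal */
} frame_layout;

/* Count the complete frames of length n, spaced hop samples apart,
 * that fit into a signal of num_samples samples. */
frame_layout make_frame_layout(size_t num_samples, int n, int hop)
{
    assert(n > 0 && hop > 0);
    frame_layout fl = { n, 1, hop, 0 };
    if (num_samples >= (size_t)n)
        fl.batch = (int)((num_samples - (size_t)n) / (size_t)hop) + 1;
    return fl;
}

/* Same expression as above: input[b * idist + x * istride]. */
size_t input_index(const frame_layout *fl, int b, int x)
{
    assert(b >= 0 && b < fl->batch && x >= 0 && x < fl->n);
    return (size_t)b * (size_t)fl->idist + (size_t)x * (size_t)fl->istride;
}

/* With cuFFT, these values would go into the plan roughly like this
 * (assumed usage for a real-to-complex transform):
 *
 *   int n[1] = { fl.n };
 *   cufftPlanMany(&plan, 1, n,
 *                 n, fl.istride, fl.idist,   // inembed, istride, idist
 *                 n, 1, fl.n / 2 + 1,        // onembed, ostride, odist
 *                 CUFFT_R2C, fl.batch);
 *
 * inembed and onembed must be non-NULL; if they are NULL, cuFFT ignores
 * the stride and dist arguments and assumes contiguous batches.
 */
```

With num_samples = 6826, n = 4096 and hop = 1365, this yields three frames starting at input elements 0, 1365 and 2730, so each frame shares 2731 samples with the previous one. Because idist is smaller than n, the input frames overlap in memory, which is fine for an out-of-place transform where the output goes to a separate buffer.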
"The idist and odist parameters indicate the distance between the first element of two consecutive batches in the input and output data."