python memory-management scientific-computing reference-counting pyfftw

pyfftw release references to arrays without destroying plan


I have a large set of large arrays that need to be Fourier transformed one after another, repeatedly, and they do not all fit in memory at the same time. A typical array shape is (350, 250000), but it varies a lot. The general procedure is

while True:
    for data in data_set:
        array = generate_array(data)
        fft(array, farray)
        do_something_with_farray()
        ifft(farray, array)
        do_something_with_array()

This needs to be fast, so ideally I would make plans for all the arrays beforehand and reuse them in the loop. This is especially important because even constructing a plan with FFTW_ESTIMATE is too slow to do inside the loop (more than 10x slower than just executing the plan, when constructing it as pyfftw.FFTW(array, farray, flags=['FFTW_ESTIMATE', 'FFTW_DESTROY_INPUT'], threads=nthread, axes=[-1])). However, each plan holds a reference to the arrays that were used to construct it, so keeping all the plans in memory means keeping all the arrays in memory too, which I can't afford.

Is it possible to make pyfftw release the references it holds to the arrays? After all, I am planning to repoint them to fully compatible new arrays inside the loop anyway. If not, is there some other way of getting around this problem? I guess I could make plans for single rows, or for chunks of rows, but that could easily lead to slowdowns.

PS. I use FFTW_ESTIMATE rather than FFTW_MEASURE, despite planning to reuse the plan many times, because FFTW_MEASURE takes forever for these array sizes, and when I specify a time limit, performance is no better than with FFTW_ESTIMATE.

Edit: Actually, the slowness of constructing the plan only happens the first time I construct a plan of that shape (due to wisdom, I guess), so the approach of not storing the plans works after all. Still, if it is possible to store plans without the array references, that would be nice to know about.


Solution

  • FFTW plans are by their nature tied to a piece of memory. However, there is nothing to stop you using the same piece of memory for all your plans. So you could create a single array that is big enough for all your possible arrays and then create your FFTW objects on views into that array.

    You can then execute the FFT using the FFTW.__call__() interface that allows the arrays to be updated prior to execution (with little overhead when they agree with the original array in strides and alignment).

    Now, the FFTW object will have the new arrays as its internal arrays. If you want to revert back to the other memory, you can use FFTW.update_arrays().