Tags: python, python-3.x, numpy, low-memory

How can I save a big `numpy` array as a '*.npz' file with limited filesystem capacity?


I have a numpy array of dtype numpy.float32 that takes about 26 GiB when saved as an uncompressed '*.npz' file. numpy.savez() fails with:

OSError: Failed to write to /tmp/tmpl9v3xsmf-numpy.npy: 6998400000 requested and 3456146404 written

I assumed that compression might save the day, but numpy.savez_compressed() fails in the same way:

OSError: Failed to write to /tmp/tmp591cum2r-numpy.npy: 6998400000 requested and 3456157668 written

as numpy.savez_compressed() saves the array uncompressed first.

The obvious "use additional storage" I do not consider an answer. ;)

[EDIT]

The low-memory tag refers to disk space, not RAM.


Solution

  • With the addition of ZipFile.open(..., mode='w') in Python 3.6, you can do better:

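    import zipfile

    import numpy as np

    def saveCompressed(fh, **namedict):
        # Write each array straight into the compressed archive, so no
        # uncompressed temporary .npy file is created on disk.
        with zipfile.ZipFile(fh, mode="w", compression=zipfile.ZIP_DEFLATED,
                             allowZip64=True) as zf:
            for k, v in namedict.items():
                # force_zip64 is needed because the member size is not known
                # up front and may exceed 2 GiB.
                with zf.open(k + '.npy', 'w', force_zip64=True) as buf:
                    np.lib.format.write_array(buf,
                                              np.asanyarray(v),
                                              allow_pickle=False)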
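
A minimal usage sketch of the function above, checking that the archive it produces can be read back with np.load(). The function is repeated (using np.lib.format.write_array) so the block runs on its own. The small array, the in-memory io.BytesIO target and the key name "data" are assumptions for illustration only; for the 26 GiB case you would pass a file path on the target filesystem instead.

```python
import io
import zipfile

import numpy as np


def saveCompressed(fh, **namedict):
    # Stream each array into the compressed archive; no temporary .npy file.
    with zipfile.ZipFile(fh, mode="w", compression=zipfile.ZIP_DEFLATED,
                         allowZip64=True) as zf:
        for k, v in namedict.items():
            with zf.open(k + ".npy", "w", force_zip64=True) as buf:
                np.lib.format.write_array(buf, np.asanyarray(v),
                                          allow_pickle=False)


# Small stand-in array (hypothetical data, not the array from the question).
data = np.arange(12, dtype=np.float32).reshape(3, 4)

# An in-memory buffer keeps the sketch self-contained; a path string works too.
target = io.BytesIO()
saveCompressed(target, data=data)

# Read the archive back the same way as a file written by numpy.savez().
target.seek(0)
with np.load(target) as npz:
    restored = npz["data"]
```

The keyword name passed to saveCompressed() becomes the key in the loaded archive, as with numpy.savez_compressed().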