python, numpy, python-zipfile

How do I know whether a .npz file was compressed or not?


Given a .npz file written by np.savez or np.savez_compressed and loaded with np.load, is there any way to check whether the file was compressed or not?

I tried looking through the docs and GitHub. They only explain how the file gets compressed, not how to tell afterwards whether it was.


Solution

  • np.load returns an NpzFile object, which exposes the underlying zipfile.ZipFile through its zip attribute.

    If np.savez was used, the compression type of the archive members is ZIP_STORED; if np.savez_compressed was used, it is ZIP_DEFLATED (relevant source code).

    So to wrap it up:

    import numpy
    import zipfile
    
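    def is_compressed(npz_file):
        """Return True if the loaded NpzFile was written with compression."""
        # NpzFile.zip is the underlying zipfile.ZipFile.
        zip_infos = npz_file.zip.infolist()
        if not zip_infos:
            raise RuntimeError("Archive contains no members")
        # savez and savez_compressed use one method for all members,
        # so inspecting the first member is enough.
        compress_type = zip_infos[0].compress_type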
        if compress_type == zipfile.ZIP_STORED:
            return False
        elif compress_type == zipfile.ZIP_DEFLATED:
            return True
        else:
            raise ValueError("Unexpected compression type")
    
    # Example
    a = numpy.array([1, 2, 3])
    numpy.savez("uncompressed.npz", a)
    numpy.savez_compressed("compressed.npz", a)
    # NpzFile keeps the archive open, so use it as a context manager.
    with numpy.load("uncompressed.npz") as u:
        print(is_compressed(u))  # False
    with numpy.load("compressed.npz") as c:
        print(is_compressed(c))  # True
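
Since a .npz file is an ordinary ZIP archive, the same check can probably be done without np.load at all, by opening the file with zipfile directly. The sketch below is one way to do that; the function name npz_compression_types and the "stored"/"deflated" labels are made up for illustration and are not part of NumPy. It looks at every member rather than only the first, which may help with archives that were not produced by NumPy.

```python
import zipfile


def npz_compression_types(path):
    """Return the set of compression labels used by the archive's members.

    Hypothetical helper: the name and the label strings are illustrative only.
    """
    labels = {
        zipfile.ZIP_STORED: "stored",      # what np.savez is expected to produce
        zipfile.ZIP_DEFLATED: "deflated",  # what np.savez_compressed is expected to produce
    }
    with zipfile.ZipFile(path) as archive:
        return {
            # Fall back to the numeric code for any other ZIP method.
            labels.get(info.compress_type, "other ({})".format(info.compress_type))
            for info in archive.infolist()
        }
```

For the two files from the example above, this should return a one-element set for each, containing "stored" for uncompressed.npz and "deflated" for compressed.npz. An empty archive yields an empty set instead of raising.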