python-3.x django zipstream

Getting corrupt zips using Python3 ZipStream in Django


I'm using zipstream from here in a Django view that returns a zip file of all file attachments, which are hosted on Amazon S3. The zip files all come out corrupt when I download them: I can't open them.

import io
import zipstream
s = io.BytesIO()
with zipstream.ZipFile(s,"w", compression=zipstream.ZIP_DEFLATED) as zf:
    for file_path in file_paths:
        file_dir, file_name = os.path.split(file_path)

        zf.writestr(file_name, urllib.urlopen(file_path).read())

response = StreamingHttpResponse(s.getvalue(), content_type='application/octet-stream')
response['Content-Disposition'] = 'attachment; filename={}'.format('files.zip')
return response
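
One thing worth checking in a view like this, separate from the choice of zip library: as far as I know, Django's StreamingHttpResponse expects an iterator that yields bytestrings, while s.getvalue() is a single bytes object. Iterating a bytes object in Python 3 yields integers, one per byte, so the body would not go out as the original byte sequence. A minimal sketch of the difference, using a hypothetical helper iter_chunks that is not part of Django or zipstream:

```python
def iter_chunks(data: bytes, chunk_size: int = 8192):
    """Yield successive bytes slices of data (hypothetical helper)."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


# Stand-in for s.getvalue(); not a real archive
payload = b"PK\x03\x04 example"

# Iterating bytes directly gives ints, one per byte
as_ints = list(payload)[:2]

# Iterating through the helper gives bytes chunks
as_chunks = list(iter_chunks(payload, chunk_size=4))
```

Passing something like iter_chunks(s.getvalue()) to the response would at least keep the bytes intact. Whether the buffer holds a complete archive at that point is a separate question.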

Solution

  • Install the aiozipstream package instead of the zipstream package. If you've already installed zipstream, uninstall it first: run pip uninstall zipstream, then pip install aiozipstream.

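    # The import is written the same way: aiozipstream also provides a module named zipstream
    import os

    import boto3
    from django.http import StreamingHttpResponse
    from zipstream import ZipStream

    # Assumption: s3 is a boto3 S3 client; credentials come from your environment
    s3 = boto3.client('s3')

    def s3_content_generator(key):
        # s3_bucket - your S3 bucket name
        infile_object = s3.get_object(Bucket=s3_bucket, Key=key)
        # Yield the raw bytes unchanged; passing them through str() would corrupt binary files
        yield infile_object['Body'].read()

    # download_zip is a placeholder name for your view
    def download_zip(request):
        files = []
        # filepath - list of S3 keys
        for key in filepath:
            # Name each zip entry after the last path component of its key
            files.append({'stream': s3_content_generator(key), 'name': os.path.basename(key)})

        # a larger chunksize speeds up the download
        zf = ZipStream(files, chunksize=32768)
        response = StreamingHttpResponse(zf.stream(), content_type='application/zip')
        response['Content-Disposition'] = 'attachment; filename={}'.format('files.zip')
        return response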
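
A note on the generator: it needs to yield each object's bytes exactly as S3 returns them. Converting bytes with str() and then re-encoding with bytes(..., 'utf-8') does not round-trip. It encodes the printable repr (the b'...' form) instead, which would corrupt any binary attachment. A small sketch with a made-up payload standing in for the S3 body (roundtrip_via_str is a hypothetical name used only for illustration):

```python
def roundtrip_via_str(data: bytes) -> bytes:
    """Mimic str(data) followed by bytes(..., 'utf-8')."""
    return bytes(str(data), "utf-8")


# First bytes of a PNG file, used here as sample binary data
original = b"\x89PNG\r\n\x1a\n"
mangled = roundtrip_via_str(original)

# mangled now holds the text of the repr, starting with b' and using
# escape sequences such as \x89 spelled out as literal characters
```

Yielding the result of Body.read() directly avoids this, since it is already a bytes object.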