django download zip streaming

What will happen if the reported size of a streaming download is inaccurate?


I implemented a download view in my Django project that builds a zip archive and streams it on the fly.

The archive contains one TSV file and any number of XML files (from a set of search results), organized into a series of directories.

The download works, but the browser shows no progress indicator. We have tested a small download (47 MB) and a large one (3 GB).

I was thinking it would be nice to have a progress bar to give the user some idea of how long the download will take. However, from what I've read, predicting the size of a zip file is tricky and prone to inaccuracy. Since I'm very inexperienced with zip file generation (let alone streaming downloads), I'm wondering...

Are there any alternate solutions for this problem space that I should consider?


Solution

  • To show progress, the response needs a Content-Length header, and you can't send one with a streaming response because you don't know the exact size of the response before you start streaming.

    OK, so what happens if we estimate Content-Length:

    The solution is to do all the work on the server first, so that you can send the file all at once with Content-Length set properly. The trade-off is that this can cause a Gateway Timeout if the compression takes a long time.
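
As a rough illustration of the "do all the work first" approach, here is a minimal sketch that writes the whole archive to a temporary file and measures its exact size before anything is sent. It uses only the standard library. The helper name `build_zip_to_tempfile` and the `members` mapping (archive path to bytes) are made up for this example and are not from the original view.

```python
import os
import tempfile
import zipfile


def build_zip_to_tempfile(members):
    """Write `members` (archive path -> bytes) into a temporary zip file.

    Hypothetical helper for illustration only. Returns a tuple of
    (open file object rewound to the start, exact archive size in bytes).
    """
    tmp = tempfile.TemporaryFile()  # deleted automatically once closed
    with zipfile.ZipFile(tmp, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for arcname, data in members.items():
            archive.writestr(arcname, data)
    # The archive is finished here, so the end-of-file offset is the exact
    # number of bytes the client will receive.
    size = tmp.seek(0, os.SEEK_END)
    tmp.seek(0)  # rewind so the caller can read from the beginning
    return tmp, size
```

In a Django view, the returned file object could then be wrapped in a response and the header set explicitly, for example `response["Content-Length"] = str(size)`. The archive has to be fully built before the first byte goes out, which is exactly where the timeout risk mentioned above comes from on large result sets.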