python http tornado urllib2

How can I download multiple files using urllib2 in a single HTTP request instead of one request per file?


I am using tornado to host a server that acts as the backend for a client that must run Jython 2.5.2 (hence urllib2). My download function has so far only handled text files. I now need to support non-text files and still download them quickly.

To download text files in a timely fashion, I concatenate them into a single string and send them as plain text to the client.

class DownloadHandler(web.RequestHandler):
    def get(self):
        files_as_text = ""
        for file in os.listdir("files"):
            files_as_text += file+"---title_split---"+open("files/"+file).read()+"---file_split---"
        self.write(files_as_text)

Then, on the other side, I can build them back into files.

for uptodate_file in uptodate_files.split("---file_split---")[:-1]:
    uptodate_filename, uptodate_file_contents = uptodate_file.split("---title_split---")
    new_file = open(dir+uptodate_filename, 'wb')
    new_file.write(uptodate_file_contents)
    new_file.close()

This worked great for text files, but when I add any other file into the mix, it no longer works. So, I served each file separately and made individual requests using

urllib2.urlopen(url+uptodate_filename).read()

This works, but it is really slow. Is there some way to combine the two and send concatenated files that are not text?


Solution

  • It turns out the fix is to work with bytes throughout: prefix each string literal with b, encode the filename, and open each file in binary mode ("rb"). With that, non-text files come through intact.

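    import os
    from tornado import web

    class DownloadHandler(web.RequestHandler):
        def get(self):
            # Build the payload as bytes so binary files survive intact.
            files_as_text = b""
            for file in os.listdir("files"):
                # Open in binary mode ("rb") so the contents are read as raw bytes.
                with open("files/" + file, "rb") as f:
                    files_as_text += file.encode("utf-8") + b"---title_split---" + f.read() + b"---file_split---"
            self.write(files_as_text)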
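
For completeness, here is a rough sketch of how the client side might split the byte payload back into files. The helper name split_bundle is hypothetical and not part of the original code. The delimiters are built with .encode("ascii") rather than b"..." literals on the assumption that an older interpreter such as Jython 2.5.2 may not accept the b prefix; on a newer Python the literals would work just as well.

```python
# Hypothetical client-side helper; the delimiters match the handler above.
TITLE_SPLIT = "---title_split---".encode("ascii")
FILE_SPLIT = "---file_split---".encode("ascii")


def split_bundle(payload):
    """Return a list of (filename, content_bytes) pairs from the raw response body."""
    files = []
    # The payload ends with a file delimiter, so the last chunk is empty and is dropped.
    for chunk in payload.split(FILE_SPLIT)[:-1]:
        # Split only on the first title delimiter; the rest of the chunk is file content.
        name, content = chunk.split(TITLE_SPLIT, 1)
        files.append((name.decode("utf-8"), content))
    return files
```

On the client this would be fed the raw response, for example split_bundle(urllib2.urlopen(url).read()), and each content value written out with open(dir+name, 'wb'). One limitation to keep in mind: if a file's bytes happen to contain one of the delimiter strings, the split will cut that file in the wrong place.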