python urllib large-file-upload

Large file (about 3GB) upload with urllib: sock.sendall(data) raises OSError


Environment: Mac OS X El Capitan, Python 3.5.1

I want to upload a file that is about 3GB in size.

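import urllib.request

def read_in_chunks(file_object, chunk_size=4096):
    # Yield successive chunk_size-byte pieces until the file is exhausted.
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

with open('3GB.mov', 'rb') as f:
    # Joins every chunk back into one ~3GB bytes object held in memory.
    data = b''.join(read_in_chunks(f))

# url and headers are defined elsewhere.
req = urllib.request.Request(url, data, headers)
response = urllib.request.urlopen(req)
the_page = response.read()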

The problem is that the upload fails with:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 578, in urlopen
    chunked=chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 362, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1083, in request
    self._send_request(method, url, body, headers)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1128, in _send_request
    self.endheaders(body)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1079, in endheaders
    self._send_output(message_body)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 913, in _send_output
    self.send(message_body)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 885, in send
    self.sock.sendall(data)
OSError: [Errno 22] Invalid argument

Could you give me some advice?


Solution

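Send the file in chunks, one POST per chunk, each with a Content-Range header. This assumes the server accepts ranged uploads.

import os
import requests

url = 'http://domain.com/api/upload'
path = '3GB.mov'
content_size = os.path.getsize(path)  # total file size in bytes
headers = {'Content-Type': 'application/octet-stream'}
index = 0  # offset of the first byte of the current chunk

with open(path, 'rb') as f:
    # read_in_chunks is the generator from the question; a larger
    # chunk_size means fewer requests.
    for chunk in read_in_chunks(f):
        offset = index + len(chunk)
        # The end position in Content-Range is inclusive, hence offset - 1.
        headers['Content-Range'] = 'bytes %s-%s/%s' % (index, offset - 1, content_size)
        index = offset
        try:
            # requests sets Content-Length to len(chunk) on its own.
            r = requests.post(url, data=chunk, headers=headers)
            print("r: %s, Content-Range: %s" % (r, headers['Content-Range']))
        except Exception as e:
            print(e)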
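
If the server does not support ranged uploads, another option is to stream the file in a single request using only the standard library. The sketch below assumes the OSError comes from handing the whole ~3GB bytes object to one sock.sendall() call, as the traceback suggests. When `data` is an open file, http.client reads and sends it in small blocks instead. The function name `upload_streaming` is hypothetical and chosen for illustration.

```python
import os
import urllib.request


def upload_streaming(url, path):
    """POST the file at `path` to `url` without reading it all into memory."""
    size = os.path.getsize(path)
    headers = {
        'Content-Type': 'application/octet-stream',
        # An explicit length avoids a fallback to chunked transfer
        # encoding, which some servers do not accept.
        'Content-Length': str(size),
    }
    with open(path, 'rb') as f:
        # Passing the open file object as `data` lets http.client read and
        # send it block by block rather than in one huge sendall() call.
        req = urllib.request.Request(url, data=f, headers=headers, method='POST')
        with urllib.request.urlopen(req) as response:
            return response.status, response.read()
```

It would be called as `status, body = upload_streaming(url, '3GB.mov')`. Memory use stays small because only one block of the file is held at a time.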