python performance upload webdav multiple-file-upload

How to make uploading multiple files with WebDAV faster?


Currently, I am developing a script that involves uploading using WebDAV. I am able to make (empty) directories and upload files fine. However, I have not been able to find a way to upload an entire directory or multiple files at once.

So, I have to upload a directory by creating each parent directory individually and uploading the files one by one. The more files in a directory, the longer it takes. Using the script below, it takes me around 5 minutes to upload a 100-megabyte Git project.

Importantly, a small number of larger files uploads much faster than a large number of small files. Unfortunately, I can't decompress a file on the website I am uploading to, and I don't think most WebDAV applications support that either, or I would just upload a tarball.

So I was wondering: is there any way to upload multiple files faster using WebDAV?

#!/usr/bin/env python
import subprocess

import webdav4.client

mycon = webdav4.client.Client("https://example.com",
                              auth=("username", "password"))
dicti = {}

# Find directories within the current path.
# find -print0 separates entries with NUL; drop the empty trailing entry.
dicti["dirs"] = [p for p in subprocess.run(
        ('find', '.', '-type', 'd', '-print0'),
        capture_output=True, text=True).stdout.split('\0') if p]

# Find files within the current path
dicti["files"] = [p for p in subprocess.run(
        ('find', '.', '-type', 'f', '-print0'),
        capture_output=True, text=True).stdout.split('\0') if p]

# Create all directories first, then upload the files, one request at a time
for ky in ('dirs', 'files'):
    for i in dicti[ky]:
        if not mycon.exists('dest/' + i):
            if ky == "dirs":
                print(mycon.mkdir('dest/' + i))
            else:
                print(mycon.upload_file(i, 'dest/' + i))

Solution

  • It's the same as for regular HTTP requests. You can issue them in parallel, and you should reuse connections, either with Connection: keep-alive or by using HTTP/2.

    WebDAV doesn't support bulk uploads at the specification level, so different syncing tools use their own schemes. You can check the Nextcloud API for chunked uploads.
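As a rough illustration of the parallel approach, here is a minimal sketch using a thread pool. It only uses the client calls already present in the question (exists, mkdir, upload_file). The helper names collect_tree and upload_tree, the "dest" target, and the worker count of 8 are made up for this example. It also assumes that one client object can be shared between threads; webdav4 is built on httpx, which pools keep-alive connections by default, but that assumption is worth verifying against your server and library version.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def collect_tree(root):
    """Return (dirs, files) as '/'-separated paths relative to root.

    os.walk is top-down by default, so parent directories come before children.
    """
    dirs, files = [], []
    for current, _subdirs, filenames in os.walk(root):
        rel = os.path.relpath(current, root)
        if rel != ".":
            dirs.append(rel.replace(os.sep, "/"))
        for name in filenames:
            rel_file = os.path.normpath(os.path.join(rel, name))
            files.append(rel_file.replace(os.sep, "/"))
    return dirs, files


def upload_tree(client, root, dest, max_workers=8):
    """Create remote directories sequentially, then upload files in parallel.

    `client` is any object offering exists(path), mkdir(path) and
    upload_file(local_path, remote_path), as the webdav4 client in the
    question does. Returns the relative paths that were actually uploaded.
    """
    dirs, files = collect_tree(root)

    # Directories stay sequential: a parent must exist before its children.
    if not client.exists(dest):
        client.mkdir(dest)
    for rel in dirs:
        remote = f"{dest}/{rel}"
        if not client.exists(remote):
            client.mkdir(remote)

    def upload_one(rel):
        remote = f"{dest}/{rel}"
        if client.exists(remote):
            return None  # skip files that are already there, as in the question
        client.upload_file(os.path.join(root, rel), remote)
        return rel

    # Files are independent of each other, so their requests can overlap.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(upload_one, files))
    return [rel for rel in results if rel is not None]
```

With the client from the question, this would be called as upload_tree(mycon, ".", "dest"). The time saved comes mostly from overlapping the per-request latency of many small files, so the best worker count depends on the server; too many parallel requests may get throttled.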