python, dask, dask-distributed, dask-kubernetes

Dask Distributed: How to delete uploaded files from the cluster


Is there a function in dask.distributed that removes files uploaded to the cluster with client.upload_file()?

Basically, I am looking for the opposite of the upload_file() function.


Solution

  • There is currently no function that does this directly. As a workaround, you can use client.run with os.remove:

    import os

    # Delete the uploaded file from every worker's local directory
    client.run(lambda dask_worker: os.remove(os.path.join(dask_worker.local_directory, "file.py")))
    

    where file.py can be replaced by the name of the file that you uploaded.

    Note that the function's parameter must be named dask_worker: client.run checks for a parameter with that name and, if it finds one, passes in the Worker instance. Relevant docs: https://distributed.dask.org/en/latest/api.html#distributed.Client.run
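
The one-liner calls os.remove directly, which raises FileNotFoundError on any worker where the file is already gone. A slightly more defensive variant could look like the sketch below. Here remove_uploaded_file is a hypothetical helper name, not part of dask.distributed, and the sketch assumes the file sits in the worker's local_directory, as in the workaround above.

```python
import os


def remove_uploaded_file(filename, dask_worker):
    """Delete `filename` from this worker's local directory, if present.

    Hypothetical helper (not part of dask.distributed). The parameter must be
    named `dask_worker` so that client.run passes in the Worker instance.
    """
    path = os.path.join(dask_worker.local_directory, filename)
    try:
        os.remove(path)
    except FileNotFoundError:
        return False  # nothing to delete on this worker
    return True  # the file was deleted
```

Called as client.run(remove_uploaded_file, "file.py"), this should return a dict mapping each worker address to True or False. Keep in mind that deleting the file does not unload a module that a worker has already imported.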