python kubernetes dask dask-kubernetes

dask kubernetes import local library


When working on a local project, "from local_project.funcs import local_func" fails on the cluster because local_project is not installed on the workers.

This forces me to develop everything in a single file.

Solutions? Is there a way to "import" the contents of the module into the working file so that the cluster doesn't need to import it?

Installing local_project in the cluster is not development-friendly, because any change to an imported function requires redeploying the cluster.

import dask
from dask_kubernetes import KubeCluster, make_pod_spec
from local_project.funcs import local_func

pod_spec = make_pod_spec(
    image="daskdev/dask:latest",
    memory_limit="4G",
    memory_request="4G",
    cpu_limit=1,
    cpu_request=1,
)
cluster = KubeCluster(pod_spec)

df = dask.datasets.timeseries()
df.groupby('id').apply(local_func)  # fails if local_project is not installed in the cluster


Solution

  • Typically the solution is to build your own Docker image with the package installed. If you have only a single file, or an egg or zip file, you might also look into the Client.upload_file method.
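
As a rough illustration of the upload_file route, the sketch below zips a local package and hands the archive to a client. The helper names zip_package and upload_package are hypothetical, not part of Dask. The code assumes local_project is a regular package with an __init__.py, and that client is a dask.distributed Client already connected to the cluster.

```python
import os
import tempfile
import zipfile


def zip_package(package_dir, zip_path=None):
    """Zip a package so that the archive root contains the package folder.

    Hypothetical helper: only .py files are included, __pycache__ is skipped.
    """
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    if zip_path is None:
        zip_name = os.path.basename(package_dir) + ".zip"
        zip_path = os.path.join(tempfile.mkdtemp(), zip_name)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(package_dir):
            # Prune bytecode caches in place so os.walk does not descend into them.
            dirs[:] = [d for d in dirs if d != "__pycache__"]
            for name in files:
                if name.endswith(".py"):
                    full = os.path.join(root, name)
                    # Store paths relative to the parent directory, so that
                    # "import local_project" resolves from the archive root.
                    zf.write(full, os.path.relpath(full, parent))
    return zip_path


def upload_package(client, package_dir):
    """Zip package_dir and send it to the workers via client.upload_file.

    Hypothetical helper; `client` is assumed to be a dask.distributed.Client.
    """
    zip_path = zip_package(package_dir)
    client.upload_file(zip_path)
    return zip_path
```

With the cluster from the question, usage might look like client = Client(cluster) followed by upload_package(client, "local_project"), repeated after each local change to the package. How workers that join after the upload are handled may depend on the distributed version, so it is worth checking the documentation for the version in use.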