In dask's LocalCluster, there is a parameter memory_limit. I can't find details in the documentation (https://distributed.dask.org/en/latest/worker.html#memory-management) about whether the limit is per worker, per thread, or for the whole cluster. This is probably at least in part because I have trouble following how keywords are passed around.
For example, in this code:
cluster = LocalCluster(n_workers=2,
                       threads_per_worker=4,
                       memory_target_fraction=0.95,
                       memory_limit='32GB')
will that be 32 GB for each worker? For both workers together? Or for each thread?
My question is motivated partly by running a LocalCluster with n_workers=1 and memory_limit='32GB', which gets killed by the Slurm Out-Of-Memory killer for using too much memory.
The memory_limit keyword argument to LocalCluster sets the limit per worker.
Related documentation: https://github.com/dask/distributed/blob/7bf884b941363242c3884b598205c75373287190/distributed/deploy/local.py#L76-L78
Note that if the memory_limit given is greater than the available memory, each worker will instead be assigned the total available memory. This behavior hasn't been documented yet, but there is a relevant issue here: https://github.com/dask/dask/issues/8224
For example, this cluster setup:
from dask.distributed import LocalCluster, Client
cluster = LocalCluster(n_workers=2,
                       threads_per_worker=4,
                       memory_target_fraction=0.95,
                       memory_limit='8GB')
client = Client(cluster)
client
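To see the per-worker semantics directly, you can ask the scheduler for each worker's reported limit. This is a minimal sketch (using a small illustrative limit of 1 GB so it runs on any machine, and processes=False to keep the workers in-process):

```python
from dask.distributed import LocalCluster, Client

# Two workers, each limited to 1 GB; the cluster total is n_workers * memory_limit.
cluster = LocalCluster(n_workers=2,
                       threads_per_worker=1,
                       processes=False,
                       memory_limit='1GB')
client = Client(cluster)

# Each worker reports its own memory_limit (in bytes) to the scheduler.
limits = [w['memory_limit'] for w in client.scheduler_info()['workers'].values()]
print(limits)       # one entry per worker, each about 1e9 bytes
print(sum(limits))  # the aggregate budget across the cluster

client.close()
cluster.close()
```

If the printed per-worker values add up to more than you requested from Slurm, the job will be a candidate for the OOM killer even though each individual worker stays under its own limit.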