I'm posting here because I don't understand the problem I'm running into. I created a Dockerfile and a Compose file that start one Dask scheduler and two workers:
docker-compose.yaml:
version: '3.8'
services:
  dask-scheduler:
    build:
      context: .
      dockerfile: dask.Dockerfile
    command: ["dask", "scheduler", "--host", "0.0.0.0"]
    ports:
      - "50101:8786"
      - "50100:8787"
    networks:
      - default
  dask-worker:
    build:
      context: .
      dockerfile: dask.Dockerfile
    command: ["dask", "worker", "dask-scheduler:8786", "--memory-limit", "4G"]
    deploy:
      mode: replicated
      replicas: 2
    networks:
      - default
dask.Dockerfile:
FROM python:3.11.0-bullseye
RUN apt update -y && \
    apt upgrade -y
RUN apt-get install -y \
    rustc \
    libpq-dev
RUN pip install --upgrade pip
RUN pip install setuptools_rust
RUN pip install \
    dask[complete] \
    bokeh \
    lz4
EXPOSE 8786
EXPOSE 8787
When I connect to the cluster with a client from a notebook, I have no problem. I can even run a test task with: client.submit(np.random.random, 2903192, pure=False).key
But when I call read_sql_table, the notebook kernel crashes.
On the scheduler, I only get this:
dask-scheduler-1 | 2024-01-22 10:11:09,823 - distributed.scheduler - INFO - Receive client connection: Client-9063b1d4-b90e-11ee-9f28-a652689ec955
dask-scheduler-1 | 2024-01-22 10:11:09,824 - distributed.core - INFO - Starting established connection to tcp://192.168.65.1:56693
dask-scheduler-1 | 2024-01-22 10:11:12,921 - distributed.core - INFO - Connection to tcp://192.168.65.1:56693 has been closed.
dask-scheduler-1 | 2024-01-22 10:11:12,921 - distributed.scheduler - INFO - Remove client Client-9063b1d4-b90e-11ee-9f28-a652689ec955
dask-scheduler-1 | 2024-01-22 10:11:12,922 - distributed.scheduler - INFO - Close client connection: Client-9063b1d4-b90e-11ee-9f28-a652689ec955
Nothing is sent to any worker.
Here's the read_sql code:
df = dd.read_sql_table(
    table_name="table",
    index_col='stock_qty',
    con="postgresql+psycopg2://username:password@IP:PORT/RAW"
)
Do you know what could be the problem?
I resolved the problem myself.
I created a new Anaconda environment, and the problem went away without any other change.
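Since a fresh environment fixed it, one plausible (but unconfirmed) explanation is that the old environment had a broken or mismatched package. Here is a minimal sketch, using only the standard library, for listing the installed versions of the relevant packages so the notebook environment can be compared with the Docker image. The helper name `installed_versions` and the package list are my own assumptions, not something taken from the setup above.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed in the current environment map to None.
    (Hypothetical helper, written for this example only.)
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of packages involved in read_sql_table with PostgreSQL;
    # adjust it to whatever your environment actually uses.
    packages = [
        "dask",
        "distributed",
        "pandas",
        "sqlalchemy",
        "psycopg2",
        "psycopg2-binary",
    ]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this once in the notebook environment and once inside a container (for example through `docker compose exec dask-worker python`) gives two lists to compare. With a connected client, `client.get_versions(check=True)` from distributed performs a similar comparison between client, scheduler, and workers.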