I tried to run several scrapyd services as a simple cluster on localhost, but only the first node works. For the other two I get the following errors:
scrapydweb_1 | [2020-11-17 07:17:32,738] ERROR in scrapydweb.utils.check_app_config: HTTPConnectionPool(host='scrapyd_node_3', port=6802): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb060b8ef50>: Failed to establish a new connection: [Errno 111] Connection refused'))
scrapydweb_1 | [2020-11-17 07:17:32,738] ERROR in scrapydweb.utils.check_app_config: HTTPConnectionPool(host='scrapyd_node_2', port=6801): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb060a1e650>: Failed to establish a new connection: [Errno 111] Connection refused'))
I have the following docker-compose.yml file:
version: '3'
services:
  scrapyd_node_1:
    build:
      context: .
      dockerfile: ./crawlers/scrapyd/Dockerfile
    ports:
      - "6800:6800"
    volumes:
      - ./data:/var/lib/scrapyd
      - ./data/results:/app/results
    restart: unless-stopped
  scrapyd_node_2:
    build:
      context: .
      dockerfile: ./crawlers/scrapyd/Dockerfile
    ports:
      - "6801:6800"
    volumes:
      - ./data:/var/lib/scrapyd
      - ./data/results:/app/results
    restart: unless-stopped
  scrapyd_node_3:
    build:
      context: .
      dockerfile: ./crawlers/scrapyd/Dockerfile
    ports:
      - "6802:6800"
    volumes:
      - ./data:/var/lib/scrapyd
      - ./data/results:/app/results
    restart: unless-stopped
  scrapydweb:
    build:
      context: .
      dockerfile: ./crawlers/scrapydweb/Dockerfile
    environment:
      USERNAME: "test"
      PASSWORD: "test"
      SCRAPYD_SERVERS: "scrapyd_node_1:6800,scrapyd_node_2:6801,scrapyd_node_3:6802"
    links:
      - scrapyd_node_1
      - scrapyd_node_2
      - scrapyd_node_3
    ports:
      - "5000:5000"
    depends_on:
      - scrapyd_node_1
      - scrapyd_node_2
      - scrapyd_node_3
    restart: unless-stopped
What is wrong with my docker-compose file?
The problem is in this line:
SCRAPYD_SERVERS: "scrapyd_node_1:6800,scrapyd_node_2:6801,scrapyd_node_3:6802"
Try changing it to:
SCRAPYD_SERVERS: "scrapyd_node_1:6800,scrapyd_node_2:6800,scrapyd_node_3:6800"
Explanation:
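When you defined the Docker service scrapyd_node_2, for instance, you declared its ports as:
ports:
  - "6801:6800"
This means that port 6800 in the container is mapped to port 6801 on your host machine. That mapping only affects traffic arriving from the host; containers on the same Compose network reach each other directly on the container port. Hence, when you refer to the node by its hostname scrapyd_node_2 from another container, you should use its container port: scrapyd_node_2:6800.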
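To make the distinction concrete, here is a minimal sketch of only the scrapydweb part of the file with the corrected addresses. It assumes everything else in the docker-compose.yml stays as posted, and that the scrapydweb image reads SCRAPYD_SERVERS the way the question's setup implies.

```yaml
services:
  scrapydweb:
    environment:
      # Container-to-container traffic uses the service name plus the
      # container port (the right-hand side of "6801:6800"), so every
      # node is addressed on 6800 here.
      SCRAPYD_SERVERS: "scrapyd_node_1:6800,scrapyd_node_2:6800,scrapyd_node_3:6800"
```

The host-side ports are still useful from outside Docker: with the mappings above, the second and third nodes should answer on localhost:6801 and localhost:6802 from the host machine.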