I run Portainer and Traefik on an IONOS VPS. Overnight the setup broke: I can no longer reach any dashboard or service on the server.
Further investigation suggests a problem with certificate generation, but I can't say for sure whether that is the only problem.
I hadn't touched the setup in a few months, and now it no longer works. To be clear, the docker-compose.yml in this post is slightly edited: I changed the Traefik image tag from latest to v3. Other than that, it is exactly the same file that has been running for over half a year.
What I have tried so far:
If anyone has an idea what the problem might be, I'm open to suggestions or solutions.
services:
  traefik:
    container_name: traefik
    image: "traefik:v3"
    healthcheck:
      test: ["CMD", "traefik", "healthcheck", "--ping"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s
    restart: always
    command:
      - --providers.docker
      - --providers.docker.exposedbydefault=false
      - --providers.docker.network=traefik-proxy
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --api.dashboard=true
      - --ping=true
      - --providers.docker
      - --log.level=ERROR
      - --certificatesresolvers.leresolver.acme.httpchallenge=true
      - --certificatesresolvers.leresolver.acme.email=notyourproblem
      - --certificatesresolvers.leresolver.acme.storage=./acme.json
      - --certificatesresolvers.leresolver.acme.httpchallenge.entrypoint=web
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "./acme.json:/acme.json"
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=traefik-proxy"
      - "traefik.http.routers.dashboard.rule=Host(`nope`) && (PathPrefix(`/shush`) || PathPrefix(`/nope`))"
      - "traefik.http.routers.dashboard.entrypoints=websecure"
      - "traefik.http.routers.dashboard.service=api@internal"
      - "traefik.http.routers.dashboard.middlewares=auth"
      - "traefik.http.services.dashboard.loadbalancer.server.port=8080"
      - "traefik.http.routers.http-catchall.rule=hostregexp(`{host:.+}`)"
      - "traefik.http.routers.http-catchall.entrypoints=web"
      - "traefik.http.routers.http-catchall.middlewares=redirect-to-https"
      - "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"
      - "traefik.http.middlewares.auth.basicauth.users=td3v:noneofyourconcern"
      - "traefik.http.routers.dashboard.tls.certresolver=leresolver"
    networks:
      - traefik-proxy
  portainer:
    container_name: portainer
    image: portainer/portainer-ce:latest
    command: -H unix:///var/run/docker.sock
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - portainer_data:/data
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=traefik-proxy"
      - "traefik.http.routers.frontend.rule=Host(`nope`)"
      - "traefik.http.routers.frontend.entrypoints=websecure"
      - "traefik.http.services.frontend.loadbalancer.server.port=9000"
      - "traefik.http.routers.frontend.service=frontend"
      - "traefik.http.routers.frontend.tls.certresolver=leresolver"
      - "traefik.http.routers.edge.rule=Host(`nope`)"
      - "traefik.http.routers.edge.entrypoints=websecure"
      - "traefik.http.services.edge.loadbalancer.server.port=8000"
      - "traefik.http.routers.edge.service=edge"
      - "traefik.http.routers.edge.tls.certresolver=leresolver"
    networks:
      - traefik-proxy
      - internal
volumes:
  portainer_data:
networks:
  traefik-proxy:
    external: true
  internal:
    external: false
Network list of docker
Firewall rules in the ionos panel
logs:
Creating network "portainer-traefik_internal" with the default driver
Creating traefik ... done
Creating portainer ... done
Attaching to traefik, portainer
portainer | 2024/11/07 03:37PM INF github.com/portainer/portainer/api/cmd/portainer/main.go:369 > encryption key file not present | filename=portainer
portainer | 2024/11/07 03:37PM INF github.com/portainer/portainer/api/cmd/portainer/main.go:392 > proceeding without encryption key |
portainer | 2024/11/07 03:37PM INF github.com/portainer/portainer/api/database/boltdb/db.go:125 > loading PortainerDB | filename=portainer.db
portainer | 2024/11/07 03:37PM INF github.com/portainer/portainer/api/chisel/service.go:198 > Found Chisel private key file on disk | private-key=/data/chisel/private-key.pem
portainer | 2024/11/07 15:37:44 server: Reverse tunnelling enabled
portainer | 2024/11/07 15:37:44 server: Fingerprint IShTAt+SYsnH0uQO8hqboTUl+fzRKtZT3mgqg33mF3k=
portainer | 2024/11/07 15:37:44 server: Listening on http://0.0.0.0:8000
portainer | 2024/11/07 03:37PM INF github.com/portainer/portainer/api/cmd/portainer/main.go:649 > starting Portainer | build_number=35428 go_version=1.20.5 image_tag=linux-amd64-2.19.4 nodejs_version=18.19.0 version=2.19.4 webpack_version=5.88.1 yarn_version=1.22.21
portainer | 2024/11/07 03:37PM INF github.com/portainer/portainer/api/http/server.go:357 > starting HTTPS server | bind_address=:9443
portainer | 2024/11/07 03:37PM INF github.com/portainer/portainer/api/http/server.go:341 > starting HTTP server | bind_address=:9000
traefik | 2024-11-07T15:37:47Z ERR Unable to obtain ACME certificate for domains error="cannot get ACME client get directory at 'https://acme-v02.api.letsencrypt.org/directory': Get \"https://acme-v02.api.letsencrypt.org/directory\": dial tcp 172.65.32.248:443: connect: connection refused" ACME CA=https://acme-v02.api.letsencrypt.org/directory acmeCA=https://acme-v02.api.letsencrypt.org/directory domains=[""] providerName=leresolver.acme routerName=edge@docker rule=Host(``)
traefik | 2024-11-07T15:37:47Z ERR Unable to obtain ACME certificate for domains error="cannot get ACME client get directory at 'https://acme-v02.api.letsencrypt.org/directory': Get \"https://acme-v02.api.letsencrypt.org/directory\": dial tcp 172.65.32.248:443: connect: connection refused" ACME CA=https://acme-v02.api.letsencrypt.org/directory acmeCA=https://acme-v02.api.letsencrypt.org/directory domains=[""] providerName=leresolver.acme routerName=frontend@docker rule=Host(``)
traefik | 2024-11-07T15:37:54Z ERR Unable to obtain ACME certificate for domains error="cannot get ACME client get directory at 'https://acme-v02.api.letsencrypt.org/directory': Get \"https://acme-v02.api.letsencrypt.org/directory\": dial tcp 172.65.32.248:443: connect: connection refused" ACME CA=https://acme-v02.api.letsencrypt.org/directory acmeCA=https://acme-v02.api.letsencrypt.org/directory domains=[""] providerName=leresolver.acme routerName=frontend@docker rule=Host(``)
traefik | 2024-11-07T15:37:54Z ERR Unable to obtain ACME certificate for domains error="cannot get ACME client get directory at 'https://acme-v02.api.letsencrypt.org/directory': Get \"https://acme-v02.api.letsencrypt.org/directory\": dial tcp 172.65.32.248:443: connect: connection refused" ACME CA=https://acme-v02.api.letsencrypt.org/directory acmeCA=https://acme-v02.api.letsencrypt.org/directory domains=[""] providerName=leresolver.acme routerName=edge@docker rule=Host(``)
traefik | 2024-11-07T15:37:54Z ERR Unable to obtain ACME certificate for domains error="cannot get ACME client get directory at 'https://acme-v02.api.letsencrypt.org/directory': Get \"https://acme-v02.api.letsencrypt.org/directory\": dial tcp 172.65.32.248:443: connect: connection refused" ACME CA=https://acme-v02.api.letsencrypt.org/directory acmeCA=https://acme-v02.api.letsencrypt.org/directory domains=[""] providerName=leresolver.acme routerName=dashboard@docker rule="Host(``) && (PathPrefix(`/`) || PathPrefix(`/`))"
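The "connection refused" in these lines happens while opening the TCP connection to port 443, before any ACME challenge takes place, so it can help to test plain outbound connectivity separately from Traefik. The following is only a minimal sketch, assuming bash (with /dev/tcp support) and the coreutils timeout command; the function name probe_tcp is made up for this example.

```shell
#!/usr/bin/env bash
# probe_tcp HOST PORT
# Hypothetical helper: returns 0 if a TCP connection to HOST:PORT can be
# opened within 5 seconds, non-zero otherwise (refused, unreachable, timeout).
probe_tcp() {
  local host=$1 port=$2
  # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT.
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example usage on the host:
#   probe_tcp acme-v02.api.letsencrypt.org 443 && echo reachable || echo blocked
#
# The same check from inside a container on the proxy network (assumption:
# the public curlimages/curl image is acceptable for a one-off test):
#   docker run --rm --network traefik-proxy curlimages/curl \
#     -sS -m 10 -o /dev/null https://acme-v02.api.letsencrypt.org/directory
```

If the host can connect but a container on traefik-proxy cannot, the problem is more likely in Docker's bridge/NAT setup than in the Traefik configuration.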
For anyone stumbling onto this question in the future: I managed to solve it after a lot more research. What I had to do was a hard reset of the network config on the host machine:
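# run as root
# stop every process whose name matches "docker" (dockerd, docker-proxy, ...)
pkill docker
# flush all rules in the nat table
iptables -t nat -F
# take the docker0 bridge interface down and delete the bridge
ifconfig docker0 down
brctl delbr docker0
# remove the interface in case it still exists (fails harmlessly if it is already gone)
ip link del docker0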
That's it. After that I ran docker-compose and everything worked.
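Because flushing the nat table and deleting the bridge affects every container on the host, it may be worth previewing the commands first. Below is a rough sketch of the same steps behind a dry-run guard; the DRY_RUN variable and the function names run and reset_docker_network are invented for this example and are not part of Docker or any other tool.

```shell
#!/usr/bin/env bash
# DRY_RUN is a hypothetical switch for this sketch. It defaults to 1, so the
# commands are only printed. Set DRY_RUN=0 (as root) to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# The same five steps as above, in the same order.
reset_docker_network() {
  run pkill docker
  run iptables -t nat -F
  run ifconfig docker0 down
  run brctl delbr docker0
  # docker0 is usually gone after brctl delbr, so a failure here is tolerated.
  run ip link del docker0 || true
}

reset_docker_network
```

With the default DRY_RUN=1 this only lists the five commands; nothing on the host is changed.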