I have a basic container that opens an SSH tunnel to a machine. Recently I noticed that the container had exited with error code 255 and an error message saying the task already exists:
"Id": "7eb92418992a1a1c3e44d6b47257dc503d4fa4d0f26050956533d617ac369479",
"Created": "2022-08-29T18:19:41.286843867Z",
"Path": "sh",
"Args": [
"-c",
"apk update && apk add openssh-client &&\n chmod 400 ~/.ssh/abc.pem\n while true; do \n exec ssh -o StrictHostKeyChecking=no -i ~/.ssh/abc.pem -nNT -L *:33333:localhost:5001 abc@192.168.1.1; \n done"
],
"State": {
"Status": "exited",
"Running": false,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 0,
"ExitCode": 255,
"Error": "task 7eb92418992a1a1c3e44d6b47257dc503d4fa4d0f26050956533d617ac369479: already exists",
"StartedAt": "2022-08-30T19:43:58.575463029Z",
"FinishedAt": "2022-08-30T19:51:23.511624168Z"
},
More importantly, even though the restart policy is always, the Docker engine did not restart the container after it exited.
abc:
  container_name: abc
  image: alpine:latest
  restart: always
  command: >
    sh -c "apk update && apk add openssh-client &&
      chmod 400 ~/.ssh/${PEM_FILENAME}
      while true; do
        exec ssh -o StrictHostKeyChecking=no -i ~/.ssh/${PEM_FILENAME} -nNT -L *:33333:localhost:5001 abc@${IP};
      done"
  volumes:
    - ./ssh:/root/.ssh:rw
  expose:
    - 33333
How can the error task already exists happen?
Update 1:
From the Docker documentation on restart policies:
A restart policy only takes effect after a container starts successfully. In this case, starting successfully means that the container is up for at least 10 seconds and Docker has started monitoring it. This prevents a container which does not start at all from going into a restart loop.
Since we have:
"StartedAt": "2022-08-30T19:43:58.575463029Z",
"FinishedAt": "2022-08-30T19:51:23.511624168Z"
then FinishedAt - StartedAt ~ 8 seconds < 10 seconds
that's why the Docker engine is not restarting the container. I don't think this is good logic: the Docker engine should have a retry mechanism that, for instance, retries at least 3 times before giving up.
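To check how long the container actually ran before it exited, the same timestamps (and the exit code) can be read directly from docker inspect, assuming the container is named abc as in the compose file above:

docker inspect --format '{{.State.StartedAt}} -> {{.State.FinishedAt}} (exit code {{.State.ExitCode}})' abc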
I would suggest this solution:
Create a Dockerfile in an empty folder:
FROM alpine:latest
RUN apk update && apk add openssh-client
Build the image:
docker build -t alpinessh .
Run it with docker run:
docker run -d \
--restart "always" \
--name alpine_ssh \
-u $(id -u):$(id -g) \
-v $HOME/.ssh:/user/.ssh \
-p 33333:33333 \
alpinessh \
ssh -o StrictHostKeyChecking=no -i /user/.ssh/${PEM_FILENAME} -nNT -L :33333:localhost:5001 abc@${IP}
(make sure to set the env variables that you need)
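For example, assuming the same key file name and host IP as in the original compose file (placeholder values, adjust them to your setup):

export PEM_FILENAME=abc.pem
export IP=192.168.1.1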
Running with docker-compose follows the same logic.
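A rough compose equivalent could look like the sketch below (untested; it assumes the alpinessh image built above and that UID and GID are exported on the host, since compose cannot call id itself):

services:
  alpine_ssh:
    container_name: alpine_ssh
    image: alpinessh
    restart: always
    user: "${UID}:${GID}"   # export these first, e.g. export UID GID=$(id -g)
    ports:
      - "33333:33333"
    volumes:
      - ${HOME}/.ssh:/user/.ssh
    command: >
      ssh -o StrictHostKeyChecking=no -i /user/.ssh/${PEM_FILENAME}
      -nNT -L :33333:localhost:5001 abc@${IP}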
**NOTE**
Mapping ~/.ssh inside the container is not the best idea. It would be better to copy the key to a different location and use it from there. The reason is: inside the container you are root, so any files created in your ~/.ssh by the container would be created/accessed by root (uid=0). For example known_hosts: if you don't already have one, you will get a fresh new one owned by root.
For this reason I am running the container as the current UID:GID of the host.
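One simple way to follow that advice (a sketch, paths are only examples): copy just the key into a dedicated directory on the host and mount that directory instead of the whole ~/.ssh, so any files the container creates (such as known_hosts) stay out of your real ~/.ssh:

mkdir -p ./ssh-key
cp ~/.ssh/abc.pem ./ssh-key/
chmod 400 ./ssh-key/abc.pem
# then mount it with: -v $(pwd)/ssh-key:/user/.ssh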