I have a NodeJS app running inside a k8s pod and I want to take a heap dump of it.
In NodeJS taking a heap dump is time-consuming and blocks the main thread, so the pod is not able to respond to k8s liveness probes and is occasionally SIGKILLed.
Is there a way to prevent such behavior? For example, can I stop liveness probes for a pod at runtime for, let's say, 10 minutes until the dump is ready? Or are there any known practices for handling cases like mine?
There is an open issue in k8s with a request similar to mine.
In the end I replaced the HTTP probe with an exec probe that skips the HTTP check while a temporary marker file exists:
#!/bin/sh
[ -f "/tmp/liveness-status" ] || curl -f http://localhost:8081/status >/dev/null 2>&1
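One risk with a marker file is that it gets left behind, which would disable the liveness check indefinitely. A possible refinement, only as a sketch, is to honor the marker for a limited time (the 10 minutes from the question). The function names, the `MAX_AGE_MIN` and `STATUS_URL` variables, and the use of `find -mmin` are my assumptions; `-mmin` is not POSIX, although GNU find and common BusyBox builds support it.

```shell
#!/bin/sh
# Sketch of a probe script that ignores stale marker files.
# MARKER, MAX_AGE_MIN and STATUS_URL are illustrative names, not part of k8s.
MARKER="${MARKER:-/tmp/liveness-status}"
MAX_AGE_MIN="${MAX_AGE_MIN:-10}"
STATUS_URL="${STATUS_URL:-http://localhost:8081/status}"

# Succeeds only if the marker exists and was modified
# less than MAX_AGE_MIN minutes ago.
marker_is_fresh() {
    [ -f "$MARKER" ] && [ -n "$(find "$MARKER" -mmin "-$MAX_AGE_MIN" 2>/dev/null)" ]
}

# Passes while a fresh marker exists; otherwise falls back to the HTTP check.
liveness_check() {
    marker_is_fresh || curl -f "$STATUS_URL" >/dev/null 2>&1
}

# When used as the exec probe, end the script with a call to: liveness_check
```

With this variant a forgotten marker stops suppressing the check after `MAX_AGE_MIN` minutes, and re-running `touch` on the marker extends the window if the task needs longer.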
So to run a long-running task (such as taking a dump) on a pod, I first create the file:
kubectl exec <pod> -- touch /tmp/liveness-status
and remove it once the task has finished:
kubectl exec <pod> -- rm /tmp/liveness-status
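To avoid forgetting to clean up the marker, creating it, running the task, and removing it can be wrapped in one helper. This is a minimal sketch: `with_liveness_paused` and the `KUBECTL` and `MARKER` variables are hypothetical names of mine, and the command that actually triggers the dump is left to the caller.

```shell
#!/bin/sh
# Sketch: suppress the liveness check for one pod while a command runs in it.
# KUBECTL and MARKER are illustrative overrides, not standard variables.
KUBECTL="${KUBECTL:-kubectl}"
MARKER="${MARKER:-/tmp/liveness-status}"

# Usage: with_liveness_paused <pod> <command> [args...]
with_liveness_paused() {
    wlp_pod="$1"
    shift
    # Create the marker so the exec probe passes without the HTTP check.
    "$KUBECTL" exec "$wlp_pod" -- touch "$MARKER" || return 1
    # Run the long task inside the pod and remember its exit status.
    "$KUBECTL" exec "$wlp_pod" -- "$@"
    wlp_status=$?
    # Remove the marker whether the task succeeded or failed.
    "$KUBECTL" exec "$wlp_pod" -- rm -f "$MARKER"
    return "$wlp_status"
}
```

It would be called as `with_liveness_paused <pod> <command>`. Note that the sketch does not handle being interrupted (for example with Ctrl-C), in which case the marker would still have to be removed by hand.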
Hope it helps somebody.