I have read the Kubernetes docs but can't find a clear answer to my question. I'm using the official cluster-autoscaler.
From what I understand, seamless updates are easy with the RollingUpdate strategy. I have not found an equivalent "rolling" strategy for scale-down.
EDIT
TL;DR I'm looking for HA on a) a deployment with two or more replicas and b) a single-replica deployment
a) Can be achieved by using PDBs. Check out Fritz's answer. If you need pods scheduled on different nodes, use anti-affinity (Marc's answer).
b) If you're okay with a short disruption, a PDB is the official way to go. If you need a workaround, my answer may serve as inspiration.
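Scale-down behavior can be controlled with a Pod Disruption Budget (PDB).
A PDB is a separate PodDisruptionBudget object rather than a field in your Deployment manifest; it selects the Deployment's pods by label. In it you define either maxUnavailable or minAvailable (only one of the two per budget), which limits how many pods may be down at once during voluntary disruptions such as draining nodes.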
For how to do it, check out the K8s Documentation.
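As a rough sketch, a budget for a Deployment could look something like the manifest below. The name my-app-pdb and the label app: my-app are hypothetical placeholders; the selector has to match the labels your pods actually carry. The policy/v1 API assumes Kubernetes 1.21 or newer (older clusters used policy/v1beta1).

```yaml
apiVersion: policy/v1            # assumes Kubernetes 1.21+; older clusters used policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb               # hypothetical name
spec:
  minAvailable: 1                # keep at least one pod running during voluntary disruptions
  # maxUnavailable: 1            # alternative to minAvailable; set only one of the two
  selector:
    matchLabels:
      app: my-app                # hypothetical label; must match the Deployment's pod labels
```

Note that with a single replica, minAvailable: 1 means the pod's eviction is refused, so the node it runs on cannot be drained while that budget applies.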