I'd like to setup a NAT gateway, using Cloud NAT, so that VMs/Pods in a public GKE cluster use static IP addresses.
The issue I'm facing is that the NAT gateway seems to only be used if VMs have no other options, i.e:
GCP forwards traffic using Cloud NAT only when there are no other matching routes or paths for the traffic.
But in the case of a public GKE cluster, VMs have ephemeral external IPs, so they don't use the gateway.
According to the doc:
If you configure an external IP on a VM's interface [...] NAT will not be performed on such packets. However, alias IP ranges assigned to the interface can still use NAT because they cannot use the external IP to reach the Internet.
And
With this configuration, you can connect directly to a GKE VM via SSH, and yet have the GKE pods/containers use Cloud NAT to reach the Internet.
That's what I want, but I fail to see what precisely to setup here.
What is implied by alias IP ranges assigned to the interface can still use NAT
and how to set this up?
The idea here is that if you use native VPC (Ip alias) for the cluster, your pods will not use SNAT when routing out of the cluster. With no SNAT, the pods will not use the node's external IP and thus should use the Cloud NAT.
Unfortunately, this is not currently the case. While Cloud NAT is still in Beta, certain settings are not fully in place and thus the pods are still using SNAT even with IP aliasing. Because of the SNAT to the node's IP, the pods will not use Cloud NAT.
This being said, why not use a private cluster? It's more secure and will work with Cloud NAT. You can't SSH directly into a node, but A) you can create a bastion VM instance in your project that can SSH using the internal IP flag and B) you generally do not need to SSH into the node on most occassions.