Initial situation: I've created a Compute Engine VM with two NICs, each belonging to a different VPC network. GCP only sets up a default route for nic0, which belongs to the first VPC. For any other NIC, routing must be set up manually for traffic in its VPC to work as expected. This is described well here, and the procedure works:
https://cloud.google.com/vpc/docs/create-use-multiple-interfaces#configuring_policy_routing
The question now is:
How can this additional routing table be made persistent on a Debian machine, i.e. so that it survives reboots?
The following was already tried without success: ip route and ip rule commands in the VM's startup script.
Searching around for the issue, it seems the root cause is that the NICs are set up at system startup by a Google network daemon, and it is undefined when this happens. A workaround that succeeded is simply to wait a few seconds in the startup script, but that is obviously neither a clean nor a bulletproof solution. A more maintainable solution than a startup script would also be preferred.
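For illustration, the fixed wait could be replaced by polling until the interface actually has an address. This is only a sketch of that idea, not something taken from the attempts above: the helper name wait_for_iface and the 30-second default timeout are assumptions, and it presumes the ip tool from iproute2 is installed.

```shell
#!/bin/sh
# wait_for_iface IFACE [TIMEOUT_SECONDS]   (hypothetical helper name)
# Polls once per second until IFACE has an IPv4 address.
# Returns 0 as soon as an address is present, 1 if the timeout expires.
wait_for_iface() {
    iface=$1
    timeout=${2:-30}   # assumed default, adjust as needed
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # "ip -4 -o addr show" prints one line per IPv4 address on the device.
        if ip -4 -o addr show dev "$iface" 2>/dev/null | grep -q 'inet '; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

In a startup script this might be called as "wait_for_iface ens5 60" before the ip route and ip rule commands. It still depends on timing, so it narrows the race rather than removing it.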
Any better advice?
After doing more research on this, I think I have a feasible solution that is maintainable and scalable enough.
Here are the results and the solution:
Compute Engine evidently generates configuration files on the fly for every NIC (in my case ens4 and ens5). The "traditional" configuration files are not used in the usual way.
test@vm-1:~$ cat /etc/network/interfaces
# Include files from /etc/network/interfaces.d:
source-directory /etc/network/interfaces.d
# Cloud images dynamically generate config fragments for newly
# attached interfaces. See /etc/udev/rules.d/75-cloud-ifupdown.rules
# and /etc/network/cloud-ifupdown-helper. Dynamically generated
# configuration fragments are stored in /run:
source-directory /run/network/interfaces.d
test@vm-1:~$ ls -la /etc/network/interfaces.d/
total 0
test@vm-1:~$ ls -la /run/network/interfaces.d/
total 8
drwxr-xr-x 2 root root 80 Dec 13 20:52 .
drwxr-xr-x 3 root root 160 Dec 13 20:52 ..
-rw-r--r-- 1 root root 51 Dec 13 20:52 ens4
-rw-r--r-- 1 root root 51 Dec 13 20:52 ens5
The original file /etc/network/interfaces just serves to include the generated files under /run/network/interfaces.d. You can easily check that they are regenerated at each instance startup.
Placing any additional configuration there may not be the best solution, as it interferes with GCP's standard procedure.
Instead, I added the additional routing configuration in a new script under /etc/network/if-up.d/, like so:
test@vm-1:~$ cat /etc/network/if-up.d/ens5routing
#!/bin/sh
# Called by ifupdown for every interface that comes up; only act on ens5.
if [ "$IFACE" = ens5 ]; then
    # Fetch IP and gateway of the second NIC (index 1) from the metadata server.
    IP=$(curl -s http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/1/ip -H "Metadata-Flavor: Google")
    GW=$(curl -s http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/1/gateway -H "Metadata-Flavor: Google")
    ifconfig ens5 "$IP" netmask 255.255.255.255 broadcast "$IP" mtu 1430
    # The table name rt1 must be known to iproute2 (/etc/iproute2/rt_tables).
    ip route add "$GW" src "$IP" dev ens5 table rt1
    ip route add default via "$GW" dev ens5 table rt1
    ip rule add from "$IP/32" table rt1
    ip rule add to "$IP/32" table rt1
    echo "Routing table for ens5 set up using IP $IP and GW $GW"
fi
Please note that the script must be executable and should NOT have an extension such as .sh (the scripts are started via run-parts, which skips file names containing a dot). Furthermore, the scripts in this directory are called for every network interface that comes up, so make sure yours only acts on the desired NIC(s). The name of the interface coming up is passed to the script in the variable IFACE. For more details on how to build such a script, refer to https://manpages.debian.org/testing/ifupdown/interfaces.5.en.html.
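The script refers to a routing table by the name rt1, and ip only resolves table names that are listed in /etc/iproute2/rt_tables. A minimal sketch of registering the name idempotently could look like this; the function name ensure_rt_table is hypothetical, and the table ID 1 is an assumption that should be replaced by whatever ID is free on your system.

```shell
#!/bin/sh
# ensure_rt_table ID NAME [FILE]   (hypothetical helper name)
# Appends "ID NAME" to the rt_tables file unless NAME is already mapped there.
ensure_rt_table() {
    id=$1
    name=$2
    file=${3:-/etc/iproute2/rt_tables}
    # A matching entry is a non-comment line of the form "<number> <name>".
    if grep -Eq "^[[:space:]]*[0-9]+[[:space:]]+${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$id" "$name" >> "$file"
}
```

Run once as root, for example "ensure_rt_table 1 rt1", when preparing the image. Calling it again leaves the file unchanged.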
To prevent the Google network daemon from overwriting our routing setup, change setup to false in the [NetworkInterfaces] section of /etc/default/instance_configs.cfg:
[NetworkInterfaces]
dhclient_script = /sbin/google-dhclient-script
dhcp_command =
ip_forwarding = true
setup = false
Some further improvements to the if-up script would be necessary to make this solution more generic and suitable for auto-scaling etc. (e.g. dynamic IP address retrieval via metadata). But in general this solution works.
EDIT
Updated the if-up script above to use metadata retrieval via curl to dynamically get the IP and gateway of the Compute Engine instance. For details please refer to the metadata docs.
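One thing to watch with metadata retrieval is that curl -s prints nothing, or an error body, when the metadata server cannot be reached, and that value would then be passed straight to ip. A hedged sketch of a small wrapper plus a sanity check follows; the names nic_metadata and is_ipv4 and the 5-second timeout are assumptions, not part of the original script.

```shell
#!/bin/sh
# nic_metadata INDEX FIELD   (hypothetical helper name)
# Queries one field (e.g. ip or gateway) of network interface INDEX.
# -f makes curl return non-zero on HTTP errors instead of printing the error body.
nic_metadata() {
    curl -sf --max-time 5 -H 'Metadata-Flavor: Google' \
        "http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/$1/$2"
}

# is_ipv4 VALUE   (hypothetical helper name)
# Loose shape check only: four dot-separated groups of 1 to 3 digits.
# It does not verify that each group is within 0..255.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}
```

Inside the if-up script this could be used as "IP=$(nic_metadata 1 ip)" followed by "is_ipv4 "$IP" || exit 1", so that no routes or rules are added from an empty or malformed value.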
The solution now works well for creating machine images and instance templates for auto-scaling.
If you need a hint on how to create a Compute Engine instance template with more than one NIC, please see my GIST for that.