I created ns3 as the router namespace, and ns1 and ns2 as the clients.
ns3 and ns1 are connected by the veth pair veth3_1 (in ns3) and veth1_3 (in ns1).
ns3 and ns2 are connected by the veth pair veth3_2 (in ns3) and veth2_3 (in ns2).
A UDP packet sent from ns1 to ns2 reaches the XDP program attached to veth2_3:
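#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp_ingress")
int xdp_ingress_func(struct xdp_md* ctx) {
    void* data_end = (void*)(long)ctx->data_end;
    void* data = (void*)(long)ctx->data;
    struct ethhdr* eth = data;
    /* Bounds check required by the verifier before touching the header. */
    if ((void*)(eth + 1) > data_end) {
        return XDP_PASS;
    }
    /* Only handle IPv4; h_proto is in network byte order. */
    if (eth->h_proto != bpf_htons(ETH_P_IP)) {
        return XDP_PASS;
    }
    /* Swap source and destination MAC addresses. */
    unsigned char tmp_mac[ETH_ALEN];
    __builtin_memcpy(tmp_mac, eth->h_dest, ETH_ALEN);
    __builtin_memcpy(eth->h_dest, eth->h_source, ETH_ALEN);
    __builtin_memcpy(eth->h_source, tmp_mac, ETH_ALEN);
    /* Send the frame back out of the interface it arrived on. */
    return XDP_TX;
}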
However, when I run tcpdump on veth3_2, I only see the packet going to ns2. I never see the packet being sent back. Here is the shell script that sets up the environment:
ip link add veth1_3 type veth peer name veth3_1
ip link add veth2_3 type veth peer name veth3_2
# ns3
ip link set veth3_1 netns ns3
ip link set veth3_2 netns ns3
ip netns exec ns3 sysctl -w net.ipv4.ip_forward=1
ip netns exec ns3 ip link add name br0 type bridge
ip netns exec ns3 ip link set br0 up
ip netns exec ns3 ip link set veth3_1 master br0
ip netns exec ns3 ip link set veth3_2 master br0
ip netns exec ns3 ip link set veth3_1 up
ip netns exec ns3 ip link set veth3_2 up
ip netns exec ns3 ip addr add 10.0.0.1/8 dev br0
# ns1
ip link set veth1_3 netns ns1
ip netns exec ns1 ip addr add 10.0.0.2/8 dev veth1_3
ip netns exec ns1 ip link set veth1_3 up
ip netns exec ns1 ip link set lo up
ip netns exec ns1 ip route add default via 10.0.0.1
# ns2
ip link set veth2_3 netns ns2
ip netns exec ns2 ip addr add 10.0.0.3/8 dev veth2_3
ip netns exec ns2 ip link set veth2_3 up
ip netns exec ns2 ip link set lo up
ip netns exec ns2 ip route add default via 10.0.0.1
The above is a simplified version of the problem.
In the actual setup, at veth1_3's tc egress, I push an additional IP header and UDP header in front of the original L3 header, turning |IP|TCP| into |IP|UDP|IP|TCP|.
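With that encapsulation, an XDP program on the receiving side sees the outer headers first. A minimal sketch of how the inner IPv4 header could be located is shown here. The helper name find_inner_iphdr is hypothetical and not from the original setup. It assumes the outer IPv4 header carries no options, and that the caller has already checked the EtherType as the program above does.

```c
#include <stddef.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/udp.h>

/* Hypothetical helper (an illustration, not part of the original program):
 * walk an |ETH|IP|UDP|IP|...| frame and return a pointer to the inner IPv4
 * header, or NULL if the frame is too short or is not UDP-encapsulated.
 * Assumption: the outer IPv4 header has no options (ihl == 5), so every
 * offset is a compile-time constant. */
static inline struct iphdr *find_inner_iphdr(void *data, void *data_end)
{
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return NULL;

    struct iphdr *outer = (struct iphdr *)(eth + 1);
    if ((void *)(outer + 1) > data_end)
        return NULL;
    if (outer->ihl != 5 || outer->protocol != IPPROTO_UDP)
        return NULL;

    struct udphdr *udp = (struct udphdr *)(outer + 1);
    if ((void *)(udp + 1) > data_end)
        return NULL;

    struct iphdr *inner = (struct iphdr *)(udp + 1);
    if ((void *)(inner + 1) > data_end)
        return NULL;

    return inner;
}
```

Each step repeats the same data_end bounds check that the XDP program already performs for the Ethernet header, since the verifier needs one before every header access.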
I've checked that the L2 addresses are correct.
I confirmed with bpf_printk that the packet reaches the XDP program and that execution gets as far as return XDP_TX;.
I found the cause in the veth driver's XDP implementation. You need to attach at least a minimal XDP program to the peer interface, one that does nothing but return XDP_PASS;. After that, XDP_TX works normally.
The reason is that XDP_TX transmits the frame through the xdp_ring, and the xdp_ring only works when both ends of the veth pair have an XDP program attached.
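A minimal sketch of such a pass-through program for the peer (veth3_2 in this setup) might look like the following. The function name xdp_pass_func and the section name "xdp" are assumptions for illustration. The SEC fallback macro is only there to keep the snippet self-contained. A real build would normally get SEC from <bpf/bpf_helpers.h>.

```c
#include <linux/bpf.h>

/* Fallback so the snippet stands alone; normally provided by
 * <bpf/bpf_helpers.h> from libbpf. */
#ifndef SEC
#define SEC(name) __attribute__((section(name), used))
#endif

/* Hypothetical name: a program that inspects nothing and hands every
 * frame to the regular network stack. Its only purpose here is to have
 * an XDP program attached on the peer side of the veth pair. */
SEC("xdp")
int xdp_pass_func(struct xdp_md *ctx)
{
    (void)ctx; /* the context is intentionally unused */
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```

Assuming it is compiled to an object file named xdp_pass.o (a hypothetical filename), it could be attached with something like: ip netns exec ns3 ip link set dev veth3_2 xdp obj xdp_pass.o sec xdp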
I noticed that older versions of the veth driver had a fallback for the case where the peer is not using XDP: the packet was transmitted in the normal way. That fallback has been removed in current kernel versions.