I'm interested in using netlink for a straightforward application (reading cgroup stats at high frequency).
The man page cautions that the protocol is not reliable, hinting that the application needs to be prepared to handle dropped packets:
However, reliable transmissions from kernel to user are impossible in any case. The kernel can't send a netlink message if the socket buffer is full: the message will be dropped and the kernel and the user-space process will no longer have the same view of kernel state. It is up to the application to detect when this happens (via the ENOBUFS error returned by recvmsg(2)) and resynchronize.
Since my requirements are simple, I'm fine with just destroying the socket and creating a new one whenever anything unexpected happens. But I can't find any documentation on what the expectations are on my program; the man page for recvmsg(2) doesn't even mention ENOBUFS, for example.
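To make the "detect ENOBUFS, then reset" idea concrete, here is a minimal sketch in Python (Linux only). The helper names `open_netlink` and `recv_checked` are my own, and I'm assuming the generic netlink family (protocol number 16 in `<linux/netlink.h>`), which as far as I can tell is what the taskstats/cgroupstats interface uses. Treat it as an illustration, not a verified implementation.

```python
import errno
import socket

# Assumption: generic netlink; value taken from <linux/netlink.h>.
NETLINK_GENERIC = 16


def open_netlink():
    """Create and bind a fresh netlink socket (hypothetical helper)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, NETLINK_GENERIC)
    # Address is (port id, multicast groups); 0 lets the kernel pick a port id.
    sock.bind((0, 0))
    return sock


def recv_checked(sock, bufsize=65536):
    """Return the received bytes, or None if the kernel reported an overrun.

    None means at least one kernel-to-user message was dropped (ENOBUFS),
    so the caller should resynchronize or throw the socket away.
    Any other OSError is re-raised unchanged.
    """
    try:
        return sock.recv(bufsize)
    except OSError as exc:
        if exc.errno == errno.ENOBUFS:
            return None
        raise
```

On a `None` result the caller would close the socket and call `open_netlink()` again, which matches the "destroy and recreate" plan above.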
What all do I need to worry about in order to make sure I can tell that a request from my application or a response from the kernel has been dropped, so that I can reset everything and start over? It's clear to me that I could do so whenever I receive an error from any of the syscalls involved, but for example what happens if my request is dropped on the way to the kernel? Will I just never receive a response? Do I need to build a timeout mechanism where I wait only so long for a response?
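For reference, this is roughly what I imagine the timeout mechanism would look like: wait a bounded time for a reply whose sequence number matches the request, skipping anything stale. The function name `recv_reply` and the default timeout are my own choices, and whether the timeout is strictly necessary is exactly what I'm unsure about. The header layout is the 16-byte `struct nlmsghdr` from `<linux/netlink.h>`.

```python
import socket
import struct
import time

# struct nlmsghdr: nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid
NLMSG_HDR = struct.Struct("=IHHII")


def recv_reply(sock, expected_seq, timeout_s=1.0, bufsize=65536):
    """Return (msg_type, payload) for the first message with expected_seq.

    Raises socket.timeout if no matching message arrives within timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise socket.timeout("no reply for seq %d" % expected_seq)
        sock.settimeout(remaining)
        data = sock.recv(bufsize)  # may raise socket.timeout or another OSError
        offset = 0
        # One datagram can carry several netlink messages back to back.
        while offset + NLMSG_HDR.size <= len(data):
            length, msg_type, _flags, seq, _pid = NLMSG_HDR.unpack_from(data, offset)
            if length < NLMSG_HDR.size or offset + length > len(data):
                break  # malformed or truncated message: ignore the rest
            if seq == expected_seq:
                return msg_type, data[offset + NLMSG_HDR.size:offset + length]
            offset += (length + 3) & ~3  # same rounding as NLMSG_ALIGN
```

A timeout here would be treated like any other failure: close the socket and start over.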
I found the following in Communicating between the kernel and user-space in Linux using Netlink sockets by Ayuso, Gasca, and Lefevre:
If Netlink fails to deliver a message that goes from kernel to user-space, the recvmsg() function returns the No buffer space available (ENOBUFS) error. Thus, the user-space process knows that it is losing messages [...] On the other hand, buffer overruns cannot occur in communications from user to kernel-space since sendmsg() synchronously passes the Netlink message to the kernel subsystem. If blocking sockets are used, Netlink is completely reliable in communications from user to kernel-space since memory allocations would wait, so no memory exhaustion is possible.
Regarding acks, it looks like worrying about them is optional:
NLM_F_ACK: the user-space application requested a confirmation message from kernel-space to make sure that a given request was successfully performed. If this flag is not set, the kernel-space reports the error synchronously via sendmsg() as errno value.
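If I did want acks, my understanding is that the kernel answers with an NLMSG_ERROR message whose payload starts with a signed error code (0 for a plain ack, a negative errno otherwise), followed by a copy of the original request header. A rough sketch of setting the flag and decoding that reply; `build_request` and `parse_ack` are hypothetical helper names, and the constants are the ones from `<linux/netlink.h>`:

```python
import struct

NLM_F_REQUEST = 0x01
NLM_F_ACK = 0x04
NLMSG_ERROR = 0x02

# struct nlmsghdr: nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid
NLMSG_HDR = struct.Struct("=IHHII")


def build_request(msg_type, seq, payload=b"", want_ack=True):
    """Prefix payload with a netlink header, optionally asking for an ack."""
    flags = NLM_F_REQUEST | (NLM_F_ACK if want_ack else 0)
    length = NLMSG_HDR.size + len(payload)
    # nlmsg_pid 0 addresses the kernel.
    return NLMSG_HDR.pack(length, msg_type, flags, seq, 0) + payload


def parse_ack(message):
    """Decode one netlink message as an ack.

    Returns 0 for a successful ack, a positive errno for a failure,
    or None if the message is not of type NLMSG_ERROR.
    """
    _length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(message, 0)
    if msg_type != NLMSG_ERROR:
        return None
    (error,) = struct.unpack_from("=i", message, NLMSG_HDR.size)
    return -error  # the kernel stores the errno negated
```

For a read-only query where the reply itself confirms success, leaving `want_ack=False` seems to be enough, which is how I read the quoted passage.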
So it sounds like for my simplistic use case I can just use sendmsg and recvmsg naively, reacting to any error (except for EINTR) by starting the whole thing over, perhaps with backoff. My guess is that since I only get one response per request and the responses are tiny, I should never even see ENOBUFS as long as I have only one request in flight at a time.
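In code, the plan would look something like the sketch below: one request in flight, and any OSError (which in Python includes timeouts and ENOBUFS) leads to closing the socket, sleeping with exponential backoff, and retrying on a fresh one. The class name `ResettingClient`, the `open_socket` and `exchange` callables, and the delay values are placeholders of mine. Python has retried EINTR on its own since 3.5, so it needs no special case here.

```python
import time


class ResettingClient:
    """Keeps one socket and replaces it after any failed exchange."""

    def __init__(self, open_socket, base_delay_s=0.01, max_delay_s=1.0):
        self._open = open_socket      # callable returning a new socket
        self._base = base_delay_s
        self._max = max_delay_s
        self._sock = None

    def _drop(self):
        if self._sock is not None:
            self._sock.close()
            self._sock = None

    def call(self, exchange, max_attempts=5):
        """Run exchange(sock): one send followed by one receive.

        On OSError the socket is discarded and the exchange is retried on a
        new socket after a backoff; the last error is re-raised once
        max_attempts is exhausted.
        """
        delay = self._base
        for attempt in range(max_attempts):
            try:
                if self._sock is None:
                    self._sock = self._open()
                return exchange(self._sock)
            except OSError:
                self._drop()
                if attempt == max_attempts - 1:
                    raise
                time.sleep(delay)
                delay = min(delay * 2, self._max)
```

Since `call` blocks until `exchange` returns, there is never more than one request outstanding on the socket.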