Syscalls such as write(2), read(2) and close(2) often fail with the errno value EINTR because they were interrupted by a signal (say the size of the terminal window changed and SIGWINCH was received). This is a transient error and ought to be retried, so code often uses wrappers around these syscalls that retry on EINTR (and often on EAGAIN or ENOBUFS).
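For concreteness, the kind of wrapper meant here might look like the following minimal sketch. The name read_retry_eintr is hypothetical and not taken from any library, and the loop retries only on EINTR, with no upper bound on the number of attempts.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (the name is illustrative, not a library function):
   call read(2) again for as long as it fails with EINTR. */
static ssize_t read_retry_eintr(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n < 0 && errno == EINTR);  /* unbounded retry on EINTR only */

    return n;  /* bytes read, 0 at end of file, or -1 with errno set */
}
```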
In theory, though, code can get stuck looping on EINTR forever, either because signals keep arriving non-stop or because the syscall was intercepted by a custom implementation of that syscall that just returns EINTR.
In such cases, in library code, how many times does it make sense to retry the syscall?
For wrappers around syscalls that retry on EINTR, how many times does retrying make sense?
From zero to infinitely many.
The glibc standard library is in use on billions of devices around the world. Calling printf("Hello world\n"); will end up in the _IO_new_file_write function, which looks like the following (from https://github.com/bminor/glibc/blob/5aa2f79691ca6a40a59dfd4a2d6f7baff6917eb7/libio/fileops.c#L1176 ):
    ssize_t
    _IO_new_file_write (FILE *f, const void *data, ssize_t n)
    {
      ssize_t to_do = n;
      while (to_do > 0)
        {
          ssize_t count = (__builtin_expect (f->_flags2
                                             & _IO_FLAGS2_NOTCANCEL, 0)
                           ? __write_nocancel (f->_fileno, data, to_do)
                           : __write (f->_fileno, data, to_do));
          if (count < 0)
            {
              f->_flags |= _IO_ERR_SEEN;
              break;
            }
          to_do -= count;
          data = (void *) ((char *) data + count);
        }
      n -= to_do;
      if (f->_offset >= 0)
        f->_offset += n;
      return n;
    }
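As you can see, with while (to_do > 0) the function loops with no upper bound until all the data are written. It never inspects errno, so it has no special handling for EINTR and does not even check for it: a short write is simply continued, and only a negative return value ends the loop early.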
Because this code runs on almost every single Linux device around the world, it is safe to say that looping with no retry limit is completely fine.
Now, you may be working with a non-standard implementation of write. For example, on an embedded device the programmer may supply their own implementation of write, such as _write_r when using the Newlib C standard library. If such a programmer endlessly sets errno = EINTR and returns 0 from their write function, I would say that's on them. But if you feel like you want to detect such situations, go ahead. I do not feel there is any need to do it.
The contract of the write function is just that when the number of bytes written is not equal to the number of bytes you wanted to write, you should repeat the call with the data pointer and the count shifted by the amount already written. That's that.
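In code, that contract boils down to a loop like the following minimal sketch. The name write_all is hypothetical, and as in the glibc function quoted earlier there is no retry limit and any negative return is treated as a failure.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write(2) until all n bytes have been
   written or an error is reported. */
static ssize_t write_all(int fd, const void *data, size_t n)
{
    const char *p = data;
    size_t left = n;

    while (left > 0) {
        ssize_t count = write(fd, p, left);
        if (count < 0)
            return -1;           /* errno was set by write(2) */
        p += count;              /* shift the data pointer ... */
        left -= (size_t)count;   /* ... and shrink the remaining count */
    }
    return (ssize_t)n;
}
```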