The standard Linux fcntl call doesn't provide a timeout option. I'm considering implementing a lock with a timeout using a signal.
Here is the description of the blocking lock command:
F_SETLKW
This command shall be equivalent to F_SETLK except that if a shared or exclusive lock is blocked by other locks, the thread shall wait until the request can be satisfied. If a signal that is to be caught is received while fcntl() is waiting for a region, fcntl() shall be interrupted. Upon return from the signal handler, fcntl() shall return -1 with errno set to [EINTR], and the lock operation shall not be done.
So what kind of signal do I need to use to interrupt the lock call? And since there are multiple threads running in my process, I only want to interrupt the I/O thread that is blocking on the file lock; other threads should not be affected. But signals are process-level, so I'm not sure how to handle this situation.
I've written a simple implementation using a signal.
int main(int argc, char **argv) {
    std::string lock_path = "a.lck";
    int fd = open(lock_path.c_str(), O_CREAT | O_RDWR, S_IRWXU | S_IRWXG | S_IRWXO);
    if (argc > 1) {
        signal(SIGALRM, [](int sig) {});
        std::thread([](pthread_t tid, unsigned int seconds) {
            sleep(seconds);
            pthread_kill(tid, SIGALRM);
        }, pthread_self(), 3).detach();
        int ret = file_rwlock(fd, F_SETLKW, F_WRLCK);
        if (ret == -1) std::cout << "FAIL to acquire lock after waiting 3s!" << std::endl;
    } else {
        file_rwlock(fd, F_SETLKW, F_WRLCK);
        while (1);
    }
    return 0;
}
Running ./main followed by ./main a, I expect the first process to hold the lock forever, and the second process to try to acquire the lock and be interrupted after 3 seconds. Instead, the second process just terminates.
Could anyone tell me what's wrong with my code?
> So what kind of signal do I need to use to interrupt the lock call?
The most obvious choice of signal would be SIGUSR1 or SIGUSR2. These are provided to serve user-defined purposes.
There is also SIGALRM, which would be natural if you're using a timer that produces such a signal to do your timekeeping, and which makes some sense even to generate programmatically, as long as you are not using it for other purposes.
> And since there are multiple threads running in my process, I only want to interrupt the I/O thread that is blocking on the file lock; other threads should not be affected. But signals are process-level, so I'm not sure how to handle this situation.
You can deliver a signal to a chosen thread in a multithreaded process via the pthread_kill() function. This also stands up well to the case where more than one thread is waiting on a lock at the same time.
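To illustrate directed delivery, here is a small sketch. All names in it (demo_handler, demo_worker, demo_directed_signal) are made up for this example, and a worker that naps in nanosleep() stands in for a thread blocked in fcntl(). It only shows that a signal sent with pthread_kill() is handled in the targeted thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static volatile sig_atomic_t demo_got_signal;
static pthread_t demo_receiver;   /* thread in which the handler ran */

/* Hypothetical handler: records which thread it was run in. */
static void demo_handler(int signo) {
    (void) signo;
    demo_receiver = pthread_self();
    demo_got_signal = 1;
}

/* Worker: naps in short intervals until the handler has run. */
static void *demo_worker(void *arg) {
    (void) arg;
    const struct timespec nap = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
    while (!demo_got_signal) {
        nanosleep(&nap, NULL);   /* may return early with EINTR */
    }
    return NULL;
}

/*
 * Returns 1 if the signal was handled in the targeted thread,
 * 0 if it was handled elsewhere, -1 on a setup failure.
 */
int demo_directed_signal(int signo) {
    struct sigaction sa;
    pthread_t worker;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = demo_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) != 0) return -1;

    demo_got_signal = 0;
    if (pthread_create(&worker, NULL, demo_worker, NULL) != 0) return -1;

    /* Direct the signal at the worker only; other threads are unaffected. */
    if (pthread_kill(worker, signo) != 0) {
        pthread_cancel(worker);  /* nanosleep() is a cancellation point */
        pthread_join(worker, NULL);
        return -1;
    }
    pthread_join(worker, NULL);
    return pthread_equal(demo_receiver, worker) ? 1 : 0;
}
```

Calling demo_directed_signal(SIGUSR2) from the main thread should report that the handler ran in the worker, provided the signal is not blocked in the calling thread's mask (the worker inherits that mask). Depending on the toolchain, you may need to build with -pthread.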
With regular kill(), you also have the alternative of making all threads block the chosen signal (with pthread_sigmask() in a multithreaded process, where POSIX leaves the behavior of sigprocmask() unspecified), and then having the thread making the lock attempt unblock it immediately prior. When the chosen signal is delivered to the process, a thread that is not presently blocking it will receive it, if any such thread is available.
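A minimal sketch of that masking approach follows. The helper names are invented for illustration. The idea would be to call lock_signal_block() in the main thread before any other threads are created (new threads inherit the mask), and to call lock_signal_unblock() in the locking thread just before the fcntl() call.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>

/* Block signo in the calling thread. Returns 0 or an error number. */
int lock_signal_block(int signo) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return pthread_sigmask(SIG_BLOCK, &set, NULL);
}

/* Unblock signo in the calling thread only. Returns 0 or an error number. */
int lock_signal_unblock(int signo) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return pthread_sigmask(SIG_UNBLOCK, &set, NULL);
}

/* 1 if signo is blocked in the calling thread, 0 if not, -1 on error. */
int lock_signal_is_blocked(int signo) {
    sigset_t current;
    /* With a NULL new set, pthread_sigmask() only reports the current mask. */
    if (pthread_sigmask(SIG_SETMASK, NULL, &current) != 0) return -1;
    return sigismember(&current, signo);
}
```

The locking thread would normally block the signal again once fcntl() returns, so that a later process-directed signal cannot land in it while it is doing unrelated work.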
The example below supposes that a signal handler has already been set up to handle the chosen signal (it doesn't need to do anything), and that the signal number to use is available via the symbol LOCK_TIMER_SIGNAL. It provides the wanted timeout behavior as a wrapper function around fcntl(), with command F_SETLKW as described in the question.
#define _POSIX_C_SOURCE 200809L
#define _GNU_SOURCE

#include <errno.h>
#include <unistd.h>
#include <signal.h>
#include <time.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/syscall.h>

#if (__GLIBC__ < 2) || (__GLIBC__ == 2 && __GLIBC_MINOR__ < 30)
// glibc prior to 2.30 does not provide a wrapper
// function for this syscall:
static pid_t gettid(void) {
    return syscall(SYS_gettid);
}
#endif

/**
 * Attempt to acquire an fcntl() lock, with timeout
 *
 * fd: an open file descriptor identifying the file to lock
 * lock_info: a pointer to a struct flock describing the wanted lock operation
 * to_secs: a time_t representing the amount of time to wait before timing out
 *
 * Returns the result of the fcntl() call, with errno preserved.
 */
int try_lock(int fd, struct flock *lock_info, time_t to_secs) {
    int result;
    int lock_result;
    int lock_errno;
    timer_t timer;

    result = timer_create(CLOCK_MONOTONIC,
            & (struct sigevent) {
                .sigev_notify = SIGEV_THREAD_ID,
                ._sigev_un = { ._tid = gettid() },
                // note: gettid() conceivably can fail
                .sigev_signo = LOCK_TIMER_SIGNAL },
            &timer);
    // detect and handle errors ...

    result = timer_settime(timer, 0,
            & (struct itimerspec) { .it_value = { .tv_sec = to_secs } },
            NULL);

    lock_result = fcntl(fd, F_SETLKW, lock_info);
    lock_errno = errno;
    // detect and handle errors (other than EINTR) ...
    // on EINTR, may want to check that the timer in fact expired

    result = timer_delete(timer);
    // detect and handle errors ...

    // report the outcome of the lock attempt, not of timer_delete()
    errno = lock_errno;
    return lock_result;
}
That works as expected for me.
- [...] try_lock function itself to modify the disposition of its chosen signal.
- The timer_* interfaces provide POSIX interval timers, but the provision for designating a specific thread to receive signals from such a timer is Linux-specific.
- You may need to link with -lrt for the timer_* functions.
- glibc's struct sigevent does not conform to its own docs (at least in relatively old version 2.17). The docs claim that struct sigevent has a member sigev_notify_thread_id, but in fact it does not. Instead, it has an undocumented union containing a corresponding member, and it provides a macro to patch up the difference, but that macro does not work as a member designator in a designated initializer.
- fcntl locks operate on a per-process basis. Thus, different threads of the same process cannot exclude each other via this kind of lock. Moreover, different threads of the same process can modify fcntl() locks obtained via other threads without any special effort or any notification to either thread.
- fcntl() will fail with errno set to EINTR if interrupted by any caught signal that does not terminate the thread. You might, therefore, want to use a signal handler that sets an affirmative per-thread flag by which you can verify that the actual timer signal was received, so as to retry the lock if it was interrupted by a different signal.
- [...] EINTR.
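The per-thread flag idea from the notes could look roughly like the sketch below. It rests on several assumptions: LOCK_TIMER_SIGNAL is defined as SIGUSR1 purely for illustration, the names lock_timer_fired, install_lock_timer_handler and lock_retrying are invented, and the flag is a C11 _Thread_local variable, which works in practice for variables defined in the main executable but is not formally guaranteed to be async-signal-safe. The handler is installed with sigaction() and without SA_RESTART, because on Linux a handler installed with SA_RESTART causes an interrupted F_SETLKW to be restarted instead of failing with EINTR (see signal(7)).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>

#define LOCK_TIMER_SIGNAL SIGUSR1   /* assumption: pick any free signal */

/* Per-thread flag: set only in the thread the timer signal was delivered to. */
static _Thread_local volatile sig_atomic_t lock_timer_fired;

static void lock_timer_handler(int signo) {
    (void) signo;
    lock_timer_fired = 1;
}

/* Install the handler once, process-wide. Returns 0 on success, -1 on error. */
int install_lock_timer_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = lock_timer_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                /* deliberately no SA_RESTART */
    return sigaction(LOCK_TIMER_SIGNAL, &sa, NULL);
}

/*
 * Wait for the lock, retrying after unrelated signals.
 * The caller clears lock_timer_fired and arms the timer beforehand.
 * Returns 0 on success; -1 with errno == ETIMEDOUT if the timer
 * signal arrived, or -1 with fcntl()'s errno on any other failure.
 */
int lock_retrying(int fd, struct flock *lock_info) {
    while (!lock_timer_fired) {
        if (fcntl(fd, F_SETLKW, lock_info) == 0) return 0;
        if (errno != EINTR) return -1;
        /* EINTR: the loop condition decides between timeout and retry */
    }
    errno = ETIMEDOUT;
    return -1;
}
```

One limitation of this sketch: if the timer signal arrives after the flag is checked but before fcntl() actually blocks, the wait is not interrupted. A periodic timer (a nonzero it_interval) is one way to narrow that window.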