How do I correctly handle SIGBUS so I can keep searching for an address?

I'm currently working on a project running on a heavily modified version of Linux, patched to access a VMEbus. Most of the bus handling is done: I have a VMEAccess class that uses mmap to write at a specific address of /dev/mem so a driver can pull that data and push it onto the bus.

When the program starts, it has no idea where the slave board it's looking for is located on the bus, so it must find it by poking around. It tries to read every address one by one. If a device is connected at that address, the read method returns some data; if nothing is connected, a SIGBUS signal is sent to the program.

I tried several solutions (mostly using signal handling) but after some time, I decided on using jumps. The first longjmp() call works fine but the second call to VMEAccess::readWord() gives me a Bus Error even though my handler should prevent the program from crashing.

Here's my code:

#include <iostream>
#include <string>
#include <sstream>
#include <csignal>
#include <cstdlib>
#include <cstdio>    // printf
#include <csetjmp>
#include <unistd.h>  // sleep

#include "types.h"
#include "VME_access.h"

VMEAccess *busVME;

int main(int argc, char const *argv[]);
void catch_sigbus (int sig);
void exit_function(int sig);

volatile BOOL bus_error;
volatile UDWORD offset;
jmp_buf env;

int main(int argc, char const *argv[])
{
    struct sigaction sigIntHandler;

    sigIntHandler.sa_handler = exit_function;
    sigemptyset(&sigIntHandler.sa_mask);
    sigIntHandler.sa_flags = 0;

    sigaction(SIGINT, &sigIntHandler, NULL);

    /* Install the SIGBUS handler */
    struct sigaction sigBusHandler;

    sigBusHandler.sa_handler = catch_sigbus;
    sigemptyset(&sigBusHandler.sa_mask);
    sigBusHandler.sa_flags = 0;

    sigaction(SIGBUS, &sigBusHandler, NULL);

    busVME = new VMEAccess(VME_SHORT);

    offset = 0x01FE;

    setjmp(env);
    printf("%d\n", sigismember(&sigBusHandler.sa_mask, SIGBUS));

    busVME->readWord(offset);
    sleep(1);

    printf("%#08x\n", offset+0xC1000000);

    return 0;
}

void catch_sigbus (int sig)
{
    offset++;
    printf("%#08x\n", offset);
    longjmp(env, 1);
}

void exit_function(int sig) 
{
    delete busVME;
    exit(0);
}

Solution

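As mentioned in the comments, using longjmp in a signal handler is a bad idea. After jumping out of a signal handler, your program is effectively still in the signal handler: by default on Linux, plain setjmp/longjmp do not save and restore the signal mask, so SIGBUS stays blocked, which would explain why the second bus error kills the process instead of reaching your handler. On top of that, calling non-async-signal-safe functions (printf, for example) after such a jump leads to undefined behavior. Using sigsetjmp/siglongjmp would restore the mask, but it won't really help with the undefined behavior; quoting man signal-safety: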

If a signal handler interrupts the execution of an unsafe function, and the handler terminates via a call to longjmp(3) or siglongjmp(3) and the program subsequently calls an unsafe function, then the behavior of the program is undefined.

As an example, siglongjmp did cause problems in libcurl code in the past; see "error: longjmp causes uninitialized stack frame".
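To make the "still in the signal handler" point concrete, here is a minimal sketch that checks whether SIGBUS is still blocked after jumping out of a handler. The names jump_out and sigbus_blocked_after_jump are made up for this illustration, and the signal is generated with raise() rather than by real bus hardware. It only demonstrates the signal-mask effect; it does not make jumping out of a handler safe in general.

```cpp
#include <cassert>
#include <setjmp.h>  // sigsetjmp, siglongjmp (POSIX)
#include <signal.h>  // sigaction, sigprocmask, raise (POSIX)

static sigjmp_buf jump_env;

// Hypothetical handler: leaves the handler by jumping, like the question does.
extern "C" void jump_out(int) {
    siglongjmp(jump_env, 1);
}

// Returns true if SIGBUS is still blocked after jumping out of the handler.
// save_mask == false behaves like plain setjmp/longjmp on Linux/glibc;
// save_mask == true makes sigsetjmp save the mask and siglongjmp restore it.
bool sigbus_blocked_after_jump(bool save_mask) {
    struct sigaction sa {};
    sa.sa_handler = jump_out;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  // as in the question: SIGBUS is blocked while the handler runs
    struct sigaction old {};
    int rc = sigaction(SIGBUS, &sa, &old);
    assert(rc == 0);
    (void)rc;

    if (sigsetjmp(jump_env, save_mask ? 1 : 0) == 0) {
        raise(SIGBUS);  // simulated bus error; the handler jumps back here
    }

    // Query the current signal mask without changing it.
    sigset_t current;
    sigemptyset(&current);
    sigprocmask(SIG_SETMASK, nullptr, &current);
    bool blocked = sigismember(&current, SIGBUS) == 1;

    // Clean up: unblock SIGBUS and restore the previous handler.
    sigset_t bus;
    sigemptyset(&bus);
    sigaddset(&bus, SIGBUS);
    sigprocmask(SIG_UNBLOCK, &bus, nullptr);
    sigaction(SIGBUS, &old, nullptr);
    return blocked;
}
```

Calling sigbus_blocked_after_jump(false) reports that SIGBUS is left blocked, while sigbus_blocked_after_jump(true) reports that the mask was restored. That covers the mask only; the signal-safety rule quoted above still applies either way.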

I'd suggest using a regular loop instead and modifying the exit condition in the signal handler (you modify the offset there anyway). Something like the following (pseudo-code):

volatile sig_atomic_t had_sigbus = 0;  // written from the signal handler

int main(int argc, char const *argv[])
{
    ...
    for (offset = 0x01FE; offset is sane; ++offset) {
        had_sigbus = 0;
        probe(offset);
        if (!had_sigbus) {
            // found
            break;
        }
    }
    ...
}

void catch_sigbus(int)
{
    had_sigbus = 1;
}

This way it's immediately obvious that there is a loop, and the whole logic is much easier to follow. And there are no jumps, so it should work for more than one probe :) But obviously probe() must also handle the failed call (the one interrupted by SIGBUS) internally, and probably return an error. If it does return an error, the had_sigbus flag might not be necessary at all.
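The loop above can be fleshed out into something compilable. This is only a sketch under loud assumptions: probe() here is a hypothetical stand-in for VMEAccess::readWord() that simulates the bus by calling raise(SIGBUS) for every offset except a made-up kSimulatedBoardOffset, and find_board is a name invented for this example. With a fault raised by the CPU on the load itself, returning from the handler typically restarts the faulting instruction, which is why the real probe() has to deal with the failed access itself.

```cpp
#include <cassert>
#include <cstdint>
#include <signal.h>  // sigaction, raise, sig_atomic_t (POSIX)

// Flag set by the handler; sig_atomic_t is safe to write from a handler.
static volatile sig_atomic_t had_sigbus = 0;

extern "C" void catch_sigbus(int) {
    had_sigbus = 1;  // only async-signal-safe work: set a flag and return
}

// ASSUMPTION: simulated bus with a single board at this made-up offset.
static const std::uint32_t kSimulatedBoardOffset = 0x0203;

// Hypothetical stand-in for VMEAccess::readWord(): "faults" on empty slots.
void probe(std::uint32_t offset) {
    if (offset != kSimulatedBoardOffset) {
        raise(SIGBUS);  // handler runs and returns before raise() returns
    }
}

// Probes [first, last] and returns the first responding offset, or -1.
std::int64_t find_board(std::uint32_t first, std::uint32_t last) {
    assert(first <= last);

    struct sigaction sa {};
    sa.sa_handler = catch_sigbus;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    struct sigaction old {};
    if (sigaction(SIGBUS, &sa, &old) != 0) {
        return -1;
    }

    std::int64_t found = -1;
    for (std::uint32_t offset = first;; ++offset) {
        had_sigbus = 0;
        probe(offset);
        if (!had_sigbus) {
            found = offset;  // no bus error: something answered here
            break;
        }
        if (offset == last) {
            break;  // checked inside the loop to avoid wrap-around
        }
    }

    sigaction(SIGBUS, &old, nullptr);  // restore the previous handler
    return found;
}
```

With the simulated bus, find_board(0x01FE, 0x0210) returns 0x0203, and a range that does not contain the board returns -1. The handler does nothing but set the flag, so no jump and no unsafe call is involved.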