I have an application that runs on a large number of processors. On processor 0, I have a function that writes data to a socket if it is open. This function runs in a loop in a separate thread on processor 0, i.e. processor 0 is responsible for its own workload and has an extra thread running the communication on the socket.
//This function runs on a loop, called every 1.5 seconds
void T_main_loop(const int& client_socket_id, bool* exit_flag)
{
    //Check that the socket is still connected.
    int error_code;
    socklen_t error_code_size = sizeof(error_code);
    getsockopt(client_socket_id, SOL_SOCKET, SO_ERROR, &error_code, &error_code_size);
    if (error_code == 0)
    {
        //send some data
        int valsend = send(client_socket_id , data , size_of_data , 0);
    }
    else
    {
        *(exit_flag) = false; //This is used for some external logic.
        //Can I fix the broken pipe here somehow?
    }
}
When the client socket is closed, the program should just ignore the error, and this is standard behavior as far as I am aware.
However, I am using an external library (PETSc) that is somehow detecting the broken pipe error and closing the entire parallel (MPI) environment:
[0]PETSC ERROR: Caught signal number 13 Broken Pipe: Likely while reading or writing to a socket
I would like to leave the configuration of this library completely untouched if at all possible. Open to any robust workarounds that are possible.
By default, the OS sends the thread SIGPIPE if it tries to write into a (half) closed pipe or socket.
One option to disable the signal is to call signal(SIGPIPE, SIG_IGN).
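A minimal sketch of the signal(SIGPIPE, SIG_IGN) approach, assuming a POSIX system. The helper names ignore_sigpipe and errno_after_write_to_closed_peer are made up for illustration, and the socketpair only stands in for a real client connection. Once the signal is ignored, a write to a closed peer fails with errno set to EPIPE instead of raising the signal. Note that the disposition is process-wide, and any code that installs its own SIGPIPE handler afterwards replaces it, so the call has to happen after such handlers are set up.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: ignore SIGPIPE for the whole process.
// Returns true if the disposition was changed successfully.
bool ignore_sigpipe()
{
    return std::signal(SIGPIPE, SIG_IGN) != SIG_ERR;
}

// Hypothetical demo: write to a socket whose peer has been closed and
// return the resulting errno (0 if the write unexpectedly succeeded,
// -1 if the socket pair could not be created).
int errno_after_write_to_closed_peer()
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;
    assert(fds[0] >= 0 && fds[1] >= 0);

    close(fds[1]);                      // simulate the client going away

    const char msg[] = "ping";
    errno = 0;
    ssize_t n = send(fds[0], msg, sizeof(msg), 0);
    int err = (n < 0) ? errno : 0;      // capture errno before close() can change it

    close(fds[0]);
    return err;
}
```

Calling ignore_sigpipe() once before the communication thread starts is enough; the send call itself stays as it was and only its return value needs checking.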
Another option is to use the MSG_NOSIGNAL flag for send, e.g. send(..., MSG_NOSIGNAL).
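The MSG_NOSIGNAL option could look roughly like the sketch below, assuming a platform that provides the flag (Linux does). The SendResult enum and the send_no_sigpipe function are hypothetical names, not part of any library. This variant leaves signal dispositions alone, so it does not interact with handlers installed elsewhere in the process; the closed connection shows up only as an EPIPE (or ECONNRESET) error from send.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical result type for the sketch below.
enum class SendResult { Ok, PeerClosed, OtherError };

// Hypothetical helper: send the whole buffer without ever raising SIGPIPE.
// A closed peer is reported through the return value instead.
SendResult send_no_sigpipe(int fd, const void* data, std::size_t size)
{
    assert(fd >= 0 && data != nullptr);

    const char* p = static_cast<const char*>(data);
    std::size_t sent = 0;
    while (sent < size) {
        // MSG_NOSIGNAL suppresses SIGPIPE for this call only.
        ssize_t n = send(fd, p + sent, size - sent, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;               // interrupted, retry
            if (errno == EPIPE || errno == ECONNRESET)
                return SendResult::PeerClosed;
            return SendResult::OtherError;
        }
        sent += static_cast<std::size_t>(n);
    }
    return SendResult::Ok;
}
```

In a loop like the one in the question, a PeerClosed result would be the place to update the exit flag rather than relying on the getsockopt check alone.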