clinuxopenmpmmapsigbus

Valid read from mmap-ed memory produces SIGBUS under load. Why?


I have a program that copies buffers to files, mmap's them back and then checks their contents. Multiple threads can work on the same file. Occasionally, I am getting SIGBUS when reading, but only under load.

The mappings are MAP_PRIVATE and MAP_POPULATE. The crash via SIGBUS occurs after mmap was successful which I do not understand since MAP_POPULATE was used.

Here is a full example (creates files under /tmp/buf_* filled with zeroes), using OpenMP to create more load and concurrent writes:

// Program to check for unexpected SIGBUS
// gcc -std=c99 -fopenmp -g -O3 -o mmap_manymany mmap_manymany.c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define NBUFS 64
const char bufs[NBUFS][65536] = {{0}};
const char zeros[65536] = {0};

int main()
{
  int count = 0;
  while ( 1 )
  {
    void *mappings[ 1000 ] = {NULL};

#pragma omp parallel for
    for ( int i = 0; i < 1000; ++i )
    {
      // Prepare filename
      int bufIdx = i % NBUFS;
      char path[ 128 ] = { 0 };
      sprintf( path, "/tmp/buf_%0d", bufIdx );

      // Write full buffer
      int outFd = -1;
#pragma omp critical
      {
        remove( path );
        outFd = open( path, O_EXCL | O_CREAT | O_WRONLY | O_TRUNC, 0644 );
      }
      assert( outFd != -1 );
      ssize_t size = write( outFd, bufs[bufIdx], 65536 );
      assert( size == 65536 );
      close( outFd );

      // Map it to memory
      int inFd = open( path, O_RDONLY );
      if ( inFd == -1 )
        continue; // Deleted by other thread. Nevermind

      mappings[i] = mmap( NULL, 65536, PROT_READ, MAP_PRIVATE | MAP_POPULATE, inFd, 0 );
      assert( mappings[i] != MAP_FAILED );
      close( inFd );

      // Read data immediately. Creates occasional SIGBUS but only under load.
      int v = memcmp( mappings[i], zeros, 65536 );
      assert( v == 0 );
    }

    // Clean up
    for ( int i = 0; i < 1000; ++i )
      munmap( mappings[ i ], 65536 );
    printf( "count: %d\n", ++count );
  }
}

No assert fires for me, but the program always crashes after a few seconds with SIGBUS.


Solution

  • With your current program, it can happen that thread 0 creates /tmp/buf_0, writes to it and closes it. Then thread 1 removes and creates /tmp/buf_0, but before thread 1 writes to it, thread 0 opens, maps, and reads from /tmp/buf_0 - and thus tries to access a file does not yet contain 64 kiB data. You get a SIGBUS.

    To avoid that issue, just make unique files / and bufs for each thread, by using omp_get_thread_num() instead of bufIdx.