c++ pthreads mutex fseek fputs

Do we need a mutex for multithreaded file I/O?


I'm trying to benchmark random writes to a file using multiple threads (pthreads). If I comment out the mutex lock, the resulting file is smaller than expected, as if some writes are being lost (the shortfall is always a multiple of the chunk size). With the mutex in place, the file is always exactly the expected size.

Does my code have a problem somewhere else, so that the mutex is not really required (as suggested by @evan), or is the mutex necessary here?

void *DiskWorker(void *threadarg) {

    FILE *theFile = fopen(fileToWrite, "a+");
    ....
    for (long i = 0; i < noOfWrites; ++i) {
        //pthread_mutex_lock (&mutexsum);

        // For random access
        fseek(theFile, randomArray[i] * chunkSize, SEEK_SET);
        fputs(data, theFile);

        // Or, for sequential access (in which case the two lines above would be absent)
        fprintf(theFile, "%s", data);
        // Sequential access end

        fflush(theFile);
        //pthread_mutex_unlock(&mutexsum);
    }
    .....
}

Solution

  • You definitely need a mutex, because each write consists of several separate file calls (fseek, fputs, fflush). The underlying file subsystem has no way of knowing how many calls make up one complete operation.

    So you need the mutex.

    In your situation you may get better performance by taking the mutex outside the loop, so that each thread holds it for its whole run of writes. Otherwise, switching between threads may cause excessive seeking between different parts of the disk. A hard disk takes about 10 ms to move its read/write head, so that could slow things down a lot.

    So it might be a good idea to benchmark that.
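The two lock placements discussed above (a lock around each seek-and-write, or one lock around the whole loop) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the asker's actual benchmark: `run_locked_writes`, `WorkerArgs`, `CHUNK_SIZE`, `WRITES_PER_THREAD`, `MAX_THREADS` and the interleaved offset scheme are all hypothetical names and choices made for the example. The sketch opens the file with "r+" rather than "a+", because in append mode every write goes to the current end of file regardless of any preceding fseek.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 16         /* hypothetical chunk size in bytes */
#define WRITES_PER_THREAD 64  /* hypothetical number of writes per thread */
#define MAX_THREADS 16        /* hypothetical upper bound for this sketch */

static pthread_mutex_t file_mutex = PTHREAD_MUTEX_INITIALIZER;

typedef struct {
    const char *path;    /* file to write into */
    int thread_index;    /* selects which chunks this thread writes */
    int thread_count;
    int lock_per_write;  /* 1: lock around each seek+write, 0: lock around the loop */
    int ok;              /* set to 0 if any file call fails */
} WorkerArgs;

static void *disk_worker(void *arg) {
    WorkerArgs *args = (WorkerArgs *)arg;
    char data[CHUNK_SIZE + 1];
    memset(data, 'a' + args->thread_index % 26, CHUNK_SIZE);
    data[CHUNK_SIZE] = '\0';

    /* "r+" so that fseek is honored for writes (unlike "a+"). */
    FILE *f = fopen(args->path, "r+");
    if (!f) { args->ok = 0; return NULL; }
    args->ok = 1;

    if (!args->lock_per_write) pthread_mutex_lock(&file_mutex);
    for (long i = 0; i < WRITES_PER_THREAD; ++i) {
        /* Interleaved chunks: each chunk index is written by exactly one thread. */
        long chunk = i * args->thread_count + args->thread_index;

        if (args->lock_per_write) pthread_mutex_lock(&file_mutex);
        /* Seek, write and flush are treated as one unit under the lock. */
        if (fseek(f, chunk * CHUNK_SIZE, SEEK_SET) != 0 ||
            fputs(data, f) == EOF ||
            fflush(f) != 0) {
            args->ok = 0;
        }
        if (args->lock_per_write) pthread_mutex_unlock(&file_mutex);
    }
    if (!args->lock_per_write) pthread_mutex_unlock(&file_mutex);

    fclose(f);
    return NULL;
}

/* Runs the workers and returns the final file size in bytes, or -1 on error. */
long run_locked_writes(const char *path, int thread_count, int lock_per_write) {
    if (thread_count < 1 || thread_count > MAX_THREADS) return -1;

    FILE *f = fopen(path, "w");  /* create or truncate the file */
    if (!f) return -1;
    fclose(f);

    pthread_t threads[MAX_THREADS];
    WorkerArgs args[MAX_THREADS];
    int started = 0;
    for (int t = 0; t < thread_count; ++t) {
        args[t].path = path;
        args[t].thread_index = t;
        args[t].thread_count = thread_count;
        args[t].lock_per_write = lock_per_write;
        args[t].ok = 0;
        if (pthread_create(&threads[t], NULL, disk_worker, &args[t]) != 0) break;
        ++started;
    }

    int ok = (started == thread_count);
    for (int t = 0; t < started; ++t) {
        pthread_join(threads[t], NULL);
        if (!args[t].ok) ok = 0;
    }
    if (!ok) return -1;

    f = fopen(path, "rb");
    if (!f) return -1;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fclose(f);
    return size;
}
```

Calling `run_locked_writes(path, 4, 1)` locks around every write, while `run_locked_writes(path, 4, 0)` makes each thread hold the mutex for its entire loop. Timing the two calls on the target disk (for example with `clock_gettime`) is one way to run the benchmark suggested above.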