I am learning how to use threads in C and have run into a problem when creating the threads. I am making a program that takes in 2 or more file names as command line arguments, counts the number of bytes in each file in their own thread, and then outputs the name of the largest file. When I use pthread_join() directly after creating a thread, the program runs as intended. However, I know this isn't how threads should be used because it defeats the purpose. When I use pthread_join() in a for loop after creating all the threads, then the program does not work correctly. Could anyone tell me what I am doing wrong? All help is appreciated. Here is my main function.
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; //mutex for changing max_bytes and max_name
int max_bytes = 0;
char max_name[100];
struct arg_struct{ //struct to hold args to pass the threads
int fd;
char name[100];
};
int main(int argc, char* argv[])
{
if(argc < 3){ //checks for correct number of arguments passed
perror("Wrong number of arguments");
return EXIT_FAILURE;
}
int arg_num = argc - 1; //holds number of arguments passed
pthread_t threadid[arg_num]; //array of thread IDs
struct arg_struct args;
for(int i = 0; i < arg_num; i++){
args.fd = open(argv[i+1], O_RDONLY);
memcpy(args.name, argv[i+1], sizeof(args.name)); //copies file name into arg_struct
int thread_err = pthread_create(&threadid[i], NULL, count_bytes, (void*)&args); //create thread by calling count_bytes and passing it a struct of args
//pthread_join(threadid[i], NULL);
if(thread_err != 0){
perror("pthread_create failed");
return EXIT_FAILURE;
}
}
for(int i = 0; i < arg_num; i++){
pthread_join(threadid[i], NULL);
}
printf("%s is the largest of the submitted files\n", max_name);
return 0;
}
This is the function that the threads are running.
void *count_bytes(void* arguments)
{
struct arg_struct *args = (struct arg_struct*)arguments; //casting arguments back to struct from void*
int fd = args -> fd;
char name[100];
memcpy(name, args -> name, sizeof(name)); //copies file name into name from args.name
int bytes = 0;
int size = 10;
char* buffer = (char*) malloc(size);
if(buffer == NULL){
perror("malloc failed");
exit(EXIT_FAILURE);
}
int buffer_count = 0;
for(int i = 0; i < size; i++){
buffer[i] = '\0'; //sets all elements to '\0' to determine end of file later
}
int read_return = read(fd, &buffer[buffer_count], 1);
if(read_return == -1){
perror("reading failed");
exit(EXIT_FAILURE);
}
while(buffer[buffer_count] != '\0'){
bytes++;
buffer_count++;
buffer[buffer_count] = '\0'; //sets previous element to '\0' to determine end of file later
if(buffer_count >= size){
buffer_count = 0; //buffer will hold up to 10 elements and then go back to the beginning
}
read_return = read(fd, &buffer[buffer_count], 1);
if(read_return == -1){
perror("reading failed");
exit(EXIT_FAILURE);
}
}
printf("%s has %d bytes\n", name, bytes);
pthread_mutex_lock(&mutex);
if(bytes > max_bytes){
max_bytes = bytes;
memcpy(max_name, name, sizeof(max_name));
}
//locks mutex to avoid race condition
//then sets bytes to max_bytes if it is later than max_bytes
//then locks mutex to allow another thread to have access
pthread_mutex_unlock(&mutex);
return NULL;
}
If it is of any use, these are the two outputs produced when it is running correctly
./a.out another buffered_readword.c
another has 8 bytes
buffered_readword.c has 3747 bytes
buffered_readword.c is the largest of the submitted files
And not correctly
./a.out another buffered_readword.c
buffered_readword.c has 1867 bytes
buffered_readword.c has 1881 bytes
buffered_readword.c is the largest of the submitted files
The problem is that there is only one args
structure. After pthread_create
is called the new thread may not run immediately. By the time the threads run it is likely that they will both see the same args
values. Calling pthread_join
inside the thread creation loop "fixes" that because it ensures each thread finishes before args
is updated to the next value.
To fix properly pass a different args
to each thread. Illustrative code to do that:
struct arg_struct args[arg_num];
for(int i = 0; i < arg_num; i++){
args[i].fd = open(argv[i+1], O_RDONLY);
memcpy(args[i].name, argv[i+1], sizeof(args[i].name));
int thread_err = pthread_create(&threadid[i], NULL, count_bytes, &args[i]);
....