I am running a simulation of random events happening in accordance with a given set of probabilities, until n events are hit, and repeating for as many iterations as desired. When I run this sequentially, it finishes a small test set of 24 iterations in 5 seconds. When I set a higher number of threads (16), the time involved becomes 55 seconds. I made these comparisons using time bash -c "..."
.
I am sorry for asking another one of these questions, but I have been searching for a solution for several hours now. I think the problem has to be some sort of race condition, but I just cannot seem to figure out on what. Variables target, n_probabilities and probabilities should all be private to the thread, and the majority of the time is spent between the print statements 'Started thread %d' and 'Finished thread %d'. This suggests to me that mutation_iterations[i] = simulate_iteration(target, n_probabilities, probabilities);
then has to be the offending line, and mutation_iterations is a shared variable, but it should only be accessed a single time per thread so I don't see how it could cause this much of a speed drop.
However, looking at gprof output I don't really see an unusual amount of time spent on the function in question (although the 6.15 for main seems weird to me, as it is not the actual runtime):
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
98.86 6.08 6.08 24 253.33 253.33 simulate_iteration
0.98 6.14 0.06 _init
0.16 6.15 0.01 main
Admittedly this is my first time reading gprof, so I may be reading this entirely wrong. But as I see it, my program seems to be running for 49 seconds longer than gprof says it should, and I cannot figure out where this comes from.
The offending code:
int **mutation_iterations = malloc(iterations * sizeof(int *));
srand(0); // Set random number seed
#pragma omp parallel shared(mutation_iterations)
{
int target = desired_mutations;
int n_probabilities = simulation_size;
double probabilities[simulation_size];
for (int i = 0; i < simulation_size; i++) {
probabilities[i] = simulation_set[i].probability;
}
#pragma omp for
for (int i = 0; i < iterations; i++) {
mutation_iterations[i] = NULL;
printf("Started thread %d\n", i);
mutation_iterations[i] = simulate_iteration(target, n_probabilities, probabilities);
printf("Finished thread %d\n", i);
}
}
The simulate_iteration(...) function:
int * simulate_iteration(int target, int n_probabilities, double probabilities[]) {
int *counts = calloc(n_probabilities, sizeof(int));
int total = 0;
while (total < target) {
for (int i = 0; i < n_probabilities; i++) {
double rand_val = (double) rand() / RAND_MAX;
if (rand_val < probabilities[i]) {
counts[i]++;
total++;
}
}
}
return counts;
}
I must be overlooking something but I just can't see what. Any and all help would be much appreciated.
Turns out I was overlooking a contention in rand(), thanks @Homer512 for pointing it out. I ended up using rand_r and that fixed it.
For anyone trying to do something similar, here is a minimum reproducible sample of the null mutational model I was making with the fix included:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <csv.h>
#include <omp.h>
#include <pthread.h>
int * simulate_iteration(unsigned int c_iteration, int target, int n_probabilities, double probabilities[]) {;
int *counts = calloc(n_probabilities, sizeof(int));
int total = 0;
while (total < target) {
for (int i = 0; i < n_probabilities; i++) {
double rand_val = (double) rand_r(&c_iteration) / RAND_MAX;
if (rand_val < probabilities[i]) {
counts[i]++;
total++;
}
}
}
return counts;
}
int main(int argc, char *argv[]) {
int iterations = 24;
int **mutation_iterations = malloc(iterations * sizeof(int *));
#pragma omp parallel shared(mutation_iterations)
{
unsigned int gen_seed = 0;
int target = 500;
int n_probabilities = 500;
double probabilities[500];
for (int i = 0; i < n_probabilities; i++) {
probabilities[i] = (double) rand_r(&gen_seed) / RAND_MAX / 10;
}
#pragma omp for
for (int i = 0; i < iterations; i++) {
mutation_iterations[i] = NULL;
mutation_iterations[i] = simulate_iteration(i, target, n_probabilities, probabilities);
}
}
// Print the results
printf("gene");
for (int i = 0; i < 500; i++) {
printf("\titer_%d", i);
}
printf("\n");
for (int j = 0; j < 500; j++) {
printf("%d", j);
for (int i = 0; i < iterations; i++) {
printf("\t%d", mutation_iterations[i][j]);
}
printf("\n");
}
for (int i = 0; i < iterations; i++) free(mutation_iterations[i]);
free(mutation_iterations);
return 0;
}