How can I print from inside a loop that runs in parallel with OpenMP? I am hoping to avoid a critical section or similar, which (I hear) can really slow down execution.
Additional complication: it seems that on the cluster I ran my code on, I can only print from the master thread (id=0). So I tried the following code. It mostly works, but when a worker thread writes to a stringstream
while the master thread is reading it, the behavior can be abnormal.
ADDED: 1. The order of output doesn't matter. 2. Since my job will be terminated after 48h, possibly while still unfinished, I can't wait until the end of the loop computation to print.
#include <iostream>
#include <vector>
#include <omp.h>
#include <sstream>
using namespace std;
int main() {
    int nthreads=10;
    omp_set_num_threads(nthreads);
    vector<stringstream> ss(nthreads);
    #pragma omp parallel for schedule(static,1)
    for(unsigned int i= 1; i<=100000;i++){
        int id=omp_get_thread_num();
        if(id==0){
            for(int idi=0;idi<nthreads;idi++)
                if(ss[idi].tellp()>ss[idi].tellg()) cout<<ss[idi].rdbuf();
        }
        // DO WORK,
        if(some condition) ss[id]<< some outcome...
        // MORE WORK
        ss[id]<< more outcome
    }
    return 0;
}
The simplest standard solution requires C++20. The first option is std::osyncstream from the <syncstream> header:
std::osyncstream(std::cout) << my_line << "\n";
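If you can use C++20, then <format> is available too:
std::puts(std::format("{:s}", my_line).c_str());
C I/O functions like std::puts acquire an internal lock on every invocation, so a line written with a single call is not interleaved with other output. Note that std::puts appends the trailing newline itself.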
If C++23 is available too, you can use <print>:
std::println(stdout, "{:s}", my_line);
The latest standard requires that std::print and std::println also acquire a lock on the output stream before printing. But since this is relatively new, whether the guarantee holds in practice is subject to quality of implementation.
As a last resort for optimization, you could define a spin lock using std::atomic and lock and unlock it around every call. It can be faster than a mutex when the lock is held only briefly, but I won't illustrate it here.