How can OpenMP know how many loop instances are nested?
Is it explicitly counted by the compiler?
No, the compiler does not count it. The OpenMP runtime keeps track of this information in thread-local variables.
libgomp, probably one of the most popular OpenMP implementations out there, is open source, which means you can read not just its documentation but also its source code, entirely for free.
The implementation of omp_get_level() is here:
int
omp_get_level (void)
{
  return gomp_thread ()->ts.level;
}
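The implementation of gomp_thread() is here. It retrieves a pointer to a thread-local structure. Depending on the target, that structure is a slot in a shared array indexed by thread ID (NVPTX), a __thread variable, or a value stored under a pthread key.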
#if defined __nvptx__
extern struct gomp_thread *nvptx_thrs __attribute__((shared));
static inline struct gomp_thread *gomp_thread (void)
{
  int tid;
  asm ("mov.u32 %0, %%tid.y;" : "=r" (tid));
  return nvptx_thrs + tid;
}
#elif defined HAVE_TLS || defined USE_EMUTLS
extern __thread struct gomp_thread gomp_tls_data;
static inline struct gomp_thread *gomp_thread (void)
{
  return &gomp_tls_data;
}
#else
extern pthread_key_t gomp_tls_key;
static inline struct gomp_thread *gomp_thread (void)
{
  return pthread_getspecific (gomp_tls_key);
}
#endif
The data structure ts is a struct gomp_team_state that, among other things, contains:
[...]
/* Nesting level. */
unsigned level;
/* Active nesting level. Only active parallel regions are counted. */
unsigned active_level;
[...]
Whenever #pragma omp parallel is used, the compiler extracts the body of the parallel region into a subfunction and generates a series of runtime calls that eventually lead to gomp_team_start(), which contains:
#ifdef LIBGOMP_USE_PTHREADS
void
gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
                 unsigned flags, struct gomp_team *team)
{
  [...]
  ++thr->ts.level;
  if (nthreads > 1)
    ++thr->ts.active_level;
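To make the bookkeeping concrete, here is a minimal sketch in plain C that mimics the idea without using OpenMP at all. It is not libgomp's code: the names team_state, current_state, region_enter, region_leave, get_level and get_active_level are hypothetical, and restoring a saved copy of the state on exit is a simplifying assumption of this sketch. It also models only the thread that encounters the region, not the worker threads of the new team.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the per-thread team state. */
struct team_state {
    unsigned level;         /* counts every enclosing parallel region */
    unsigned active_level;  /* counts only regions run by more than one thread */
};

/* One private copy per thread (C11 thread-local storage). */
static _Thread_local struct team_state current_state;

/* Entry to a parallel region: bump the counters and hand back the
   previous state so the caller can restore it on exit. */
struct team_state region_enter(unsigned nthreads)
{
    struct team_state saved = current_state;
    ++current_state.level;
    if (nthreads > 1)
        ++current_state.active_level;
    return saved;
}

/* Exit from a parallel region: restore the state saved at entry. */
void region_leave(struct team_state saved)
{
    /* Regions must be left in the reverse order they were entered. */
    assert(current_state.level == saved.level + 1);
    current_state = saved;
}

/* Counterparts of omp_get_level() and omp_get_active_level() in this sketch:
   they only read the calling thread's own state. */
unsigned get_level(void)        { return current_state.level; }
unsigned get_active_level(void) { return current_state.active_level; }
```

Entering a region with four threads and then a nested one with a single thread would leave get_level() at 2 and get_active_level() at 1, which is the distinction the two fields in gomp_team_state capture. Nothing here needs compiler support beyond emitting the enter and leave calls, so the compiler never has to count nesting itself.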