I am using perf_event_open in my C profiling app to get event data from perf. To improve performance, I follow the Perf Userspace PMU Hardware Counter Access documentation and read the PMU register directly with the mrs instruction.
I use the following code:
static struct perf_event_attr attr;
attr.type = PERF_TYPE_HARDWARE;
attr.config = PERF_COUNT_HW_CPU_CYCLES;
attr.exclude_kernel = 1;
attr.exclude_hv = 1;
attr.config1 = 3; // user access enabled
int fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
ioctl(fd, PERF_EVENT_IOC_RESET, 0);
ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
// Code where we want to measure performance. At certain points we call read_register_directly()
ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
close(fd);

uint64_t read_register_directly(void) {
    uint64_t value = 0;
    asm volatile("mrs %0, PMCCNTR_EL0" : "=r" (value));
    return value;
}
Reading the register directly works with this perf configuration at first, but after ~25 reads of the register I get an "illegal instruction" error, and I'm not sure why.
I looked through the ARM docs for PMCCNTR_EL0 and some other resources but I haven't found anything which explains this illegal instruction error.
This was ultimately due to the attributes passed to the perf_event_open system call. The thread that made the system call could read the register directly, but any other thread that tried got an "Illegal Instruction" error.
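To see which threads are affected, a small probe can help. The following is only a sketch of one way to check this, not something from the original code: can_read_pmccntr() and on_sigill() are hypothetical names, and the approach (temporarily catching SIGILL around a single mrs) is my assumption about a reasonable diagnostic. On non-AArch64 builds it simply returns 0.

```c
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
/* One jump buffer per thread: SIGILL is delivered to the faulting thread. */
static _Thread_local sigjmp_buf probe_env;

static void on_sigill(int sig) {
    (void)sig;
    siglongjmp(probe_env, 1);   /* jump back out of the faulting mrs */
}
#endif

/* Hypothetical helper: returns 1 if the calling thread can execute
 * "mrs PMCCNTR_EL0" without faulting, 0 otherwise. */
int can_read_pmccntr(void) {
#if defined(__aarch64__)
    struct sigaction sa, old;
    volatile int ok = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return 0;

    if (sigsetjmp(probe_env, 1) == 0) {
        uint64_t value;
        __asm__ volatile("mrs %0, PMCCNTR_EL0" : "=r"(value));
        (void)value;
        ok = 1;                 /* reached only if the read did not fault */
    }

    sigaction(SIGILL, &old, NULL);  /* restore the previous handler */
    return ok;
#else
    return 0;                   /* assumption: no user PMU read on other targets */
#endif
}
```

Calling can_read_pmccntr() at the start of each thread (one thread at a time, since the signal handler is process-wide) should show whether only the thread that called perf_event_open is allowed to read the register.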
perf_event_attr has an inherit flag, which lets the user profile all threads in a process instead of just the thread that called perf_event_open (per the perf_event_open man page, it applies only to child tasks created after the event is opened). So I made two changes to the code above:
attr.inherit = 1;
asm volatile("isb;");
The isb is issued after reading the PMU register.
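Put together, the setup and the read could look roughly like the sketch below. It reuses the attribute values from the question and adds the two changes. The helper names make_cycles_attr() and open_cycles_counter() are hypothetical, and the memset, the attr.size assignment, and the non-AArch64 fallback are my additions rather than part of the original code.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: attributes from the question plus inherit = 1. */
struct perf_event_attr make_cycles_attr(void) {
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);          /* addition: tell the kernel the struct size */
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    attr.config1 = 3;                  /* user access enabled, as in the question */
    attr.inherit = 1;                  /* threads created later inherit the event */
    return attr;
}

/* Hypothetical wrapper: same arguments as the syscall in the question. */
int open_cycles_counter(struct perf_event_attr *attr) {
    return (int)syscall(SYS_perf_event_open, attr, 0, -1, -1, 0);
}

uint64_t read_register_directly(void) {
#if defined(__aarch64__)
    uint64_t value = 0;
    __asm__ volatile("mrs %0, PMCCNTR_EL0" : "=r"(value));
    __asm__ volatile("isb");           /* barrier after the register read */
    return value;
#else
    return 0;                          /* assumption: no direct read on other targets */
#endif
}
```

Because inherit only covers tasks created after the event exists, the counter would need to be opened and enabled before the worker threads are spawned.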