powerpc perf oprofile papi

What is the difference between PM_DATA_ALL* and PM_DATA* events on Power8?


While evaluating the memory performance of a Power8 processor using perf, I ran into the problem of understanding the difference between the PM_DATA_ALL_* and PM_DATA_* events. Most of the counters exist in both versions, but the descriptions in the oprofile documentation and in papi_native_avail are the same for both, for example:

PM_DATA_FROM_LMEM

The processor's data cache was reloaded from the local chip's Memory due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.

I thought I would figure out the difference by measuring some data. If I provide a large enough task, I can observe the expected difference: the *_ALL versions have higher values. I understand the concept of counter multiplexing when measuring with perf.
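The comparison described above can be sketched roughly as follows. This is only an illustration: it assumes the counts were collected as CSV lines with the count in the first field and the event name in the third (the layout I would expect from `perf stat -x,`), the lowercase event names are an assumption, and the numbers are invented for the example rather than measured.

```python
# Hypothetical CSV-style counter lines; the counts are made up for
# illustration and are NOT real Power8 measurements.
SAMPLE = """\
120000,,pm_data_from_lmem,1000000,100.00
185000,,pm_data_all_from_lmem,1000000,100.00
"""


def parse_counts(text):
    """Map upper-cased event name -> count from CSV-style lines."""
    counts = {}
    for line in text.splitlines():
        fields = line.split(",")
        # Skip blank lines and entries without a numeric count
        # (for example "<not counted>").
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        counts[fields[2].upper()] = int(fields[0])
    return counts


def all_minus_base(counts, suffix):
    """Difference between PM_DATA_ALL_<suffix> and PM_DATA_<suffix>."""
    return counts["PM_DATA_ALL_" + suffix] - counts["PM_DATA_" + suffix]


counts = parse_counts(SAMPLE)
extra = all_minus_base(counts, "FROM_LMEM")
print("ALL minus base for FROM_LMEM:", extra)
```

If the two events were multiplexed rather than counted simultaneously, the difference is only an estimate.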

So what does the ALL in these events actually mean?


Solution

  • After a few more hours of searching, I found another source, directly from IBM, that describes the events as:

    PM_DATA_ALL_FROM_LMEM

    The processor's data cache was reloaded from the local chip's Memory due to either demand loads or data prefetch

    and

    PM_DATA_FROM_LMEM

    The processor's data cache was reloaded from the local chip's Memory due to a demand load

    So the difference is prefetch loads: they are counted by the ALL version but not by the second one.

    The PAPI and perf tools simply include the wrong description. These events were contributed directly to oprofile by IBM, but probably with some mistakes or inaccuracies. Browsing through the PAPI/libpfm source, I see that the correct description is in the .pme_short_desc field, but the .pme_long_desc fields are the same for both events, and papi_native_avail reports only the long one, which is not very useful.

    Thanks for your patience. Summarizing things like this helped me a lot, and I hope it will help somebody struggling with similar issues.
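The description mismatch can be illustrated with a small sketch. The list below is a hypothetical stand-in for the event table, not the real libpfm data structure: the `pme_short_desc` and `pme_long_desc` keys mirror the field names mentioned above, the `pme_name` key is an assumption, and the description strings are the ones quoted in this post. The helper reports groups of events that share one long description, which is the situation that makes the output of papi_native_avail ambiguous.

```python
from collections import defaultdict

LONG_DESC = (
    "The processor's data cache was reloaded from the local chip's Memory "
    "due to either only demand loads or demand loads plus prefetches "
    "if MMCR1[16] is 1."
)

# Hypothetical stand-in for two event table entries (not parsed from libpfm).
EVENTS = [
    {
        "pme_name": "PM_DATA_FROM_LMEM",
        "pme_short_desc": "The processor's data cache was reloaded from the "
                          "local chip's Memory due to a demand load",
        "pme_long_desc": LONG_DESC,
    },
    {
        "pme_name": "PM_DATA_ALL_FROM_LMEM",
        "pme_short_desc": "The processor's data cache was reloaded from the "
                          "local chip's Memory due to either demand loads "
                          "or data prefetch",
        "pme_long_desc": LONG_DESC,
    },
]


def shared_long_descriptions(events):
    """Return sorted name groups of events whose long descriptions are identical."""
    by_long = defaultdict(list)
    for ev in events:
        by_long[ev["pme_long_desc"]].append(ev["pme_name"])
    return [sorted(names) for names in by_long.values() if len(names) > 1]


groups = shared_long_descriptions(EVENTS)
print(groups)
```

For the two entries above, the helper groups both events together, while their short descriptions still tell them apart.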