I'm trying to profile a C function (which is called from an interrupt, but I can extract it and profile it elsewhere) on a Cortex-M4.
What are the possibilities to count the number of cycles typically used in this function?
The function should run in ~4000 cycles at most, so I guess the RTC isn't an option. Manually counting cycles from the disassembly would be painful, and the result is only useful if averaged, because I'd like to profile on a typical stream with a typical flash / memory usage pattern.
I have heard about cycle counter registers and MRC instructions, but they seem to be available for A8/11. I haven't seen such instructions in Cortex-Mx microcontrollers.
Take a look at the DWT_CYCCNT register defined here. Note that this register is implementation-dependent. Who is the chip vendor? I know the STM32 implementation offers this set of registers.
This post provides instructions for using the DWT Cycle Counter Register for timing. (See the post from 11 December 2009 - 06:29 PM.)
This Stack Overflow post is another example of how to use DWT_CYCCNT.
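For reference, here is a minimal sketch of how the counter could be used, assuming your part actually implements the DWT unit with CYCCNT (check the vendor's reference manual). The register addresses and bit positions are the ones defined by the ARMv7-M architecture. The helper names (`cyccnt_enable`, `cyccnt_read`, `cyccnt_elapsed`, `measure_cycles`) are made up for this example and are not part of CMSIS or any vendor library.

```c
#include <stdint.h>

/* Architecturally defined ARMv7-M debug registers (Cortex-M3/M4). */
#define DEMCR       (*(volatile uint32_t *)(uintptr_t)0xE000EDFCu)
#define DWT_CTRL    (*(volatile uint32_t *)(uintptr_t)0xE0001000u)
#define DWT_CYCCNT  (*(volatile uint32_t *)(uintptr_t)0xE0001004u)

#define DEMCR_TRCENA        (1u << 24) /* enables the DWT/ITM blocks */
#define DWT_CTRL_CYCCNTENA  (1u << 0)  /* starts the cycle counter   */

/* Hypothetical helper: turn the cycle counter on and reset it. */
static inline void cyccnt_enable(void)
{
    DEMCR      |= DEMCR_TRCENA;
    DWT_CYCCNT  = 0;
    DWT_CTRL   |= DWT_CTRL_CYCCNTENA;
}

/* Hypothetical helper: read the free-running 32-bit cycle count. */
static inline uint32_t cyccnt_read(void)
{
    return DWT_CYCCNT;
}

/* Unsigned subtraction gives the right answer even if the counter
 * wrapped once between the two reads. */
static inline uint32_t cyccnt_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Hypothetical helper: time one call of fn (target hardware only). */
static inline uint32_t measure_cycles(void (*fn)(void))
{
    uint32_t start = cyccnt_read();
    fn();
    return cyccnt_elapsed(start, cyccnt_read());
}
```

Call `cyccnt_enable()` once at startup, then wrap the function of interest with `measure_cycles()` (or two `cyccnt_read()` calls) and average over a typical input stream. The result includes a few cycles of overhead for the reads and the call itself, which you can estimate by timing an empty function.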