riscv chisel verilator rocket-chip

Rocket chip simulation shows unexpected instruction count


The following two code snippets differ only in the value loaded into the x23 register, but the minstret instruction counts (reported by a Verilator simulation of the Rocket chip) differ substantially. Is this a bug, or am I doing something wrong?

The read_csr() function is from the RISC-V Frontend Server Library (https://github.com/riscv/riscv-fesvr/blob/master/fesvr/encoding.h), and the rest of the code [syscalls.c, crt.S, test.ld] is similar to the RISC-V benchmarks (https://github.com/riscv/riscv-tests/tree/master/benchmarks/common).

I have checked that the compiled binaries contain the exact same instructions, except for the difference in the operands.

Dividing 0x0fffffff by 0xff, repeating 1024 times: 3260 instructions.

size_t instrs = 0 - read_csr(minstret);

asm volatile (
        "mv             x20,    zero;"
        "li             x21,    1024;"
        "li             x22,    0xfffffff;"
        "li             x23,    0xff;"

    "loop:"
        "div            x24,  x22,  x23;"
        "addi           x20,  x20,  1;"
        "bleu           x20,  x21,  loop;"

    ::: "x20", "x21", "x22", "x23", "x24", "cc"
);

instrs += read_csr(minstret);

Dividing 0x0fffffff by 0xffff, repeating 1024 times: 3083 instructions.

size_t instrs = 0 - read_csr(minstret);

asm volatile (
        "mv             x20,    zero;"
        "li             x21,    1024;"
        "li             x22,    0xfffffff;"
        "li             x23,    0xffff;"

    "loop:"
        "div            x24,  x22,  x23;"
        "addi           x20,  x20,  1;"
        "bleu           x20,  x21,  loop;"

    ::: "x20", "x21", "x22", "x23", "x24", "cc"
);

instrs += read_csr(minstret);

Here, 3083 instructions seems correct (1024 * 3 = 3072). Since minstret counts retired instructions, it seems strange that the first example executed ~200 more instructions. These results are always the same no matter how many times I run these two programs.
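As a sanity check on the expected figure, here is a minimal host-side sketch in C that models only the control flow of the loop and counts the instructions its body would retire. The function name count_loop_instructions is hypothetical and the model ignores everything outside the loop. Because the counter is incremented before the bleu comparison and the comparison is "less than or equal", the body runs 1025 times rather than 1024, so the loop alone accounts for 3075 instructions. The remaining few in the measured 3083 are plausibly the setup instructions (li can expand to more than one instruction) and the CSR reads, but that part is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the assembly loop (hypothetical helper, not part of
 * the benchmark). It mirrors div / addi / bleu and returns how many
 * instructions the loop body retires for a given limit (x21). */
size_t count_loop_instructions(uint64_t limit)
{
    uint64_t counter = 0;   /* models x20 */
    size_t retired = 0;

    do {
        retired += 1;       /* div  x24, x22, x23 */
        counter += 1;       /* addi x20, x20, 1   */
        retired += 1;
        retired += 1;       /* bleu x20, x21, loop */
    } while (counter <= limit);   /* branch taken while x20 <= x21 */

    return retired;
}
```

Calling count_loop_instructions(1024) returns 3075, which is the baseline against which both measured counts can be compared.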


Solution

  • The problem was resolved at https://github.com/freechipsproject/rocket-chip/issues/1495.

    Servicing the debug interrupt, which the simulation apparently uses to detect whether the benchmark has finished executing, caused the difference in the instruction counts. The verbose log produced by Verilator shows instructions from the debug address range (0x800 onwards) being injected at different points during execution.
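
One way to confirm this on your own run is to separate the retired instructions by program counter. The sketch below assumes the PCs have already been extracted from the verbose log into an array (the extraction is not shown, since the log format depends on the setup). The names is_debug_pc and count_non_debug are hypothetical, and the upper bound DEBUG_END is an assumption: the answer above only states that the range starts at 0x800.

```c
#include <stddef.h>
#include <stdint.h>

#define DEBUG_BASE 0x800u   /* start of the debug range, per the answer above */
#define DEBUG_END  0x1000u  /* ASSUMPTION: exclusive upper bound; adjust to your configuration */

/* Returns 1 if the PC lies in the (assumed) debug address range. */
int is_debug_pc(uint64_t pc)
{
    return pc >= DEBUG_BASE && pc < DEBUG_END;
}

/* Counts the retired instructions whose PC is outside the debug range,
 * i.e. the instructions that belong to the benchmark itself. */
size_t count_non_debug(const uint64_t *pcs, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (!is_debug_pc(pcs[i])) {
            count++;
        }
    }
    return count;
}
```

If the explanation holds, the non-debug count should be the same for both programs, with the difference in minstret accounted for by the entries that fall inside the debug range.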