[SOLVED] Is my MIPS compiler crazy, or am I crazy for choosing MIPS?

I am using a MIPS CPU (PIC32) in an embedded project, but I am starting to question my choice. I understand that code for a RISC CPU like MIPS takes more instructions than one might expect, but I didn't think it would be like this. Here is a snippet from the disassembly listing:

225:                         LATDSET = 0x0040;
    sw          s1,24808(s2)
    sw          s4,24808(s2)
    sw          s4,24808(s2)
    sw          s1,24808(s2)
    sw          s4,24808(s3)
    sw          s4,24808(s3)
    sw          s1,24808(s3)

226:                         {

227:                             porte = PORTE;
    lw          t1,24848(s4)
    andi        v0,t1,0xffff
    lw          v1,24848(s6)
    andi        ra,v1,0xffff
    lw          v1,24848(s6)
    andi        ra,v1,0xffff
    lw          v0,24848(s6)
    andi        t2,v0,0xffff
    lw          a2,24848(s5)
    andi        v1,a2,0xffff
    lw          t2,24848(s5)
    andi        v1,t2,0xffff
    lw          v0,24848(s5)
    andi        t2,v0,0xffff

228:                             if (porte & 0x0004)
    andi        t2,v0,0x4
    andi        s8,ra,0x4
    andi        s8,ra,0x4
    andi        ra,t2,0x4
    andi        a1,v1,0x4
    andi        a2,v1,0x4
    andi        a2,t2,0x4

229:                                 pst_bytes_somi[0] |= sliding_bit;
    or          t3,t4,s0
    xori        a3,t2,0x0
    movz        t3,s0,a3
    addu        s0,t3,zero
    or          t3,t4,s1
    xori        a3,s8,0x0
    movz        t3,s1,a3
    addu        s1,t3,zero
    or          t3,t4,s1
    xori        a3,s8,0x0
    movz        t3,s1,a3
    addu        s1,t3,zero
    or          v1,t4,s0
    xori        a3,ra,0x0
    movz        v1,s0,a3
    addu        s0,v1,zero
    or          a0,t4,s2
    xori        a3,a1,0x0
    movz        a0,s2,a3
    addu        s2,a0,zero
    or          t3,t4,s2
    xori        a3,a2,0x0
    movz        t3,s2,a3
    addu        s2,t3,zero
    or          v1,t4,s0
    xori        a3,a2,0x0
    movz        v1,s0,a3

This seems like a crazy number of instructions for simply reading, writing and testing variables at fixed addresses. On a different CPU, I could probably get each C statement down to about one to three instructions without resorting to hand-written asm. Obviously the clock rate is fairly high, but it's not 10x higher than what I would have on a different CPU (e.g. dsPIC).

I have optimisation set to maximum. Is my C compiler (gcc 3.4.4) terrible, or is this typical of MIPS?

Solution

Finally figured out the answer: the disassembly listing is totally misleading. The compiler is unrolling the loop 8x, and the listing groups the instructions from every unrolled copy under the C statement they came from, so each statement appears to need 8x as many instructions as it really does. The instructions shown together are not at consecutive addresses! Turning off loop unrolling in the compiler options produces this:

225:                         LATDSET = 0x0040;
    sw          s3,24808(s2)
226:                         {
227:                             porte = PORTE;
    lw          t1,24848(s5)
    andi        v0,t1,0xffff
228:                             if (porte & 0x0004)
    andi        t2,v0,0x4
229:                                 pst_bytes_somi[0] |= sliding_bit;
    or          t3,t4,s0
    xori        a3,t2,0x0
    movz        t3,s0,a3
    addu        s0,t3,zero
230:
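
To make the effect concrete, here is a minimal sketch in C of what the compiler is doing. It is not the original source: the table fake_porte (standing in for the PORTE register), the function names, the loop count of 8 and the right-shift of sliding_bit are all assumptions made for illustration. The first function is the loop as one would write it; the second is roughly what an 8x unroll turns it into. A mixed listing of the second form would show eight copies of the load, eight of the test and eight of the OR, each set grouped under its source line.

```c
#include <stdint.h>

/* Hypothetical stand-in for the PORTE register: a table of canned samples. */
static const uint16_t fake_porte[8] = {
    0x0004, 0x0000, 0x0004, 0x0004, 0x0000, 0x0000, 0x0004, 0x0000
};

/* Rolled form: one load, one test and one conditional OR per iteration. */
uint8_t sample_rolled(void)
{
    uint8_t result = 0;
    uint8_t sliding_bit = 0x80;   /* assumed to move one place per sample */
    int i;

    for (i = 0; i < 8; i++) {
        uint16_t porte = fake_porte[i];
        if (porte & 0x0004)
            result |= sliding_bit;
        sliding_bit >>= 1;
    }
    return result;
}

/* Roughly what 8x unrolling produces: the same body repeated eight times,
 * with the loop counter and the shifting bit folded into constants. */
uint8_t sample_unrolled(void)
{
    uint8_t result = 0;

    if (fake_porte[0] & 0x0004) result |= 0x80;
    if (fake_porte[1] & 0x0004) result |= 0x40;
    if (fake_porte[2] & 0x0004) result |= 0x20;
    if (fake_porte[3] & 0x0004) result |= 0x10;
    if (fake_porte[4] & 0x0004) result |= 0x08;
    if (fake_porte[5] & 0x0004) result |= 0x04;
    if (fake_porte[6] & 0x0004) result |= 0x02;
    if (fake_porte[7] & 0x0004) result |= 0x01;
    return result;
}
```

Both functions return the same value for the same samples; only the code size and the look of the listing differ. In stock GCC the relevant switches are -funroll-loops and -fno-unroll-loops; whether the PIC32 toolchain exposes the setting under the same name is an assumption, so check its project options.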

Panic over, everyone.