Tags: assembly, x86-16, stack-memory, stack-frame, frame-pointer

Is there a way to use popa/pusha without SP? (for procedures with BP)


for example:

var1 dw 8
var2 dw 1
res dw ?


CODESEG
proc plus
pusha
mov bp,sp


    mov ax, [bp+6];var1
    mov bx, [bp+4];var2
    add ax, bx
    mov [res], ax
    

popa
ret 4
endp plus


start :
    mov ax, @data
    mov ds, ax
    
    push [var1]
    push [var2]
    call plus
    
    mov dl, [byte ptr res]
    add dl,30h
    mov ah,2h
    int 21h

This procedure doesn't work. As far as I understand, it has something to do with pusha/popa pushing and popping SP, which then messes up the instruction

mov bp,sp

My question: is there a way to use pusha/popa without SP? Or should I stick to pushing and popping registers individually, without those instructions?


Solution

  • pusha unavoidably decrements SP by 8 x 2 = 16 bytes (or 8 x 4 = 32 bytes in 32-bit mode) and stores all eight registers to the stack. If you don't like that, don't use it; it's not efficient anyway, only good for code size, especially when you don't actually need to save all 8 registers (including SP). For example, you could use push bx after a normal BP setup, and also push ax if you want to save/restore it for some reason, even though your caller overwrites parts of AX right away.

    It would be inefficient to save BP twice, but you could push bp / mov bp, sp / pusha if you really want, so BP would be pointing to the normal position relative to the return address and stack args. If you want slow but simplistic, that's the obvious way to do it.
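
    A minimal sketch of that push bp / mov bp, sp / pusha variant, reusing the question's proc plus and res. It assumes TASM Ideal-mode syntax as in the question, and that a 186-or-later processor directive (such as P186) is in effect so that pusha/popa assemble:

    ```asm
    proc plus
        push bp             ; traditional frame setup: save the caller's BP
        mov  bp, sp         ; [bp+2] = return address, [bp+4] = var2, [bp+6] = var1
        pusha               ; moves SP down by 16 bytes, but BP stays put

        mov  ax, [bp+6]     ; var1
        add  ax, [bp+4]     ; var2
        mov  [res], ax      ; result goes to memory, so popa overwriting AX is harmless

        popa                ; restores the 7 registers; the saved SP value is discarded
        pop  bp             ; restore the caller's BP
        ret  4              ; return and remove the two word arguments
    endp plus
    ```

    BP ends up saved twice (once by push bp, once inside pusha), which is the inefficiency mentioned above.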

    Or as Jester says, simply account for the different offset when calculating the distance, like [bp+14 + 4]. Or, after pusha / mov bp,sp, you could add bp, 14.
    (If you could assume 386 features, lea bp, [esp+14] could replace the mov+add. I'm still assuming 16-bit mode, so you're only going to use 16-bit BP, not EBP. If SP is correctly zero-extended into ESP (which is a good idea), you could omit the frame pointer entirely, mov ax, [esp+14 + 4].)

    [bp+14] is the stack slot right below the return address, so that +4 and +6 are the first two words on the stack above the return address, the same offsets you'd use relative to a frame pointer pointing at the traditional place.
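
    To make those offsets concrete, here is a sketch of the question's procedure with only the displacements changed. The layout comments follow the documented pusha push order (AX, CX, DX, BX, SP, BP, SI, DI); everything else is the question's own code:

    ```asm
    proc plus
        pusha               ; 16 bytes: AX, CX, DX, BX, SP, BP, SI, DI
        mov  bp, sp
        ; Stack as seen from BP after pusha:
        ;   [bp+0]  DI    [bp+2]  SI    [bp+4]  BP    [bp+6]  old SP
        ;   [bp+8]  BX    [bp+10] DX    [bp+12] CX    [bp+14] AX
        ;   [bp+16] return address
        ;   [bp+18] var2 (pushed last)
        ;   [bp+20] var1 (pushed first)

        mov  ax, [bp+14+6]  ; var1
        add  ax, [bp+14+4]  ; var2
        mov  [res], ax

        popa                ; also restores BP, so no separate pop bp is needed
        ret  4
    endp plus
    ```

    With add bp, 14 placed right after mov bp, sp, the two loads could instead use the familiar [bp+6] and [bp+4].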

    Having BP not pointing to the saved BP will break stack unwinding (backtraces) that follows the linked list of saved-BP / return-address pairs that the traditional frame-pointer setup creates, if that matters to you. If you consistently do this in all your functions, you could have a custom stack-unwind script in GDB or something similar that offsets a saved BP to get to the next saved BP.


    Of course for your tiny function, you only need BP and one temp register for mov ax, [bp + ...] / add ax, [bp + ...], and normally you leave some call-clobbered registers you can use without saving/restoring so you can write small functions without the inefficiency of pusha/popa. Saving and restoring everything in every function is slow.
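
    A sketch of that smaller version, assuming a convention where AX is call-clobbered (the question's caller overwrites AH right after the call, so nothing is lost):

    ```asm
    proc plus
        push bp
        mov  bp, sp
        mov  ax, [bp+6]     ; var1
        add  ax, [bp+4]     ; var2, added straight from the stack
        mov  [res], ax
        pop  bp
        ret  4              ; remove the two word arguments
    endp plus
    ```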

    Even better, pass two args in registers so you don't need to get them off the stack, and thus don't need to set up BP.
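
    As a rough sketch of the register-argument version: passing the values in AX and BX is an arbitrary choice for illustration, not an established convention.

    ```asm
    proc plus
        add  ax, bx         ; AX = first arg + second arg
        mov  [res], ax
        ret                 ; nothing was pushed, so plain ret
    endp plus

    ; caller
        mov  ax, [var1]
        mov  bx, [var2]
        call plus
    ```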