Push all and pop all in emu8086

I'm looking for an easy way to do pusha and popa in my assembly program. I want to use emu8086 to find bugs in my programs, so .286 is not allowed.

I tried:

push_a proc
push ax
push cx
push dx
push bx
push sp
push bp
push si
push di
push_a endp

pop_a proc
pop di
pop si
pop bp
pop sp
pop bx
pop dx
pop cx
pop ax
pop_a endp

But it doesn't work, because when we call push_a the call itself pushes the return address onto the stack.
Is there another easy, simple way? I don't want to write eight pushes and eight pops every time.

Solution

One could implement the procedures as macros

pusha MACRO
   push ax
   push cx
   push dx
   push bx
   push sp
   push bp
   push si
   push di
pusha ENDM

popa MACRO
   pop di
   pop si
   pop bp
   pop sp
   pop bx
   pop dx
   pop cx
   pop ax
popa ENDM

but this is not the architectural behaviour of pusha/popa and it is broken code on a real 8086.

The real semantic of `pusha`

The push sp in pusha should push the state of the stack pointer at the start of the pusha macro.

Temp ← (SP);
Push(AX);
Push(CX);
Push(DX);
Push(BX);
Push(Temp);
Push(BP);
Push(SI);
Push(DI);

^{Intel pseudo-code for pusha - Intel Manual 2B}

Using a scratch memory location

You can do something similar if you have a scratch memory location:

pusha MACRO
   mov WORD PTR [pusha_scratch], sp
   push ax
   push cx
   push dx
   push bx
   push WORD PTR [pusha_scratch]
   push bp
   push si
   push di 
pusha ENDM

Making sure that pusha_scratch is accessible through DS in every context can be quite an endeavour, and that is without accounting for the separate definition that pusha_scratch needs.
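As a minimal sketch of that separate definition, assuming a program whose data lives in the segment DS points to (where exactly the line goes is an assumption, not something the macro dictates):

```asm
pusha_scratch DW ?         ;One word of scratch space, must be reachable through DS
                           ;wherever the pusha macro is expanded
```

Since every expansion of the macro shares this single word, the scratch version is not reentrant.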

Using a position independent, in-place, version

Alternatively one could account for the modified value of sp with something that is position independent and operates in-place.
If I did my math correctly - double check it - this code should do the trick

pusha MACRO
   push ax
   push cx
   push dx
   push bx

   push bp                  ;Top of stack = BP
   mov bp, sp               ;BP points to TOS
   lea bp, [bp+0ah]         ;BP is equal to the starting value of SP
   xchg bp, [bp-0ah]        ;BP = Original BP, [SP] = Starting value of SP

   push bp
   push si
   push di
pusha ENDM

^{Many thanks to Fifoernik for correcting the broken code by noting the inverted math and preventing the altering of the EFLAGS register.}

The nice thing about the snippet above is that it avoids push sp entirely, which is a bit problematic when it comes to backward compatibility.
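To double check the math, here is the body of the in-place macro traced by hand. The starting values (SP = 0100h, BP = 1234h) are made up purely for illustration.

```asm
                           ;SP = 0100h (starting value), BP = 1234h
   push ax                 ;SP = 00FEh
   push cx                 ;SP = 00FCh
   push dx                 ;SP = 00FAh
   push bx                 ;SP = 00F8h

   push bp                 ;SP = 00F6h, word at SS:00F6h = 1234h
   mov bp, sp              ;BP = 00F6h
   lea bp, [bp+0ah]        ;BP = 00F6h + 0Ah = 0100h, flags untouched
   xchg bp, [bp-0ah]       ;Word at SS:00F6h = 0100h, BP = 1234h again

   push bp                 ;SP = 00F4h
   push si                 ;SP = 00F2h
   push di                 ;SP = 00F0h = 0100h - 16
```

The slot that a real pusha would fill with Temp (SS:00F6h here) ends up holding the starting value of SP, and eight words are on the stack in total.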

The problem of `push sp`

On a real 8086 the instruction push sp behaves differently:

The P6 family, Pentium, Intel486, Intel386, and Intel 286 processors push a different value on the stack for a PUSH SP instruction than the 8086 processor.
The 32-bit processors push the value of the SP register before it is decremented as part of the push operation;
the 8086 processor pushes the value of the SP register after it is decremented.

^{Intel compatibility section - Intel Manual 3 - 22.17}

I don't know how accurate emu8086 is as an emulator of the 8086 but I would avoid push sp.
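If you are curious which behaviour your emulator implements, a short probe along these lines should tell. It is only a sketch: run it in the debugger and inspect AX afterwards.

```asm
   push sp                 ;8086: pushes SP after the decrement, 286 and later: before it
   pop ax                  ;AX = the value that was pushed, SP is back where it started
   sub ax, sp              ;AX = 0FFFEh (-2) for the 8086 behaviour, 0 for the 286+ behaviour
```

Note that sub alters the flags and AX is clobbered, so this belongs in a test program rather than in the macros.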

The semantic of `popa`

The peculiar way pusha is implemented calls for a peculiar implementation of popa.
A naive list of pops breaks as soon as pop sp is executed: it reloads sp with the saved starting value, so the four remaining pops read the wrong words.

Indeed, the CPU exploits the fact that sp cannot meaningfully change between a pusha and its popa (otherwise there would be no way to get those values back) and that after all the pops the original value is implicitly restored. So it simply skips the saved sp.

DI ← Pop();
SI ← Pop();
BP ← Pop();
Increment ESP by 2; (* Skip next 2 bytes of stack *)
BX ← Pop();
DX ← Pop();
CX ← Pop();
AX ← Pop();

^{Intel pseudo-code for popa - Intel Manual 2B}

So a more correct implementation of popa is

popa MACRO
   pop di
   pop si
   pop bp
   add sp, 02h         ;Beware, this affects EFLAGS
   pop bx
   pop dx
   pop cx
   pop ax
popa ENDM
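If the flag side effect of add sp, 02h matters to you, one possible variant discards the saved SP with an extra pop instead, since pop does not touch the flags. The macro name is hypothetical, and the bare ENDM is an assumption about your assembler's macro syntax.

```asm
popa_keep_flags MACRO      ;Hypothetical name, not part of the original answer
   pop di
   pop si
   pop bp
   pop bx                  ;Discard the saved SP (BX is overwritten by the next pop)
   pop bx
   pop dx
   pop cx
   pop ax
ENDM
```

The net effect on sp is the same (sixteen bytes released), at the cost of one extra memory read.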

Note that you must avoid pop sp on a real 8086: there a push sp followed by a pop sp is not a no-op, since push sp stores the already decremented value and pop sp loads exactly that value into sp, for

The POP ESP instruction increments the stack pointer (ESP) before data at the old top of stack is written into the destination.

^{Intel description for pop - Intel Manual 2B}

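Finally, none of the stack-only macros above needs a cli/sti pair: an interrupt that arrives between two of their instructions only writes below sp, and a well-behaved handler returns with the registers intact. The scratch-location version of pusha is the exception, as it is not reentrant if an interrupt handler uses the same pusha_scratch.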

If you use pusha/popa just as a shorthand for saving all the registers, then all the hassle can be skipped and a sufficient implementation is

push_all MACRO
   push ax
   push cx
   push dx
   push bx
   ;NOTE: Missing SP
   push bp
   push si
   push di
push_all ENDM

pop_all MACRO
   pop di
   pop si
   pop bp
   ;NOTE: Missing SP
   pop bx
   pop dx
   pop cx
   pop ax
pop_all ENDM
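A usage sketch, assuming the two macros above are defined before this point in the source (my_proc is a hypothetical name):

```asm
my_proc PROC               ;Hypothetical procedure name
   push_all                ;Expanded inline: no call, so no return address lands
                           ;between the saved registers
   ;... body that is free to clobber ax, cx, dx, bx, bp, si and di ...
   pop_all                 ;Pops in exactly the reverse order of push_all
   ret                     ;The caller's return address is on top of the stack again
my_proc ENDP
```

This is what the procedures in the question could not do: a macro is pasted in place, so the stack holds only the seven saved registers, provided the body itself leaves the stack balanced.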
