Develop a Motorola 68000 program that
- Reverses the order of elements in an input array A
- Copies the result into a second array B
All logic must live in one subroutine (or more) with parameters passed via the stack.
I’ve applied everything I learned from my previous works (parameter passing, register usage, minimal stack frames, loop optimisations, etc.). Here’s my current solution:
ORG $8000
; ---------- data ----------
VectorA DC.B 1,2,3,4,5,6 ; example input
LenA EQU 6 ; number of elements
VectorB DS.B LenA ; destination buffer
; ---------- entry point ----------
START:
PEA.L VectorB ; push &B (long)
MOVE.L #LenA,-(A7) ; push length (long)
PEA.L VectorA ; push &A (long)
BSR.S ReverseCopy ; reverse into B
ADDA.L #12,A7 ; clean up 3 longs
SIMHALT ; halt simulator
; --------------------------------------------------------
; ReverseCopy(&A, length, &B)
;
; IN (stack): 4(A7)=&A, 8(A7)=length, 12(A7)=&B
; OUT: none (but B filled)
; MOD: A0, A1, D0, D1
; --------------------------------------------------------
ReverseCopy:
MOVEA.L 4(A7),A0 ; A0 = address of A
MOVE.L 8(A7),D0 ; D0 = length (32-bit)
MOVEA.L 12(A7),A1 ; A1 = address of B
BEQ.S .Done ; if length=0, nothing to do
ADD.L D0,A0 ; point A0 just past the last element
MOVE.W D0,D1 ; D1 = length as 16-bit (for DBF)
SUBQ.W #1,D1 ; DBF will run exactly length times
.Loop:
MOVE.B -(A0),(A1)+ ; copy last A into next B
DBF D1,.Loop ; decrement D1, loop if >=0
.Done:
RTS ; return to START
END START
I’ve added a dispatch routine that lets you choose between three implementations:
ROR.W #8
); --------------------------------------------------------
; Entrypoint for dispatch (invoked by BSR.S ReverseAllVariants)
; Uncomment the variant you want to use:
ReverseAllVariants:
; JMP ReverseSingleByte
; JMP ReverseDBF
JMP ReverseSwap
; -- Variant 1: 32-bit single-byte loop --------------------
ReverseSingleByte:
MOVE.L 8(A7),D0 ; D0 = length
BEQ.S .Done1 ; skip if zero
MOVEA.L 4(A7),A0 ; A0 = &A
MOVEA.L 12(A7),A1 ; A1 = &B
ADD.L D0,A0 ; A0 → end of A
1Loop:
MOVE.B -(A0),(A1)+ ; copy last byte
SUBQ.L #1,D0 ; decrement counter
BNE.S 1Loop ; until D0 == 0
.Done1:
RTS
; -- Variant 2: 68010 MOVE+DBF single-byte loop ----------
ReverseDBF:
MOVE.L 8(A7),D0 ; D0 = length
BEQ.S .Done2 ; skip if zero
MOVEA.L 4(A7),A0 ; A0 = &A
MOVEA.L 12(A7),A1 ; A1 = &B
ADD.L D0,A0 ; A0 → end of A
MOVE.W D0,D0 ; D0.W = length
SUBQ.W #1,D0 ; prepare for DBF
DBFLoop:
MOVE.B -(A0),(A1)+ ; copy last byte
DBF D0,DBFLoop ; until D0.W == -1
.Done2:
RTS
; -- Variant 3: 68000 word-swap loop ----------------------
ReverseSwap:
MOVE.L 8(A7),D0 ; D0 = length
BEQ.S .Done3 ; skip if zero
MOVE.L D0,D2 ; D2 = length copy
ANDI.L #1,D2 ; D2 = odd remainder
MOVE.L D0,D1
LSR.L #1,D1 ; D1 = word_count = len/2
MOVEA.L 4(A7),A0 ; A0 = &A
MOVEA.L 12(A7),A1 ; A1 = &B
TST.L D2 ; if odd, handle trailing byte
BEQ.S .WordLoopStart
MOVE.B -(A0),(A1)+
.WordLoopStart:
MOVE.W -(A0),D0 ; load two bytes
ROR.W #8,D0 ; swap bytes
MOVE.W D0,(A1)+ ; store swapped word
DBF D1,.WordLoopStart ; repeat
.Done3:
RTS
Let me know if anything still needs tweaking :-).
MOVEA.L 4(A7),A0 ; A0 = address of A MOVE.L 8(A7),D0 ; D0 = length (32-bit) MOVEA.L 12(A7),A1 ; A1 = address of B BEQ.S .Done ; if length=0, nothing to do
Technically this is correct because MOVEA
does not change any flags, but it will make more sense if you keep the branch instruction close to the relevant load instruction:
MOVE.L 8(A7),D0 ; D0 = length (32-bit)
BEQ.S .Done ; if length=0, nothing to do
MOVEA.L 4(A7),A0 ; A0 = address of A
MOVEA.L 12(A7),A1 ; A1 = address of B
ADD.L D0,A0 ; point A0 just past the last element
The ADD
instruction should not have allowed adding to an address register. Use ADDA
instead. Typo maybe?
MOVE.W D0,D1 ; D1 = length as 16-bit (for DBF) SUBQ.W #1,D1 ; DBF will run exactly length times
There is no reason to first copy D0 into D1. The DBF D0,...
instruction will use the low word of D0 and don't care about what's in the high word of D0. Anyway, if there would be high contents in D0 then the whole task would have failed, wouldn't it?
- Reverses the order of elements in an input array A
- Copies the result into a second array B
It is possible to interpret this task description as requiring you to reverse the A array in situ, and only then make a copy of the result.
Next is a version of the program that does exactly that:
ORG $8000
; ---------- data ----------
VectorA DC.B 1,2,3,4,5,6 ; example input
LenA EQU 6 ; number of elements
VectorB DS.B LenA ; destination buffer
; ---------- entry point ----------
START:
PEA.L VectorB ; push &B (long)
MOVE.L #LenA,-(A7) ; push length (long)
PEA.L VectorA ; push &A (long)
BSR.S ReverseAndCopy ; reverse A and copy to B
ADDA.L #12,A7 ; clean up 3 longs
SIMHALT ; halt simulator
; --------------------------------------------------------
; ReverseAndCopy(&A, length, &B)
;
; IN (stack): 4(A7)=&A, 8(A7)=length, 12(A7)=&B
; OUT: A reversed, B copy of A
; MOD: A0, A1, D0, D1
; --------------------------------------------------------
ReverseAndCopy:
MOVE.L 8(A7),D0 ; D0 = length (32-bit)
BEQ.S .Done ; if length=0, nothing to do
MOVEA.L 4(A7),A0 ; A0 = address of first element A
LEA.L (A0,D0.L),A1 ; A1 = address after last element A
LSR.L #1,D0 ; D0 = number of pairs of elements
BEQ.S .Copy ; length is 1
; 1. Reverses the order of elements in input array A
.Loop1:
MOVE.B -(A1),D1
MOVE.B (A0),(A1)
MOVE.B D1,(A0)+
SUBQ.L #1,D0
BNE.S .Loop1
; 2. Copies the result into the array B
.Copy:
MOVEA.L 4(A7),A0 ; A0 = address of A
MOVEA.L 12(A7),A1 ; A1 = address of B
MOVE.L 8(A7),D0 ; D0 = length (32-bit)
.Loop2:
MOVE.B (A0)+,(A1)+
SUBQ.L #1,D0
BNE.S .Loop2
.Done:
RTS
END START
SUBQ
BNE
by DBRA
. Remember to decrement the counter beforehand because DBRA
runs until -1.The Variant 2: "MOVE+DBF single-byte loop" is not specifically 68010. Works fine on 68000. Note that there still is that silly MOVE.W D0,D0
present.
The Variant 3: "word-swap loop" will have trouble with an odd length because words in memory must reside on an even address. The only possible good outcome (for an odd length) is if array A has an even address and array B has an odd address. And I find that unlikely in practice... Also the loop will always be running one time too many! See Tip 1 above. Additionally, in case of length=1, the loop shouldn't even run at all!
If you want to move to better performing variants, then imposing some sensible restrictions is needed. e.g. both arrays begin at an even address and the array length is even (at least for byte-sized elements).