I'm trying to write a Motorola 68000 assembly program to sort an array of bytes in descending order using the ROL
instruction. The sorting algorithm follows the Bubble Sort approach.
Here's the code:
ORG $8000
START:
moveq.l #len-1,d7 ; Number of total passes for sorting
OUTER_LOOP:
lea.l array,a0
moveq.l #len-2,d6 ; Number of comparisons per pass
INNER_LOOP:
move.b (a0)+,d2
cmp.b (a0),d2
blt.s SWAP ; Swap if needed
NOSWAP:
subq.l #1,d6
bge.s INNER_LOOP
subq.l #1,d7
bge.s OUTER_LOOP
SIMHALT
SWAP:
move.b -1(a0),d3 ; Load the left element into d3
move.b (a0),d4 ; Load the right element into d4
lsl.w #8,d3 ; Shift left element to the upper byte: d3 = left << 8
or.w d4,d3 ; Merge left and right into d3: d3 = (left << 8) | right
rol.w #8,d3 ; Rotate left, swapping bytes: d3 = (right << 8) | left
move.w d3,d5
lsr.w #8,d5 ; Extract the new left element (was originally right)
move.b d5,-1(a0) ; Store the new left value (was right)
andi.w #$00FF,d3 ; Mask to extract the new right element (was left)
move.b d3,(a0) ; Store the new right value (was left)
addq.l #1,a0
bra.s NOSWAP
; DATA
array DC.B 5,3,8,1,7
len EQU 5
END START
I’d appreciate any insights on how to make this code faster and more elegant.
In a bubble sort, the array elements that need swapping will always be adjacent. That's something to acknowledge.
Also, once the inner loop has finished its first run, the array's minimum will be at the end of the array. The next time that the inner loop runs we need no longer look at that final element. The same is true for any of the following runs of the inner loop. It is the job of the outer loop to take care of that, simply by loading the inner loop counter with the current value of the outer loop counter.
The improved nested loops look like:
START:
moveq #len-2, d7 ; (Number of elements - 1) - 1
OUTER_LOOP:
lea.l array, a0
move.w d7, d6 ; Number of comparisons in this pass
INNER_LOOP:
move.b (a0)+, d0
cmp.b (a0), d0
bge.s SKIP
<<<<< Insert here the code that swaps two adjacent elements
SKIP:
dbra d6, INNER_LOOP
dbra d7, OUTER_LOOP
SIMHALT
The first -1 in the setup for D7 is because for N elements there can only be N-1 comparisons between them.
The second -1 in the setup for D7 comes from using the dbra
instruction that runs down to -1.
Loading D6 can be a word-sized operation knowing that dbra
only considers the low 16 bits of the data register.
Your SWAP code should be in the loop and not somewhere else requiring a bra
instruction to come back, as well as it uses too many registers and has redundant instructions:
Using ROL
is apparently required by the task description.
This is one of the many solutions that use the ROL
instruction:
rol.w #8, d0 ; Move left element to bits 8-15
move.b (a0), d0 ; Move right element to bits 0-7
move.b d0, -1(a0) ; Move right element to left position in array
rol.w #8, d0 ; Move left element to bits 0-7
move.b d0, (a0) ; Move left element to right position in array
Apart from that one silly task requirement, I’d say that this solution makes the code both faster and more elegant.
But without the restriction it's even nicer:
move.b (a0), -1(a0)
move.b d0, (a0)