I'm writing a program in M68K assembly language to find the minimum value in an array.
Here's my code:
ORG $8000
; DATA
array DC.B 2,3,1,4,5 ; The minimum is 1
len EQU 5
min DC.L 0 ; Stores the found minimum
START:
lea.l array,a0
move.b (a0)+,d0 ; d0 = 2 (current minimum), a0 now points to 3
moveq.l #len-1,d1 ; One element already used
LOOP:
cmpi.l #0,d1 ; If d1 = 0, no more elements to check
beq.s END ; Jump to the end if finished
move.b (a0)+,d2 ; Load next element into d2
; d2 = 3, a0 -> 1
; d2 = 1, a0 -> 4
; d2 = 4, a0 -> 5
; d2 = 5, a0 -> ??? (What happens here?)
cmp.b d0,d2
blt.s UPDATE ; If d2 < d0, update minimum
bra.s DECREMENT
UPDATE:
move.b d2,d0 ; Update the minimum
DECREMENT:
subq.l #1,d1 ; Decrease counter
bra.s LOOP
END:
move.b d0,min ; Store the minimum
SIMHALT
END START
???
, what happens to a0
after processing the last element 5? Does it point to an invalid address?d0
with move.b (a0)+,d0
a good way to start, or is there a better way?Any suggestions would be greatly appreciated! Thanks!
- At the point where I marked
???
, what happens toa0
after processing the last element 5? Does it point to an invalid address?
Your min variable follows the array directly, so A0 will point at the first byte of the min longword variable.
- Is my approach of initializing
d0
withmove.b (a0)+,d0
a good way to start, or is there a better way?
That's fine.
- Can my code be optimized further?
Sure!
No need to test for a zero counter at the top of the loop. You could test it before the loop though as a precaution. The decision about iterating some more should be taken below:
subq.l #1, d1 ; Decrease counter
bne.s LOOP
Avoid as much as possible to have to BRA
. If D2 < D0 don't branch to UPDATE, but rather bypass based on the opposite condition which is bge
:
cmp.b d0, d2
bge.s SKIP ; If d2 >= d0, skip updating the minimum
move.b d2, d0 ; Update the minimum
SKIP:
ORG $8000
; DATA
array DC.B 2,3,1,4,5 ; The minimum is 1
len EQU 5
min DC.L 0 ; Stores the found minimum
START:
lea.l array, a0
move.b (a0)+, d0 ; d0 = 2 (current minimum), a0 now points to 3
moveq.l #len-1, d1 ; One element already used
; Could bail-out here if D1 is zero
LOOP:
move.b (a0)+, d2 ; Load next element into d2
cmp.b d0, d2
bge.s SKIP ; If d2 >= d0, skip updating the minimum
move.b d2, d0 ; Update the minimum
SKIP:
subq.l #1, d1 ; Decrease counter
bne.s LOOP
END:
move.b d0, min ; Store the minimum
SIMHALT
END START
You could define min a byte since you're dealing with bytes all the way.
[edit 1]
It was Erik Eidt that noticed the problem with writing to the wrong part of the longword variable.
.EVEN
directive (if available).move.b d0, min
you would be storing the value in the byte that has the most weight. Inevitably, this produces the wrong result. The solution is to store a full longword, and because your array is supposed to contain signed bytes, using the ext
instruction can solve this: ext.w d0
first sign-extends from byte to word and afterwards ext.l d0
sign-extends from word to longword.Of course none of this additional complexity would have existed in case of a byte-sized min variable.
The revised program:
ORG $8000
; DATA
array DC.B 2,3,1,4,5 ; The minimum is 1
len EQU 5
DC.B 0 ; Filler (You should look for the .EVEN directive)
min DC.L 0 ; Stores the found minimum
START:
lea.l array, a0
move.b (a0)+, d0 ; d0 = 2 (current minimum), a0 now points to 3
moveq.l #len-1, d1 ; One element already used
; Could bail-out here if D1 is zero
LOOP:
move.b (a0)+, d2 ; Load next element into d2
cmp.b d0, d2
bge.s SKIP ; If d2 >= d0, skip updating the minimum
move.b d2, d0 ; Update the minimum
SKIP:
subq.l #1, d1 ; Decrease counter
bne.s LOOP
END:
ext.w d0
ext.l d0
move.l d0, min ; Store the minimum
SIMHALT
END START
[edit 2]
DBRA
Decrement and BRAnch
The 68000 has the dbra
instruction that decrements a word-sized count (taken from any of its 8 data registers), and that performs the branching only if the decrement operation did not result in storing -1 to the concerned data register. So this is a good replacement for the usual combination of an instruction that decrements some data register followed by an instruction that conditionally branches.
subq.l #1, d0 ; 8 clocks
bne.s LOOP ; 10 clocks taken / 8 clocks not taken
subq.w #1, d0 ; 4 clocks
bne.s LOOP ; 10 clocks taken / 8 clocks not taken
dbra d0, LOOP ; 10 clocks taken / 14 clocks not taken
The dbra
instruction is not particularly very beginner-friendly, because it is easily forgotten that the count must be a 16-bit word and that the count will end at -1 instead of the often expected 0.
DBcc
Decrement and Branch, unless condition is true
The dbra
instruction (through its synonym dbf
) is actually part of a wider group of instructions: dbt
, dbf
, dbhi
, dbls
, dbcc
, dbcs
, dbne
, dbeq
, dbvc
, dbvs
, dbpl
, dbmi
, dbge
, dblt
, dbgt
, dble
.
If the condition code that is mentioned in the mnemonic evaluates to true, then the 'decrement and branch' operation is skipped entirely and the execution continues with the following instruction. Normally this translates to exiting from a loop.
If on the other hand the condition is not fulfilled, then the word-sized count (held in any of the 8 data registers) gets decremented, and if this results in storing -1 to the concerned data register, the branch is not taken. Again, normally this translates to exiting from a loop. Obviously, any value other than -1 will branch, so staying in the loop.
Two intriguing special cases:
dbf
the condition (which is 'false') can never become true. Therefore the 'decrement and branch' operation is never going to be skipped. So we're left with what dbra
does. But don't overthink this or you'll soon realise that it's a chicken-or-the-egg situation. For readabilities sake, use dbra
.dbt
the condition (which is 'true') is always true. Since then the 'decrement and branch' operation is skipped entirely, all this instruction really does is spend some time. You can safely use it as a 4-byte alternative for the 2-byte nop
instruction, but then an alternative that has 524288 unique encodings! Need some obfuscating?Below is a version of the program that is heavily relying on the dbra
and dbgt
instructions.
From personal experimentation I found that looking for the min/max in an array with randomly distributed small numbers (game scores I collected over the years), requires few updates to the current min/max as maintained by the loop. Consider a 10-byte array where the LOOP will have to run 9 times, well, an update to the current minimum on average would occur not even 2 times. Another point to consider is whether the minimum is located in the last array element or not. Again my data indicates this happens seldomly.
Thus I reasonably assert: the minimum is not located at the end, and the fast loop has to exit 2 times. The new code is 25% faster at a cost of requiring 4 extra bytes. For a byte-sized array that has 10 elements, the previous loop ran at 326 clocks and this loop now runs at 240 clocks. If ever the minimum would happen to reside in the last array element, then it would run at 220 clocks.
ORG $8000
array DC.B 6,11,2,2,9,1,5,5,4,5
len EQU 10
min DC.B 0
START:
lea.l array, a0
move.b (a0)+, d0 ; First array element becomes current minimum
moveq #len-2, d1 ; Number of remaining array elements - 1
LOOP:
cmp.b (a0)+, d0 ; Check for a better min
dbgt d1, LOOP ; Most often this does branch
ble.s DONE ; GT is false, so this must be D1 == -1
move.b -1(a0), d0 ; GT is true, Set a new current minimum
dbra d1, LOOP ; Decrementing D1 desperately needed
DONE:
move.b d0, min ; D1 == -1
SIMHALT
END START