
68000 Assembly - String Concatenation Subroutine


I'm writing a subroutine in Assembly for the Motorola 68000 that concatenates two strings. The subroutine receives two input strings, StringA ("Hello") and StringB ("World"), and stores the concatenated result in StringC ("HelloWorld").

The code assembles without errors and seems to work, but I am not sure whether the output is correct or whether the logic is sound.

I wrote the following code:

    ORG $8000
     
StringA DC.B 'Hello',0    ; First string with null terminator
StringB DC.B 'World',0    ; Second string with null terminator
StringC DS.B 20           ; Buffer for the concatenated string (large enough?)

START: 
      lea.l StringA,a0    ; a0 -> "Hello"
      lea.l StringB,a1    ; a1 -> "World"
      lea.l StringC,a2    ; a2 -> Buffer for concatenation
      clr.b d0            

      jsr CopyA           ; Call first subroutine 
      
      SIMHALT             

CopyA: 
      move.b (a0)+,d0     ; Load character from StringA into d0
                          ; move.b sets Z when the byte is the null terminator
      beq.s CopyB         ; If yes, start copying StringB
      move.b d0,(a2)+     ; Otherwise, copy character into StringC
      bra CopyA           
      
CopyB:
      move.b (a1)+,d0     ; Load character from StringB into d0
      move.b d0, (a2)+    ; Copy it into StringC
      bne CopyB           ; If the character is not null, continue copying
     
     rts                  ; Return from subroutine
    
     END START

Questions:


Solution

  • The move instruction on 68k is particularly powerful. The designers gave it the fewest opcode bits, reserving most of the 16-bit instruction word so that it can support two <ea> operands. On 68k, an <ea> (effective address) operand takes 6 bits, so two of them take 12, leaving only 4 bits for everything else. The size selection (byte/word/long) takes 2 of those, so a mere 2 bits remain for the move opcode itself.

    Since the designers squeezed two <ea> operands into this instruction, this means that move can move memory to memory while also setting the condition codes!
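
    As a worked illustration of that bit budget, here is a hand encoding of the byte-sized, post-increment form of move. This is a sketch that assumes an assembler accepting % binary constants (EASy68K-style syntax, as in the question); note that the destination field stores the register before the mode, the reverse of the source field.

    ```
    ; move.b (a0)+,(a2)+ assembled by hand:
    ;   bits 15-14  00    the entire move opcode
    ;   bits 13-12  01    size (01 = byte, 11 = word, 10 = long)
    ;   bits 11-9   010   destination register: a2
    ;   bits  8-6   011   destination mode: (An)+
    ;   bits  5-3   011   source mode: (An)+
    ;   bits  2-0   000   source register: a0
            DC.W    %0001010011011000   ; = $14D8, the instruction word for move.b (a0)+,(a2)+
    ```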

    Compare this to add and most other 68k instructions: these support only one <ea> operand, with a data (or address) register or an immediate as the other operand.

    Thus, string copying should be done using an instruction like move.b (a0)+,(a2)+¹.

    Since this instruction sets the condition codes, you can follow it directly with a branch on condition (e.g. backwards on non-zero/non-null).

    The algorithm has to change slightly (as illustrated in the suggestions by @SepRoland): with this form of move, the null byte from the first string is also copied into the target buffer, so you have to back a2 up by 1 byte before copying the second string. The second time you do want the null copied, since it terminates StringC, so move.b (a1)+,(a2)+ is perfect and needs no adjustment.
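
    Put together, the change described above might look like the following. It is a minimal sketch that keeps the register setup from the question (a0 -> StringA, a1 -> StringB, a2 -> StringC) and assumes the StringC buffer is large enough for both strings plus the terminator; the d0 scratch register is no longer needed.

    ```
    CopyA:
          move.b (a0)+,(a2)+  ; Copy one byte of StringA; Z is set if it was the null
          bne.s CopyA         ; Loop until the terminator has been copied
          subq.l #1,a2        ; Back up so StringB overwrites that null

    CopyB:
          move.b (a1)+,(a2)+  ; Copy one byte of StringB, terminator included
          bne.s CopyB         ; Loop until the terminator has been copied

          rts                 ; StringC now holds "HelloWorld",0
    ```

    The caller stays the same as in the question: load a0, a1 and a2, then jsr CopyA.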


    ¹There is also the dbcc instruction, which can be used to form a two-instruction loop that moves memory limited by both a count and the null terminator. Such loops run particularly fast on the 68010, which has a mode that lets a dbcc branching backwards by one instruction run without instruction fetches. (Later 68ks have a more general instruction cache, so they can run larger loops without instruction fetches.)
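
    A bounded copy along those lines could be sketched as follows. MAXLEN, CopyN and the use of d1 as the counter are hypothetical names chosen for illustration; a0 and a2 are assumed to point at the source string and the destination buffer, as in the question.

    ```
    MAXLEN EQU 20             ; Hypothetical limit: the size of the StringC buffer

    CopyN:
          move.w #MAXLEN-1,d1 ; dbcc stops when the counter reaches -1, so load count-1
    Loop:
          move.b (a0)+,(a2)+  ; Copy one byte; Z is set if it was the null
          dbeq d1,Loop        ; Fall through on null, otherwise count down and repeat
          rts                 ; Z set: terminator copied; Z clear: limit reached first
    ```

    Neither dbeq nor rts changes the condition codes, so the caller can test the result with a beq or bne right after the jsr to see whether the string fit within the limit.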