dos disassembly ida 16-bit

Why is there a "RETF 4" at the end of the disassembled function?


I'm disassembling the old 1989 Borland tool TDSTRIP.EXE, which can extract Turbo Debugger information from executables, and stumbled over this path-normalizing function.

This is the signature I'm using: extern "C" void far maybe_lib_sub_103FC(char far* dest, const char far* src);

I think it is not __cdecl, because the routine seems to clean up the stack on its own and the call sites do not adjust sp after the call.

I copied the disassembly, re-assembled it to a byte-identical binary, and am calling it from a test program built with Borland C++ 5.02 (DOS, small memory model).

The code only runs as expected when I replace the final retf 4 with a plain retf, and I don't understand why the operand is 4 bytes. To my understanding, two far pointers are pushed onto the stack, so wouldn't 8 be more correct? Or is the operand only meant for cleaning up local variables? But I can't see any local variables.

The call looks like this:

cmp     word ptr [si+2], 0
jz      short loc_13478
; ----
; calling starts
push    ds
push    word ptr [si+2] ; src
push    ss
lea     ax, [bp+s1]
push    ax
call    maybe_lib_sub_103FC
; after call no sp adjust
; ----
lea     ax, [bp+var_52]
push    ax
lea     ax, [bp+s1]
push    ax              ; s1
call    _strcmp

This is the routine:

seg000:03FC                         _maybe_lib_sub_103FC proc far            ; CODE XREF: MAYBE_MAIN_sub_133D6+61P
seg000:03FC                                                                 ; MAYBE_MAIN_sub_133D6+7BP
seg000:03FC
seg000:03FC                         arg_0           = dword ptr  6
seg000:03FC                         arg_4           = dword ptr  0Ah
seg000:03FC
seg000:03FC 55                                      push    bp
seg000:03FD 8B EC                                   mov     bp, sp
seg000:03FF 1E                                      push    ds
seg000:0400 56                                      push    si
seg000:0401 57                                      push    di
seg000:0402 FC                                      cld
seg000:0403 C4 7E 0A                                les     di, [bp+arg_4]
seg000:0406 32 C0                                   xor     al, al
seg000:0408 B9 FF FF                                mov     cx, 0FFFFh
seg000:040B F2 AE                                   repne scasb
seg000:040D F7 D1                                   not     cx
seg000:040F 49                                      dec     cx
seg000:0410 C5 76 0A                                lds     si, [bp+arg_4]
seg000:0413 03 CE                                   add     cx, si
seg000:0415 C4 7E 06                                les     di, [bp+arg_0]
seg000:0418 AD                                      lodsw
seg000:0419 3B F1                                   cmp     si, cx
seg000:041B 77 11                                   ja      short loc_1042E
seg000:041D 80 FC 3A                                cmp     ah, ':'
seg000:0420 75 0C                                   jnz     short loc_1042E
seg000:0422 3C 61                                   cmp     al, 61h ; 'a'
seg000:0424 72 12                                   jb      short loc_10438
seg000:0426 3C 7A                                   cmp     al, 7Ah ; 'z'
seg000:0428 77 0E                                   ja      short loc_10438
seg000:042A 2C 20                                   sub     al, 20h ; ' '
seg000:042C EB 0A                                   jmp     short loc_10438
seg000:042E                         ; ---------------------------------------------------------------------------
seg000:042E
seg000:042E                         loc_1042E:                              ; CODE XREF: maybe_lib_sub_103FC+1Fj
seg000:042E                                                                 ; maybe_lib_sub_103FC+24j
seg000:042E 4E                                      dec     si
seg000:042F 4E                                      dec     si
seg000:0430 B4 19                                   mov     ah, 19h
seg000:0432 CD 21                                   int     21h             ; DOS - GET DEFAULT DISK NUMBER
seg000:0434 04 41                                   add     al, 41h ; 'A'
seg000:0436 B4 3A                                   mov     ah, 3Ah ; ':'
seg000:0438
seg000:0438                         loc_10438:                              ; CODE XREF: maybe_lib_sub_103FC+28j
seg000:0438                                                                 ; maybe_lib_sub_103FC+2Cj ...
seg000:0438 AB                                      stosw
seg000:0439 3B F1                                   cmp     si, cx
seg000:043B 74 05                                   jz      short loc_10442
seg000:043D 80 3C 5C                                cmp     byte ptr [si], 5Ch ; '\'
seg000:0440 74 28                                   jz      short loc_1046A
seg000:0442
seg000:0442                         loc_10442:                              ; CODE XREF: maybe_lib_sub_103FC+3Fj
seg000:0442 2C 40                                   sub     al, 40h ; '@'
seg000:0444 8A D0                                   mov     dl, al
seg000:0446 B0 5C                                   mov     al, 5Ch ; '\'
seg000:0448 AA                                      stosb
seg000:0449 56                                      push    si
seg000:044A 1E                                      push    ds
seg000:044B B4 47                                   mov     ah, 47h ; 'G'
seg000:044D 8B F7                                   mov     si, di
seg000:044F 06                                      push    es
seg000:0450 1F                                      pop     ds
seg000:0451 CD 21                                   int     21h             ; DOS - 2+ - GET CURRENT DIRECTORY
seg000:0451                                                                 ; DL = drive (0=default, 1=A, etc.)
seg000:0451                                                                 ; DS:SI points to 64-byte buffer area
seg000:0453 1F                                      pop     ds
seg000:0454 5E                                      pop     si
seg000:0455 72 13                                   jb      short loc_1046A
seg000:0457 26 80 3D 00                             cmp     byte ptr es:[di], 0
seg000:045B 74 0D                                   jz      short loc_1046A
seg000:045D 51                                      push    cx
seg000:045E B9 FF FF                                mov     cx, 0FFFFh
seg000:0461 32 C0                                   xor     al, al
seg000:0463 F2 AE                                   repne scasb
seg000:0465 4F                                      dec     di
seg000:0466 B0 5C                                   mov     al, 5Ch ; '\'
seg000:0468 AA                                      stosb
seg000:0469 59                                      pop     cx
seg000:046A
seg000:046A                         loc_1046A:                              ; CODE XREF: maybe_lib_sub_103FC+44j
seg000:046A                                                                 ; maybe_lib_sub_103FC+59j ...
seg000:046A 2B CE                                   sub     cx, si
seg000:046C F3 A4                                   rep movsb
seg000:046E 32 C0                                   xor     al, al
seg000:0470 AA                                      stosb
seg000:0471 C5 76 06                                lds     si, [bp+arg_0]
seg000:0474 46                                      inc     si
seg000:0475 8B FE                                   mov     di, si
seg000:0477
seg000:0477                         loc_10477:                              ; CODE XREF: maybe_lib_sub_103FC+8Fj
seg000:0477 AC                                      lodsb
seg000:0478 0A C0                                   or      al, al
seg000:047A 74 11                                   jz      short loc_1048D
seg000:047C 3C 5C                                   cmp     al, 5Ch ; '\'
seg000:047E 74 0D                                   jz      short loc_1048D
seg000:0480 3C 61                                   cmp     al, 61h ; 'a'
seg000:0482 72 06                                   jb      short loc_1048A
seg000:0484 3C 7A                                   cmp     al, 7Ah ; 'z'
seg000:0486 77 02                                   ja      short loc_1048A
seg000:0488 2C 20                                   sub     al, 20h ; ' '
seg000:048A
seg000:048A                         loc_1048A:                              ; CODE XREF: maybe_lib_sub_103FC+86j
seg000:048A                                                                 ; maybe_lib_sub_103FC+8Aj ...
seg000:048A AA                                      stosb
seg000:048B EB EA                                   jmp     short loc_10477
seg000:048D                         ; ---------------------------------------------------------------------------
seg000:048D
seg000:048D                         loc_1048D:                              ; CODE XREF: maybe_lib_sub_103FC+7Ej
seg000:048D                                                                 ; maybe_lib_sub_103FC+82j
seg000:048D 81 7D FE 5C 2E                          cmp     word ptr [di-2], 2E5Ch
seg000:0492 75 04                                   jnz     short loc_10498
seg000:0494 4F                                      dec     di
seg000:0495 4F                                      dec     di
seg000:0496 EB 1C                                   jmp     short loc_104B4
seg000:0498                         ; ---------------------------------------------------------------------------
seg000:0498
seg000:0498                         loc_10498:                              ; CODE XREF: maybe_lib_sub_103FC+96j
seg000:0498 81 7D FE 2E 2E                          cmp     word ptr [di-2], 2E2Eh
seg000:049D 75 15                                   jnz     short loc_104B4
seg000:049F 80 7D FD 5C                             cmp     byte ptr [di-3], 5Ch ; '\'
seg000:04A3 75 0F                                   jnz     short loc_104B4
seg000:04A5 83 EF 03                                sub     di, 3
seg000:04A8 80 7D FF 3A                             cmp     byte ptr [di-1], 3Ah ; ':'
seg000:04AC 74 06                                   jz      short loc_104B4
seg000:04AE
seg000:04AE                         loc_104AE:                              ; CODE XREF: maybe_lib_sub_103FC+B6j
seg000:04AE 4F                                      dec     di
seg000:04AF 80 3D 5C                                cmp     byte ptr [di], 5Ch ; '\'
seg000:04B2 75 FA                                   jnz     short loc_104AE
seg000:04B4
seg000:04B4                         loc_104B4:                              ; CODE XREF: maybe_lib_sub_103FC+9Aj
seg000:04B4                                                                 ; maybe_lib_sub_103FC+A1j ...
seg000:04B4 0A C0                                   or      al, al
seg000:04B6 75 D2                                   jnz     short loc_1048A
seg000:04B8 80 7D FF 3A                             cmp     byte ptr [di-1], 3Ah ; ':'
seg000:04BC 75 03                                   jnz     short loc_104C1
seg000:04BE B0 5C                                   mov     al, 5Ch ; '\'
seg000:04C0 AA                                      stosb
seg000:04C1
seg000:04C1                         loc_104C1:                              ; CODE XREF: maybe_lib_sub_103FC+C0j
seg000:04C1 32 C0                                   xor     al, al
seg000:04C3 AA                                      stosb
seg000:04C4 5F                                      pop     di
seg000:04C5 5E                                      pop     si
seg000:04C6 1F                                      pop     ds
seg000:04C7 5D                                      pop     bp
seg000:04C8 CA 04 00                                retf    4
seg000:04C8                         _maybe_lib_sub_103FC endp

The function seems to have been written in C originally, judging by the bp/sp usage. But could it be that it was written in assembler and someone got the retf wrong?


Solution

  • This function was most probably hand-written in assembler; I doubt contemporary compilers could have generated such code. The retf 4 does look like a human error: it should have been retf 8 so that the callee cleans up both far-pointer arguments (2 × 4 bytes), as the Pascal calling convention requires.

    The caller is left with an unbalanced stack after calling this function. It went unnoticed because the caller barely depends on the value of sp during execution, and it restores the stack pointer from bp in its epilogue:

    seg001:0D5E                 pop     di
    seg001:0D5F                 pop     si
    seg001:0D60                 mov     sp, bp
    seg001:0D62                 pop     bp
    seg001:0D63                 retf
    

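    So the only adverse effect of the stack imbalance is that di/si are destroyed, even though they are callee-saved registers: the two pops execute before sp is restored from bp, so they read leftover stack contents instead of the saved values.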
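
    To make the byte counts concrete, here is a minimal sketch of the stack-pointer arithmetic around the call. The name sp_drift and the constants are made up for this illustration; it models only SP bookkeeping for a 16-bit far call, assuming every push is one 2-byte word, and is not a reconstruction of the routine.

    ```cpp
    // Hypothetical helper: models SP bookkeeping only, nothing else.
    constexpr int kWordBytes      = 2;  // each 16-bit push moves SP down by 2
    constexpr int kFarReturnBytes = 4;  // a far call pushes CS and IP

    // Net change of the caller's SP after "call far" + "retf imm16",
    // relative to SP before the first argument push.
    // 0 means balanced; a negative value means bytes were left on the stack.
    constexpr int sp_drift(int arg_words_pushed, int retf_imm) {
        return -(arg_words_pushed * kWordBytes)  // caller pushes the arguments
               - kFarReturnBytes                 // call pushes the return address
               + kFarReturnBytes                 // retf pops the return address
               + retf_imm;                       // retf imm16 discards extra bytes
    }
    ```

    With the four words pushed at the call site, sp_drift(4, 4) is -4 (two words left behind, as in the binary), sp_drift(4, 8) is 0 (the cleanup a Pascal-style callee would be expected to do), and sp_drift(4, 0) is -8 (plain retf, where a cdecl caller would follow up with add sp, 8).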

    The proper Borland C prototype for the function is (assuming the retf 8 fix):

    extern "C" void far pascal _maybe_lib_sub_103FC(const char far *source, char far *destination);
    

    Note that the arguments are in reverse order compared with the cdecl prototype, because the Pascal calling convention pushes arguments left to right.
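
    The argument order can be checked against the arg_0/arg_4 offsets in the listing. The sketch below is only an illustration under stated assumptions: a far procedure with the usual push bp / mov bp, sp prologue, and every parameter being a 4-byte far pointer. The names cdecl_offset and pascal_offset are hypothetical.

    ```cpp
    // Hypothetical helpers: compute [bp+N] offsets for far-pointer parameters.
    constexpr int kFirstArgOffset = 6;  // saved BP (2) + far return address (4)
    constexpr int kFarPtrBytes    = 4;  // segment:offset pair

    // cdecl pushes right to left, so the first declared parameter is pushed
    // last and ends up closest to BP.
    constexpr int cdecl_offset(int index) {
        return kFirstArgOffset + index * kFarPtrBytes;
    }

    // pascal pushes left to right, so the last declared parameter is pushed
    // last and ends up closest to BP.
    constexpr int pascal_offset(int index, int count) {
        return kFirstArgOffset + (count - 1 - index) * kFarPtrBytes;
    }
    ```

    For the pascal prototype (source, destination), pascal_offset(0, 2) gives 0Ah, which is arg_4 (the string the routine scans and reads), and pascal_offset(1, 2) gives 6, which is arg_0 (the buffer it writes). The cdecl prototype (dest, src) maps to the same offsets through cdecl_offset(0) and cdecl_offset(1), which is why both declarations describe the same stack frame.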