I have, just for fun, started learning assembler for the x86 architecture; it's something I always wanted to know more about. I am using Microsoft Macro Assembler v6.11 running under IBM PC DOS 2000 (in a VM). Everything runs great: the OS, assembler, linker, etc. But when I try to run this program:
.model small
.stack 100h
.data
hello db 10,13,"Hello World$"
.code
main proc
lea dx, hello
mov ah, 9h
int 21h
main endp
end main
I get a bunch of weird characters and Hello, world at the end, but the process crashes. Since it's running under DOS, I have no choice but to Ctrl-Alt-Delete and restart. I would like to know what I am doing wrong. My guess would be that I am out of bounds somehow, but I clearly don't know enough about this yet.
I do the following to assemble and link the program:
ml hello.asm
This produces no errors or warnings. I have also tried to assemble and link the program manually:
masm hello.asm
link hello.obj
This also gives no errors or warnings, but the program still doesn't work.
The program has been tested under IBM PC DOS 2000 and Microsoft Windows 98 SE. The results are the same, except that under Windows only the process running the program crashes.
Here is the listing produced, if it's of any use:
Microsoft (R) Macro Assembler Version 6.11 12/20/23 15:42:42
hello.asm Page 1 - 1
.model small
.stack 100h
0000 .data
0000 0A 0D 48 65 6C 6C hello db 10,13,"Hello World$"
6F 20 57 6F 72 6C
64 24
0000 .code
0000 main proc
0000 8D 16 0000 R lea dx, hello
0004 B4 09 mov ah, 9h
0006 CD 21 int 21h
0008 main endp
end main
Microsoft (R) Macro Assembler Version 6.11 12/20/23 15:42:42
hello.asm Symbols 2 - 1
Segments and Groups:
N a m e Size Length Align Combine Class
DGROUP . . . . . . . . . . . . . GROUP
_DATA . . . . . . . . . . . . . 16 Bit 000E Word Public 'DATA'
STACK . . . . . . . . . . . . . 16 Bit 0100 Para Stack 'STACK'
_TEXT . . . . . . . . . . . . . 16 Bit 0008 Word Public 'CODE'
Procedures, parameters and locals:
N a m e Type Value Attr
main . . . . . . . . . . . . . . P Near 0000 _TEXT Length= 0008 Public
Symbols:
N a m e Type Value Attr
@CodeSize . . . . . . . . . . . Number 0000h
@DataSize . . . . . . . . . . . Number 0000h
@Interface . . . . . . . . . . . Number 0000h
@Model . . . . . . . . . . . . . Number 0002h
@code . . . . . . . . . . . . . Text _TEXT
@data . . . . . . . . . . . . . Text DGROUP
@fardata? . . . . . . . . . . . Text FAR_BSS
@fardata . . . . . . . . . . . . Text FAR_DATA
@stack . . . . . . . . . . . . . Text DGROUP
hello . . . . . . . . . . . . . Byte 0000 _DATA
0 Warnings
0 Errors
Update
I found this:
TITLE Hello World
.model small
.stack 100h
.data
message BYTE "Hello World",0dh,0ah,0
.code
main PROC
mov ax,@data
mov ds,ax
mov ah,40h
mov bx,1
mov cx,SIZEOF message
mov dx,OFFSET message
int 21h
.exit
main ENDP
END main
This program works fine, with no garbled output. As per the comments, I was under the impression that the "$" was for termination of the string, but the program above seems to append 0dh,0ah,0 to the bytes defined for "message". I have tried to adapt this in my original hello world program, but the output is the same, maybe with a little more garbled output. I will try to compare the two programs.
I get a bunch of weird characters and Hello, world at the end
The DOS.PrintString function 09h expects a far pointer in the DS:DX register pair. When your .EXE executable starts, the DS segment register points at the PSP (Program Segment Prefix), but in this case you need to make it point at the .data section where your hello message resides:
mov ax, @data
mov ds, ax
The 'bunch of weird characters' are in fact the textual representation of the PSP and the .code section, finally followed by the (legible) text from the .data section.
but the process crashes
Every program needs an exit to its caller (the parent process is most often the OS), and you did not provide one! In the second program (the one you found), this is handled by the .exit directive.
The preferred way to terminate a DOS program is via function 4Ch where you can supply an exitcode in the AL register. Use 0 for a normal termination.
mov ax, 4C00h ; DOS.TerminateWithExitcode
int 21h
Afterwards, the parent process can inspect this exitcode using function 4Dh.
mov ah, 4Dh ; DOS.GetExitcode
int 21h ; -> AH exitcode system, AL exitcode child
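Putting both fixes together, the original program might look like the sketch below. It keeps the question's source as is and only adds the DS setup and the function 4Ch exit described above; it assumes the same MASM 6.11 small-model build (ml hello.asm) used in the question.

```asm
.model small
.stack 100h
.data
hello db 10,13,"Hello World$"
.code
main proc
    mov ax, @data       ; on entry DS points at the PSP, so load DGROUP instead
    mov ds, ax
    lea dx, hello       ; DS:DX -> '$'-terminated string
    mov ah, 9h          ; DOS.PrintString
    int 21h
    mov ax, 4C00h       ; DOS.TerminateWithExitcode, AL = 0
    int 21h
main endp
end main
```

Without the last two instructions, execution would run past the end of main into whatever bytes follow in memory, which is what the crash in the question amounts to.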
i was under the impression that the "$" was for termination of the string, but the program above seems to append 0dh,0ah,0 to the bytes defined for "message" i have tried to adapt this in my original hello world program, but the output is the same
The $-termination is exclusively used with the DOS.PrintString function 09h. That other program does not use function 09h; it uses the DOS.WriteFileOrDevice function 40h with the predefined handle 1 for STDOUT. Because that function works from a byte count, zero-terminating message is of no importance. However, since SIZEOF message includes the trailing 0, that byte is written too. It will display as a space character, and that could sometimes mess things up a little.
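As a minimal sketch of that point, the found program can drop the trailing 0 so that only the visible text and the newline are written. It is the same program with the 0 removed and with an explicit function 4Ch exit in place of .exit; nothing else is assumed beyond the MASM 6.11 setup from the question.

```asm
.model small
.stack 100h
.data
message BYTE "Hello World",0dh,0ah  ; no trailing 0: function 40h is count-based
.code
main PROC
    mov ax, @data
    mov ds, ax              ; DS:DX must address the buffer
    mov ah, 40h             ; DOS.WriteFileOrDevice
    mov bx, 1               ; handle 1 = STDOUT
    mov cx, SIZEOF message  ; byte count: 11 characters + CR + LF = 13
    mov dx, OFFSET message
    int 21h
    mov ax, 4C00h           ; DOS.TerminateWithExitcode, AL = 0
    int 21h
main ENDP
END main
```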
13,10 or not to 13,10
Newlining in DOS needs both the carriage return (13) and the linefeed (10). In the real DOS environment the order doesn't matter a bit, and nearly everybody uses (13, 10), but some emulators might not like one or the other. I believe it was emu8086 that isn't particularly fond of (13, 10).