I'm trying to get a better understanding of how assembly and machine code work. So I'm compiling this simple snippet with gcc:
#include <stdio.h>
int main(){
printf("Hello World!");
return 0;
}
But this includes the default library. I would like to output hello world without using printf, by inlining some assembly in the C file and adding the -nostdlib and -nodefaultlibs options to gcc. How can I do that? I'm using Windows 10 and mingw-w64 with an Intel Core i7-6700HQ (laptop processor). Can I use NASM with gcc on Windows?
I recommend against using GCC's inline assembly. It is hard to get right. You ask: Can I use NASM with GCC on Windows? The answer is YES, please do! You can assemble your 64-bit NASM code to a Win64 object and then link it with your C program.
You have to have knowledge of the Win64 API. Unlike on Linux, you aren't supposed to make system calls directly. Instead you call the Windows API, which is a thin wrapper around the system call interface.
To write to the console using the Console API, you need to call a function like GetStdHandle to get a handle to STDOUT and then call a function like WriteConsoleA to write an ANSI string to the console.
When writing assembly code you have to know the calling convention. The Win64 calling convention is documented by Microsoft. It is also described in this Wiki article. A summary from the Microsoft documentation:
Calling convention defaults
The x64 Application Binary Interface (ABI) uses a four-register fast-call calling convention by default. Space is allocated on the call stack as a shadow store for callees to save those registers. There's a strict one-to-one correspondence between the arguments to a function call and the registers used for those arguments. Any argument that doesn’t fit in 8 bytes, or isn't 1, 2, 4, or 8 bytes, must be passed by reference. A single argument is never spread across multiple registers. The x87 register stack is unused, and may be used by the callee, but must be considered volatile across function calls. All floating point operations are done using the 16 XMM registers. Integer arguments are passed in registers RCX, RDX, R8, and R9. Floating point arguments are passed in XMM0L, XMM1L, XMM2L, and XMM3L. 16-byte arguments are passed by reference. Parameter passing is described in detail in Parameter Passing. In addition to these registers, RAX, R10, R11, XMM4, and XMM5 are considered volatile. All other registers are non-volatile.
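My note: the shadow store is 32 bytes that the caller has to reserve on the stack before calling any C or Win64 API function. It sits at the lowest addresses of the caller's outgoing area, immediately below any parameters passed on the stack, so a fifth parameter is at [rsp+32] at the point of the call.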
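To make the parameter rules above concrete, here is a small C sketch that reports where the caller places the Nth integer or pointer argument. The helper names win64_int_arg_register and win64_int_arg_stack_offset are made up for illustration and are not part of any API. The sketch assumes every argument is an integer or pointer that fits in 8 bytes, and offsets are relative to RSP as seen by the caller just before the CALL instruction.

```c
#include <assert.h>
#include <stddef.h>

#define SHADOW_AREA_SIZE 32 /* bytes of shadow store reserved by the caller */

/* Hypothetical helper (not a Windows API): name of the register that carries
   integer/pointer argument number `index` (0-based), or NULL when that
   argument is passed on the stack instead. */
static const char *win64_int_arg_register(unsigned index)
{
    static const char *const regs[4] = { "RCX", "RDX", "R8", "R9" };
    return index < 4 ? regs[index] : NULL;
}

/* Hypothetical helper: offset from RSP (caller's view, just before CALL) of
   stack argument number `index` (0-based), or -1 when the argument is passed
   in a register. Stack arguments start right above the shadow store. */
static int win64_int_arg_stack_offset(unsigned index)
{
    if (index < 4)
        return -1;
    return SHADOW_AREA_SIZE + 8 * (int)(index - 4);
}

/* Usage: WriteConsoleA takes five parameters, so only the fifth one
   (lpReserved, index 4) ends up on the stack. */
static void check_write_console_layout(void)
{
    assert(win64_int_arg_register(3) != NULL);                  /* 4th parameter: R9 */
    assert(win64_int_arg_register(4) == NULL);                  /* 5th parameter: stack */
    assert(win64_int_arg_stack_offset(4) == SHADOW_AREA_SIZE);  /* [RSP+32] */
}
```

Under these assumptions the fifth parameter lands at [RSP+32] and a sixth would land at [RSP+40].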
This is a NASM program that calls a WriteString function, which takes the string to print as its first parameter and the length of the string as its second. WinMain is the default entry point for Windows console programs:
global WinMain ; Make the default console entry point globally visible
global WriteString ; Make function WriteString globally visible
default rel ; Default to RIP relative addressing rather
; than absolute
; External Win API functions available in kernel32
extern WriteConsoleA
extern GetStdHandle
extern ExitProcess
SHADOW_AREA_SIZE EQU 32
STD_OUTPUT_HANDLE EQU -11
; Read Only Data section
section .rdata use64
strBrownFox db "The quick brown fox jumps over the lazy dog!"
strBrownFox_len equ $-strBrownFox
; Data section (read/write)
section .data use64
; BSS section (read/write) zero-initialized
section .bss use64
numCharsWritten: resd 1 ; reserve space for one 4-byte dword
; Code section
section .text use64
; Default Windows entry point in 64-bit code
WinMain:
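sub rsp, 8+SHADOW_AREA_SIZE ; Align stack on 16-byte boundary and reserve the
; 32-byte shadow area for the functions we call.
; 8 bytes were pushed by the CALL that reached us.
; 8+8+32=48, and 48 is evenly divisible by 16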
lea rcx, [strBrownFox] ; Parameter 1 = address of string to print
mov edx, strBrownFox_len ; Parameter 2 = length of string to print
call WriteString
xor ecx, ecx ; Exit and return 0
call ExitProcess
WriteString:
push rbp
mov rbp, rsp ; Creating a stack frame is optional
push rdi ; Non volatile register we clobber that has to be saved
push rsi ; Non volatile register we clobber that has to be saved
sub rsp, 16+SHADOW_AREA_SIZE
; Stack accounting: the CALL that reached WriteString
; pushed an 8-byte return address. We then pushed RBP,
; RDI and RSI (8+8+8) and reserved 16 bytes for stack
; parameters (the most needed by any WinAPI call we
; make, rounded up to keep alignment) plus the 32-byte
; Shadow Area: 8+8+8+16+32=72.
; 72+8=80, and 80 is evenly divisible by 16, so the
; stack is aligned on a 16 byte boundary again after
; the SUB instruction, as required before we CALL
; another function
mov rdi, rcx ; Store string address to RDI (Parameter 1 = RCX)
mov esi, edx ; Store string length to RSI (Parameter 2 = RDX)
; HANDLE WINAPI GetStdHandle(
; _In_ DWORD nStdHandle
; );
mov ecx, STD_OUTPUT_HANDLE
call GetStdHandle
; BOOL WINAPI WriteConsole(
; _In_ HANDLE hConsoleOutput,
; _In_ const VOID *lpBuffer,
; _In_ DWORD nNumberOfCharsToWrite,
; _Out_ LPDWORD lpNumberOfCharsWritten,
; _Reserved_ LPVOID lpReserved
; );
mov rcx, rax ; RCX = File Handle for STDOUT.
; GetStdHandle returned the handle in RAX
mov rdx, rdi ; RDX = address of string to display
mov r8d, esi ; R8D = length of string to display
lea r9, [numCharsWritten]
mov qword [rsp+SHADOW_AREA_SIZE+0], 0
; 5th parameter passed on the stack above
; the 32 byte shadow space. Reserved needs to be 0
call WriteConsoleA
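add rsp, 16+SHADOW_AREA_SIZE
; Release the space reserved by the SUB above so
; that RSP points at the saved RSI again
pop rsi ; Restore the non volatile registers we clobbered
pop rdi
pop rbp
ret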
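The stack accounting described in the comments of WriteString can be checked with a few lines of C. This is only a sketch: win64_frame_reserve is a hypothetical helper, not part of any toolchain, and it assumes the function is entered through a CALL from a caller whose stack was 16-byte aligned and that it saves registers with plain 8-byte pushes.

```c
#include <assert.h>

#define SHADOW_AREA_SIZE    32 /* same value as the EQU in the listing */
#define RETURN_ADDRESS_SIZE  8 /* pushed by the CALL that reached the function */

/* Hypothetical helper: number of bytes a prologue should subtract from RSP
   after pushing `pushed_regs` registers, so that there is room for the shadow
   area plus `stack_arg_bytes` of outgoing stack parameters and RSP is 16-byte
   aligned before the next CALL. */
static unsigned win64_frame_reserve(unsigned pushed_regs, unsigned stack_arg_bytes)
{
    unsigned reserve;
    unsigned total;

    assert(stack_arg_bytes % 8 == 0); /* stack parameters occupy 8-byte slots */

    reserve = SHADOW_AREA_SIZE + stack_arg_bytes;
    /* Distance from the caller's aligned RSP down to our RSP */
    total = RETURN_ADDRESS_SIZE + 8u * pushed_regs + reserve;
    /* Pad the reservation until that distance is a multiple of 16 */
    reserve += (16u - total % 16u) % 16u;
    return reserve;
}
```

WriteString pushes three registers (RBP, RDI, RSI) and needs one 8-byte stack parameter for WriteConsoleA, so win64_frame_reserve(3, 8) gives 48, the same amount as the 16+SHADOW_AREA_SIZE used in its SUB instruction.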
You can assemble and link with these commands:
nasm -f win64 myprog.asm -o myprog.obj
gcc -nostartfiles -nostdlib -nodefaultlibs myprog.obj -lkernel32 -lgcc -o myprog.exe
When you run myprog.exe it should display:
The quick brown fox jumps over the lazy dog!
You can also compile C files into object files and link them to this code and call them from assembly as well. In this example GCC is simply being used as a linker.
This example is similar to the first one, except we create a C file called cfuncs.c that calls our assembly language WriteString function to print Hello, world!:
cfuncs.c
#include <stddef.h> /* Needed for size_t */
/* WriteString is the assembly language function that writes to the console */
extern void WriteString (const char *str, int len);
/* Implement strlen */
size_t strlen(const char *str)
{
const char *s = str;
for (; *s; ++s)
;
return (s-str);
}
void PrintHelloWorld(void)
{
const char *strHelloWorld = "Hello, world!\n";
WriteString (strHelloWorld, (int)strlen(strHelloWorld));
return;
}
myprog.asm
default rel ; Default to RIP relative addressing rather
; than absolute
global WinMain ; Make the default console entry point globally visible
global WriteString ; Make function WriteString globally visible
; Our own external C functions from our .c file
extern PrintHelloWorld
; External Win API functions in kernel32
extern WriteConsoleA
extern GetStdHandle
extern ExitProcess
SHADOW_AREA_SIZE EQU 32
STD_OUTPUT_HANDLE EQU -11
; Read Only Data section
section .rdata use64
strBrownFox db "The quick brown fox jumps over the lazy dog!", 13, 10
strBrownFox_len equ $-strBrownFox
; Data section (read/write)
section .data use64
; BSS section (read/write) zero-initialized
section .bss use64
numCharsWritten: resd 1 ; reserve space for one 4-byte dword
; Code section
section .text use64
; Default Windows entry point in 64-bit code
WinMain:
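sub rsp, 8+SHADOW_AREA_SIZE ; Align stack on 16-byte boundary and reserve the
; 32-byte shadow area for the functions we call.
; 8 bytes were pushed by the CALL that reached us.
; 8+8+32=48, and 48 is evenly divisible by 16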
lea rcx, [strBrownFox] ; Parameter 1 = address of string to print
mov edx, strBrownFox_len ; Parameter 2 = length of string to print
call WriteString
call PrintHelloWorld ; Call C function that prints Hello, world!
xor ecx, ecx ; Exit and return 0
call ExitProcess
WriteString:
push rbp
mov rbp, rsp ; Creating a stack frame is optional
push rdi ; Non volatile register we clobber that has to be saved
push rsi ; Non volatile register we clobber that has to be saved
sub rsp, 16+SHADOW_AREA_SIZE
; Stack accounting: the CALL that reached WriteString
; pushed an 8-byte return address. We then pushed RBP,
; RDI and RSI (8+8+8) and reserved 16 bytes for stack
; parameters (the most needed by any WinAPI call we
; make, rounded up to keep alignment) plus the 32-byte
; Shadow Area: 8+8+8+16+32=72.
; 72+8=80, and 80 is evenly divisible by 16, so the
; stack is aligned on a 16 byte boundary again after
; the SUB instruction, as required before we CALL
; another function
mov rdi, rcx ; Store string address to RDI (Parameter 1 = RCX)
mov esi, edx ; Store string length to RSI (Parameter 2 = RDX)
; HANDLE WINAPI GetStdHandle(
; _In_ DWORD nStdHandle
; );
mov ecx, STD_OUTPUT_HANDLE
call GetStdHandle
; BOOL WINAPI WriteConsole(
; _In_ HANDLE hConsoleOutput,
; _In_ const VOID *lpBuffer,
; _In_ DWORD nNumberOfCharsToWrite,
; _Out_ LPDWORD lpNumberOfCharsWritten,
; _Reserved_ LPVOID lpReserved
; );
mov rcx, rax ; RCX = File Handle for STDOUT.
; GetStdHandle returned the handle in RAX
mov rdx, rdi ; RDX = address of string to display
mov r8d, esi ; R8D = length of string to display
lea r9, [numCharsWritten]
mov qword [rsp+SHADOW_AREA_SIZE+0], 0
; 5th parameter passed on the stack above
; the 32 byte shadow space. Reserved needs to be 0
call WriteConsoleA
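add rsp, 16+SHADOW_AREA_SIZE
; Release the space reserved by the SUB above so
; that RSP points at the saved RSI again
pop rsi ; Restore the non volatile registers we clobbered
pop rdi
pop rbp
ret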
To assemble, compile, and link to an executable you can use these commands:
nasm -f win64 myprog.asm -o myprog.obj
gcc -c cfuncs.c -o cfuncs.obj
gcc -nodefaultlibs -nostdlib -nostartfiles myprog.obj cfuncs.obj -lkernel32 -lgcc -o myprog.exe
The output of myprog.exe should be:
The quick brown fox jumps over the lazy dog!
Hello, world!