I'm currently trying to learn MASM x64, and so far I seem to be getting the hang of things pretty well. Everything was going well right up until I tried to call CreateFileW to read the contents of a .txt file. The problematic code is as follows:
; Open the file for GENERIC_READ
CALL ClearRegisters
LEA RCX, TextTestfilePath
MOV RDX, 80000000h ; GENERIC_READ
MOV R8, 00000001h ; FILE_SHARE_READ
MOV R9, 0h ; NULL
SUB RSP, 40h
PUSH 0h ; NULL
PUSH 80h ; FILE_ATTRIBUTE_NORMAL
PUSH 3 ; OPEN_EXISTING
CALL CreateFileW
ADD RSP, 40h
CMP EAX, -1
JNE p_skip_invalid_create_file
CALL InternalError
p_skip_invalid_create_file::
This is a subset of the full program which can be found here.
When I run the program, I type my file name ("test.txt") into it (test.txt is located within the source files, which can also be found on the GitHub). TextTestfilePath holds the ReadConsoleW output with the trailing CRLF truncated. In memory, it reads as 0074 0065 0073 0074 002e 0074 0078 0074 0000
or ".t.e.s.t...t.x.t..", which to my understanding is a valid null-terminated UTF-16 string.
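As a quick cross-check of that dump, here is a minimal Python sketch (not part of the original program) that encodes the same file name as null-terminated UTF-16LE and prints it as 16-bit code units. The variable names are made up for illustration.

```python
# Encode the file name as UTF-16LE with a terminating NUL, the form CreateFileW expects.
name = "test.txt"
raw = (name + "\0").encode("utf-16-le")

# Group the bytes into 16-bit code units, the way a WORD-sized memory view shows them.
units = [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]
dump = " ".join(f"{u:04x}" for u in units)
print(dump)
```

If the printed units match what the debugger shows for TextTestfilePath, the string itself is unlikely to be the cause of ERROR_INVALID_PARAMETER.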
When executing the code, CreateFileW returns -1 (INVALID_HANDLE_VALUE), and the following call to GetLastError returns 0x57 (ERROR_INVALID_PARAMETER). I have tried calling SetLastError to set it to zero before the call and receive the same response.
After quite a bit of conversation with GPT-4, I still can't seem to find the source of the issue. I have verified the following:
I am still learning MASM x64 with the limited information there is about it out there, but I have a general understanding of how it all works, and I've read a few books on it and used a portion of the Win32 Console API up to this point.
But every time I get this parameter error, I am at a complete loss. It's so vague that I don't know where to check, and the things I do check never seem to be the issue. So if anyone has any idea of more things I'd need to check (or heck, if you see the issue) (or heck heck, if you have any tips for me that I have yet to figure out), please let me know! :)
Before commenting that I am doing something wrong, please help not only me but the community find well-documented sources to learn MASM x64 that can explain that concept well! Just saying, "You're doing something wrong, and you need to fix it," neither helps resolve this issue nor contributes to a discussion that encourages learning and education, which would be expected from a site like StackOverflow. Links to third-party sources, in addition to the obvious Microsoft docs, are incredibly helpful for a big-picture overview of what is expected, instead of assuming certain things are known when they may not be.
I've figured it out. I was indeed pushing onto the stack wrong. But I had a fundamental misunderstanding of how the stack worked which the Microsoft docs did a horrible job of explaining.
As @RbMm pointed out in the comments, the arguments are expected to be on RSP+20h, RSP+28h, and RSP+30h respectively. In addition, there needs to be the shadow space on the stack for the function call. I was making a series of mistakes which caused this not to work.
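To make those offsets concrete, here is a small Python sketch of where each integer or pointer argument is expected, as seen by the caller just before the CALL. The helper name arg_location is hypothetical and only models the rule described above.

```python
def arg_location(n):
    """Location of the n-th (1-based) integer or pointer argument before CALL."""
    registers = ["RCX", "RDX", "R8", "R9"]
    if n <= 4:
        return registers[n - 1]
    # The 20h bytes of shadow space occupy [RSP]..[RSP+1Fh];
    # every later argument takes one 8-byte slot above them.
    return f"[RSP+{0x20 + 8 * (n - 5):X}h]"

for n in range(1, 8):
    print(n, arg_location(n))
```

For CreateFileW this puts the creation disposition (3), the flags and attributes (80h), and the template handle (NULL) at RSP+20h, RSP+28h, and RSP+30h.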
Let's explain the way I did the code previously:
LEA RCX, TextTestfilePath
MOV RDX, 80000000h ; GENERIC_READ
MOV R8, 00000001h ; FILE_SHARE_READ
MOV R9, 0h ; NULL
SUB RSP, 20h
PUSH 00h
PUSH 80h
PUSH 3
CALL CreateFileW
ADD RSP, 20h
I was moving the stack pointer to reserve the shadow space. This is correct, and 20h is the correct value because the shadow space is 32 bytes, which is 20h in hexadecimal. Because 20h is a multiple of 16, it also leaves the stack pointer's 16-byte alignment unchanged.
I was pushing the arguments onto the stack. The problem is, I was doing this incorrectly (or backwards). RSP, the stack pointer, references the top of the stack, and the stack grows toward lower addresses. Each PUSH subtracted 8 from RSP and stored the value at the new top, below the shadow space I had just reserved, so the values ended up at RSP, RSP+8h, and RSP+10h, where the callee expects the shadow space, rather than at RSP+20h and up. To top this off, three PUSHes move the stack pointer by 18h, so it is no longer 16-byte aligned. The stack pointer should be moved once, by a multiple of 16 such as 20h or 40h, and not modified again by a PUSH instruction before the call.
After the pushes, with the values in the wrong position and the stack pointer in the wrong spot, the call would fail entirely.
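The effect of that sequence can be traced with a toy model. The following Python sketch is an illustration only: the Stack class is a hypothetical helper, and the starting value 1000h is an arbitrary 16-byte-aligned address, not anything taken from the real program.

```python
class Stack:
    """Toy model of RSP plus the 8-byte slots written so far."""

    def __init__(self, rsp):
        self.rsp = rsp
        self.mem = {}

    def sub(self, n):
        # SUB RSP, n
        self.rsp -= n

    def push(self, value):
        # PUSH value: move RSP down by 8, then store at the new top.
        self.rsp -= 8
        self.mem[self.rsp] = value

    def slot(self, offset):
        # Contents of [RSP+offset], or None if nothing was written there.
        return self.mem.get(self.rsp + offset)


s = Stack(0x1000)
s.sub(0x20)      # SUB RSP, 20h
s.push(0x00)     # PUSH 00h
s.push(0x80)     # PUSH 80h
s.push(3)        # PUSH 3

where_callee_looks = [s.slot(0x20), s.slot(0x28), s.slot(0x30)]
where_values_are = [s.slot(0x00), s.slot(0x08), s.slot(0x10)]
misalignment = s.rsp % 16
print(where_callee_looks, where_values_are, misalignment)
```

In this model the three values sit at RSP, RSP+8h, and RSP+10h, nothing was written at RSP+20h and up, and RSP ends 8 bytes off a 16-byte boundary.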
So, I attempted to correct for these mistakes by doing the following. However, I made a fatal mistake again in this process:
LEA RCX, TextTestfilePath
MOV RDX, 80000000h ; GENERIC_READ
MOV R8, 00000001h ; FILE_SHARE_READ
MOV R9, 0h ; NULL
MOV [RSP + 20h], 3
MOV [RSP + 28h], 80h
MOV [RSP + 36h], 00h
SUB RSP, 20h
CALL CreateFileW
ADD RSP, 20h
There are three mistakes here, and they should be more obvious.
I was writing the values relative to the stack pointer and only then moving it. The stores landed at [RSP+20h] and up, and SUB RSP, 20h then moved the stack pointer another 20h away from them, so the arguments I had just written ended up at RSP+40h and up. The callee looks for them at RSP+20h, as seen just before the CALL, and found unrelated data there.
In 20h, 28h, and 36h, I was doing the math wrong. I was adding 8 in decimal (20+8=28, 28+8=36), but the offsets are hexadecimal, so I should have been adding 8 in hexadecimal (20h+8h=28h, and 28h+8h=30h, not 36h).
The assembler cannot tell from MOV [RSP+28h], 80h how many bytes to store, because neither operand has a size. I needed to specify the size of the value I was moving by adding QWORD PTR before the memory operand. (Notably, I am on x64, where each stack slot is 8 bytes, so I used QWORD instead of the DWORD that almost all of the MASM examples out there use.)
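The offset arithmetic is easy to check mechanically. This short Python sketch, added purely as an illustration, computes the three stack-argument offsets in hexadecimal and shows why 36h cannot be a valid 8-byte slot.

```python
# Offsets of the 5th, 6th and 7th arguments: start at 20h and step by 8 bytes.
offsets = [0x20 + 8 * i for i in range(3)]
labels = [f"{o:X}h" for o in offsets]
print(labels)

# A valid slot offset is a multiple of 8; 36h is not.
remainder = 0x36 % 8
print(remainder)
```

Any offset that leaves a nonzero remainder when divided by 8 is a sign that decimal and hexadecimal were mixed somewhere.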
After I resolved these problems, my code resulted in the following:
LEA RCX, TextTestfilePath
MOV RDX, 80000000h ; GENERIC_READ
MOV R8, 00000001h ; FILE_SHARE_READ
MOV R9, 0h ; NULL
SUB RSP, 20h
MOV QWORD PTR [RSP + 20h], 3
MOV QWORD PTR [RSP + 28h], 80h
MOV QWORD PTR [RSP + 30h], 00h
CALL CreateFileW
ADD RSP, 20h
This code does the following:
It moves the first four arguments into the registers, as before.
It moves the stack pointer (which, as explained before, is the top of the stack) down by 20h to reserve the 32 bytes of shadow space; because 20h is a multiple of 16, the alignment of RSP does not change. While this opens up 32 bytes of space, it's important we don't overwrite the 32 bytes we just opened up with our arguments. Your arguments do not go in this space.
It stores the remaining arguments relative to the newly adjusted stack pointer, offset by 20h so that they sit just above the shadow space instead of overwriting it.
And yes, if you're seeing what I am seeing, this code is actually the same thing as doing this:
MOV QWORD PTR [RSP], 3
MOV QWORD PTR [RSP + 8h], 80h
MOV QWORD PTR [RSP + 10h], 00h
SUB RSP, 20h
This does the exact same thing, but it puts the arguments onto the stack before allocating the shadow space.
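The equivalence can be confirmed by comparing the absolute addresses each ordering writes to. The Python sketch below is an illustration under assumptions: the two function names are made up, and 1000h is an arbitrary starting value for RSP.

```python
def sub_then_store(rsp):
    """SUB RSP, 20h first, then store at [RSP+20h], [RSP+28h], [RSP+30h]."""
    rsp -= 0x20
    writes = {rsp + 0x20: 3, rsp + 0x28: 0x80, rsp + 0x30: 0}
    return rsp, writes


def store_then_sub(rsp):
    """Store at [RSP], [RSP+8h], [RSP+10h] first, then SUB RSP, 20h."""
    writes = {rsp + 0x00: 3, rsp + 0x08: 0x80, rsp + 0x10: 0}
    rsp -= 0x20
    return rsp, writes


start = 0x1000  # arbitrary example value for RSP
a = sub_then_store(start)
b = store_then_sub(start)
same = a == b
lowest_write = min(a[1])
print(same, hex(lowest_write))
```

Both orderings finish with the same RSP and the same three slots. Note that the lowest address written equals the starting RSP, so the arguments land in memory this code never allocated, which is why reserving the space only around the call is fragile.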
I prefer the syntax of +20h to account for the shadow space, as it makes it more obvious for me that we are taking it into account. But what I want you to get out of this, is that the documentation for the stack is terrible.
As @RaymondChen pointed out in the comments, I was not taking into account the prolog and epilog for my function. RSP should not be modified inside the body of a function. RSP and a few other registers (RBX, RBP, RDI, RSI, and R12 through R15) are nonvolatile: if a function modifies them, it must save them in its prolog and restore them in its epilog, before it returns. This is the purpose of the prolog and epilog, alongside letting the debugger and exception handling unwind the stack when an exception occurs.
The updated function call does essentially the same thing as before, but does not modify the stack pointer:
LEA RCX, TextTestfilePath
MOV RDX, 80000000h ; GENERIC_READ
MOV R8, 00000001h ; FILE_SHARE_READ
MOV R9, 0h ; NULL
MOV QWORD PTR [RSP + 20h], 3
MOV QWORD PTR [RSP + 28h], 80h
MOV QWORD PTR [RSP + 30h], 00h
CALL CreateFileW
I've updated the "standard" below.
Here is the actual x64 stack usage standard that you need to follow when calling a Win32 function in MASM x64:
In your prolog, subtract at least 32 bytes (20h) from the stack pointer, in addition to the space for other local variables and stack arguments. An example is given below.
Use RCX, RDX, R8, and R9 for ARG1, ARG2, ARG3, and ARG4 respectively.
Store any further arguments on the stack, starting at RSP+20h (MOV QWORD PTR [RSP+20h], ARG5, MOV QWORD PTR [RSP+28h], ARG6, MOV QWORD PTR [RSP+30h], ARG7 and so on).
CALL your Win32 method.
An example of a proper Win32 function call is shown below:
INCLUDELIB kernel32.lib
.CODE
main PROC
LOCAL LocalVariable: QWORD
; Prolog
PUSH RBP ; Store the RBP to restore it after
MOV RBP, RSP ; Move the RSP into RBP for debugging
SUB RSP, 40h ; 20h of shadow space for function calls
; 8h for the one local QWORD variable
; 18h for 3 stack arguments
MOV RCX, ARG1 ; Put ARG1 into RCX
MOV RDX, ARG2 ; Put ARG2 into RDX
MOV R8, ARG3 ; Put ARG3 into R8
MOV R9, ARG4 ; Put ARG4 into R9
MOV QWORD PTR [RSP + 20h], ARG5 ; Put ARG5 into RSP+20h
MOV QWORD PTR [RSP + 28h], ARG6 ; Put ARG6 into RSP+28h
MOV QWORD PTR [RSP + 30h], ARG7 ; Put ARG7 into RSP+30h
CALL MyWin32Function
; Technically, you don't need an epilog if your next
; call is going to end the process. I provide it
; as an example.
; Epilog
ADD RSP, 40h ; Same value as in the prolog
MOV RSP, RBP ; Restore original stack pointer
POP RBP ; Restore original RBP
RET
main ENDP
END
This ensures that when you store your stack arguments (starting at RSP+20h), they are still within the space your prolog allocated (RSP through RSP+40h).
You must also write this prolog and epilog for any functions you develop yourself. This avoids needing to allocate the 20h of stack space at every function call, and it keeps RSP fixed throughout the function body, which is what Win32 exception handling for the __fastcall-style x64 convention expects so that it (and you) can 'walk the stack.'
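The 40h in the example is the sum of the pieces listed in its comments, and it can be worked out for other functions the same way. The Python sketch below is a rough illustration, assuming a prolog that pushes exactly one register (RBP) before the SUB. The helper frame_size and the entry address 1008h are hypothetical.

```python
SHADOW_SPACE = 0x20  # always reserved for the functions you call


def frame_size(local_bytes, stack_args):
    """Bytes to SUB from RSP: shadow space + stack arguments + locals, rounded up to 16."""
    raw = SHADOW_SPACE + 8 * stack_args + local_bytes
    return (raw + 15) & ~15


# One QWORD local and three stack arguments (ARG5..ARG7), as in the example above.
size = frame_size(local_bytes=8, stack_args=3)

# On entry RSP is 8 past a 16-byte boundary, because CALL pushed the return address.
entry_rsp = 0x1008                       # arbitrary example address
rsp_after_prolog = entry_rsp - 8 - size  # PUSH RBP, then SUB RSP, size
print(hex(size), rsp_after_prolog % 16)
```

Under that assumption the frame comes out at 40h and RSP is 16-byte aligned for every call made in the function body. A prolog that pushes a different number of registers would need the rounding adjusted.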
Hopefully this helps someone understand this a little better.
I am not sure why the standards express things in terms of right to left, or front to back, or top to bottom, because this explanation is unintuitive and subjective depending on how you are viewing the stack. Using terms like ADD or SUBTRACT makes much more sense and is universal no matter the way the stack is being displayed.
I hope that this helps someone avoid the 6-7 hours of research and pain that I went through, and helps explain the stack much better! If anyone has any comments regarding my explanation as to things I may have overlooked or explained incorrectly, please let me know. However, so far this has worked for me 100% of the time.