I'm currently trying to learn MASM x64, and so far I seem to be getting the hang of things pretty well. Everything was going well right up until I tried to call CreateFileW to read the contents of a .txt file. The problematic code is as follows:
; Open the file for GENERIC_READ
CALL ClearRegisters
LEA RCX, TextTestfilePath
MOV RDX, 80000000h ; GENERIC_READ
MOV R8, 00000001h ; FILE_SHARE_READ
MOV R9, 0h ; NULL
SUB RSP, 40h
PUSH 0h ; NULL
PUSH 80h ; FILE_ATTRIBUTE_NORMAL
PUSH 3 ; OPEN_EXISTING
CALL CreateFileW
ADD RSP, 40h
CMP EAX, -1
JNE p_skip_invalid_create_file
CALL InternalError
p_skip_invalid_create_file::
This is a subset of the full program which can be found here.
When I run the program, I type the file name ("test.txt") into it (test.txt is located with the source files, which can also be found on the GitHub). TextTestfilePath holds that ReadConsoleW output (with the CRLF truncated off of the end). In memory, it reads as 0074 0065 0073 0074 002e 0074 0078 0074 0000 or ".t.e.s.t...t.x.t..", which to my understanding is a valid null-terminated UTF-16 string.
When executing the code, CreateFileW returns -1 (INVALID_HANDLE_VALUE), and a subsequent call to GetLastError returns 0x57 (ERROR_INVALID_PARAMETER). I have tried calling SetLastError to set it to zero before the call and get the same result.
After quite a bit of conversation with GPT-4, I still can't seem to find the source of the issue. I have verified the following:
- Each of the parameters is correct.
- The TextTestfilePath is correctly written to in memory (and succeeded in an earlier call to GetFileAttributesW)
- The RDX, R8, and the first stack (final push) parameters are correct to the best of my understanding.
- I believe I am correctly allocating the shadow space needed for the call to succeed.
I am still learning MASM x64 with the limited information there is about it out there, but I have a general understanding of how it all works, and I've read a few books on it and used a portion of the Win32 Console API up to this point.
But every time I get this parameter error, I am at a complete loss. It's so vague that I don't really know where to check, and the things I do check never seem to be the issue. So if anyone has any idea of more things I'd need to check (or heck, if you see the issue) (or heck heck, if you have any tips for me that I have yet to figure out), please let me know! :)
Before commenting that I am doing something wrong, please help not only me but the community find well-documented sources to learn MASM x64 that can explain that concept well! Just saying, "You're doing something wrong, and you need to fix it," neither helps resolve this issue nor contributes to a discussion that encourages learning and education, which would be expected from a site like StackOverflow. Links to third-party sources, in addition to the obvious Microsoft docs, are incredibly helpful for a big-picture overview of what is expected, instead of assuming certain things are known when they may not be.
I've figured it out. I was indeed pushing onto the stack wrong. But I had a fundamental misunderstanding of how the stack worked which the Microsoft docs did a horrible job of explaining.
What I Did Wrong
Attempt #1
As @RbMm pointed out in the comments, the arguments are expected to be at RSP+20h, RSP+28h, and RSP+30h respectively. In addition, there needs to be shadow space on the stack for the function call. I was making a series of mistakes which caused this not to work.
Let's explain the way I did the code previously:
I was modifying the stack pointer to reserve the shadow space. This is correct, and 20h is the correct value for this because the shadow space is 32 bytes, which is 20h in hexadecimal. This will keep everything 16-byte aligned.
I was pushing the arguments onto the stack. The problem is, I was doing this incorrectly (or backwards). The RSP, or stack pointer, references the top of the stack. Each PUSH subtracts 8 from RSP and stores the value at the new top, so the values ended up at RSP, RSP+8h, and RSP+10h, in front of the space I had just reserved, instead of at RSP+20h and up where the function looks for them. To top this off, three PUSHes move the stack pointer by 18h (24 bytes), so it is no longer 16-byte aligned. The stack pointer is expected to stay where the SUB left it (at 20h or 40h respectively), and not be modified via a PUSH instruction.
After having pushed, with the values in the wrong position and the pointer in the wrong spot, the call would fail entirely.
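To make the above concrete, here is the failing sequence from the question again, annotated with where RSP and each value end up. `R0` is only a name used in the comments for the value of RSP before the `SUB`; it is assumed to be 16-byte aligned, which the question does not state.

```asm
; Annotated copy of the failing sequence (fragment; register setup omitted).
; R0 = value of RSP before this fragment, ASSUMED to be 16-byte aligned.
SUB RSP, 40h        ; RSP = R0-40h (still 16-byte aligned)
PUSH 0h             ; RSP = R0-48h, 0   stored at [R0-48h]
PUSH 80h            ; RSP = R0-50h, 80h stored at [R0-50h]
PUSH 3              ; RSP = R0-58h, 3   stored at [R0-58h] (no longer 16-byte aligned)
CALL CreateFileW    ; stack arguments are expected at the caller's RSP+20h, +28h, +30h,
                    ;   i.e. [R0-38h], [R0-30h], [R0-28h], which were never written;
                    ;   the pushed values sit at RSP+0h, +8h, +10h, where shadow space belongs
ADD RSP, 40h        ; RSP = R0-18h: the 18h bytes from the three PUSHes are never released
```

So the call sees whatever happened to be in the reserved area as its fifth, sixth, and seventh arguments, which fits the ERROR_INVALID_PARAMETER result.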
Attempt #2
So, I attempted to correct for these mistakes by doing the following. However, I made a fatal mistake again in this process:
There are a few mistakes here, and they should be more obvious.
I was placing the values at the top of the stack. However, doing this completely overwrites our shadow space with the three arguments. Then I would move the stack pointer, taking it completely away from the arguments I had just placed.
With 20h, 28h, and 36h, I was doing the math wrong. I was adding 8 in decimal (20+8=28, 28+8=36) when I should have been adding 8 in hexadecimal (20h+8h=28h, and 28h+8h=30h, not 36h).
The assembler cannot work out the operand size from [RSP+28h] alone when the value being stored is an immediate. Instead, I had to specify the size of the value I was moving by adding QWORD PTR before it. (Notably, I am on x64, where each stack argument slot is 8 bytes, so I used QWORD instead of the DWORD that almost all of the MASM examples out there say is correct.)
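One way to avoid the decimal/hexadecimal slip is to let the assembler do the arithmetic. The sketch below is a fragment, not the original listing; the `SHADOW_SPACE` and `ARGn_OFFSET` names are hypothetical constants introduced here for illustration, and it assumes RSP already points at the base of the area reserved for the call.

```asm
; Fragment: assumes RSP already points at the base of the reserved call area.
; SHADOW_SPACE and ARGn_OFFSET are hypothetical names, not part of any Windows header.
SHADOW_SPACE EQU 20h                    ; 32 bytes reserved for the callee
ARG5_OFFSET  EQU SHADOW_SPACE + 0*8     ; 20h
ARG6_OFFSET  EQU SHADOW_SPACE + 1*8     ; 28h
ARG7_OFFSET  EQU SHADOW_SPACE + 2*8     ; 30h (not 36h)

; QWORD PTR tells the assembler each store is 8 bytes wide.
MOV QWORD PTR [RSP+ARG5_OFFSET], 3      ; OPEN_EXISTING
MOV QWORD PTR [RSP+ARG6_OFFSET], 80h    ; FILE_ATTRIBUTE_NORMAL
MOV QWORD PTR [RSP+ARG7_OFFSET], 0      ; NULL template handle
```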
Attempt #3
After I resolved these problems, my code resulted in the following:
This code does the following:
It moves the first four arguments into the registers, as before.
It moves the stack pointer (which, as explained before, is the top of the stack) by 20h, which keeps it 16-byte aligned and opens up 32 bytes for shadow space. (Important to note is that this, in and of itself, does not create the shadow space. While it does open 32 bytes of space, it's important we don't overwrite the 32 bytes we just opened up. Your arguments do not go in this space.)
It puts the arguments relative to our newly modified stack pointer, but offsets them by 20h to avoid overwriting the shadow space.
And yes, if you're seeing what I am seeing, this code is actually the same thing as doing this:
This is doing the exact same thing, but it puts the arguments onto the stack before allowing the shadow space.
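For reference, here is a sketch of the two forms side by side. It is a reconstruction under assumptions, not the original listings: for both forms to produce the same layout, the MOV form has to reserve room for the three stack arguments as well as the shadow space (20h + 3*8 = 38h). A total of 38h leaves RSP 16-byte aligned at the CALL only if RSP was 8 past a 16-byte boundary beforehand, as it is on entry to a procedure before anything else is pushed.

```asm
; Two ALTERNATIVE fragments (use one or the other); register arguments omitted.

; Form A: reserve everything first, then store above the shadow space.
SUB RSP, 38h                    ; 20h shadow space + 18h for three stack arguments
MOV QWORD PTR [RSP+20h], 3      ; ARG5
MOV QWORD PTR [RSP+28h], 80h    ; ARG6
MOV QWORD PTR [RSP+30h], 0      ; ARG7
CALL CreateFileW
ADD RSP, 38h

; Form B: push the stack arguments last-to-first, then open the shadow space.
PUSH 0                          ; ARG7, ends up at [RSP+30h] after the SUB
PUSH 80h                        ; ARG6, ends up at [RSP+28h] after the SUB
PUSH 3                          ; ARG5, ends up at [RSP+20h] after the SUB
SUB RSP, 20h                    ; shadow space goes below the arguments
CALL CreateFileW
ADD RSP, 38h                    ; release shadow space and the three pushed values
```

In both forms the callee finds ARG5, ARG6, and ARG7 at the caller's RSP+20h, RSP+28h, and RSP+30h.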
I prefer the syntax of +20h to account for the shadow space, as it makes it more obvious to me that we are taking it into account. But what I want you to get out of this is that the documentation for the stack is terrible.
Attempt #4
As @RaymondChen pointed out in the comments, I was not taking into account the prolog and epilog for my function. RSP should not be modified inside the body of a function. It is one of the nonvolatile registers (RBX, RBP, RDI, RSI, RSP, and R12 through R15); if a function modifies any of them, it must save them in its prolog and restore them in its epilog before returning. This is the purpose of the prolog and epilog, alongside allowing the stack to be walked for debugging when an exception occurs.
The updated function call does essentially the same thing as before, but does not modify the stack pointer:
I've updated the "standard" below.
The x64 Stack Usage Standard (in better terms)
Here is the actual x64 stack usage standard that you need to follow when calling a Win32 function in MASM x64:
1. In your function's prolog, subtract at least 32 bytes (20h) of shadow space from the stack pointer, in addition to the space for other local variables and stack arguments. An example is given below.
2. Use RCX, RDX, R8, and R9 for ARG1, ARG2, ARG3, and ARG4 respectively.
3. Store any further arguments on the stack above the shadow space (MOV QWORD PTR [RSP+20h], ARG5, MOV QWORD PTR [RSP+28h], ARG6, MOV QWORD PTR [RSP+30h], ARG7 and so on).
4. CALL your Win32 method.
An example of a proper Win32 function call is shown below:
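A minimal sketch of such a call, under stated assumptions rather than as the original listing: `OpenTestFile` and `FilePath` are hypothetical names (the hard-coded `FilePath` stands in for TextTestfilePath), the prolog pushes RBP so that a 40h allocation leaves RSP 16-byte aligned, and the build command in the first comment is an assumed ml64 invocation.

```asm
; Sketch only. Assumed build command (from an x64 Visual Studio tools prompt):
;   ml64 openfile.asm /link /SUBSYSTEM:CONSOLE /ENTRY:main kernel32.lib

EXTERN CreateFileW : PROC
EXTERN ExitProcess : PROC

.DATA
; "test.txt" as a null-terminated UTF-16 string (hypothetical stand-in for TextTestfilePath)
FilePath DW 't','e','s','t','.','t','x','t',0

.CODE
; Hypothetical helper: returns the file handle (or INVALID_HANDLE_VALUE) in RAX.
OpenTestFile PROC FRAME
    ; Prolog: RSP is 8 past a 16-byte boundary on entry; the PUSH realigns it.
    PUSH RBP
    .PUSHREG RBP
    SUB RSP, 40h                    ; 20h shadow space + 18h stack arguments + 8 spare
    .ALLOCSTACK 40h
    .ENDPROLOG

    LEA RCX, FilePath               ; ARG1: lpFileName
    MOV EDX, 80000000h              ; ARG2: GENERIC_READ
    MOV R8D, 1                      ; ARG3: FILE_SHARE_READ
    XOR R9D, R9D                    ; ARG4: lpSecurityAttributes = NULL
    MOV QWORD PTR [RSP+20h], 3      ; ARG5: OPEN_EXISTING
    MOV QWORD PTR [RSP+28h], 80h    ; ARG6: FILE_ATTRIBUTE_NORMAL
    MOV QWORD PTR [RSP+30h], 0      ; ARG7: hTemplateFile = NULL
    CALL CreateFileW                ; RSP is not touched between prolog and epilog

    ; Epilog: undo the prolog in reverse order.
    ADD RSP, 40h
    POP RBP
    RET
OpenTestFile ENDP

main PROC FRAME
    SUB RSP, 28h                    ; 20h shadow space + 8 to realign RSP
    .ALLOCSTACK 28h
    .ENDPROLOG

    CALL OpenTestFile
    XOR ECX, ECX                    ; exit code 0 by default
    CMP RAX, -1                     ; INVALID_HANDLE_VALUE?
    SETE CL                         ; exit code 1 if the open failed
    CALL ExitProcess                ; does not return; the handle is released at process exit
main ENDP
END
```

The handle is compared as a full 64-bit value (`CMP RAX, -1`) because a HANDLE is pointer-sized on x64.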
This ensures that when you store your arguments (at RSP+20h and up), they are still within the space your prolog reserved (which is RSP to RSP+40h).
You must also write a prolog and epilog like this for any functions you develop yourself. This avoids needing to allocate the 20h of stack space for every function call, and correctly handles Win32 exception handling for the __fastcall convention so that it (and you) can 'walk the stack.'
Hopefully this helps someone understand this a little better.
I am not sure why the standards express things in terms of right to left, or front to back, or top to bottom, because this explanation is unintuitive and subjective depending on how you are viewing the stack. Using terms like ADD or SUBTRACT makes much more sense and is universal no matter the way the stack is being displayed.
I hope that this helps someone avoid the 6-7 hours of research and pain that I went through, and helps explain the stack much better! If anyone has any comments regarding my explanation as to things I may have overlooked or explained incorrectly, please let me know. However, so far this has worked for me 100% of the time.