assembly x86 masm memory-address

Understanding the behavior of $ Location Counter - var1 DWORD $ assembles to start of data section in MASM, not start of line


Program Code - 1

.386
.model flat

.data
    Array1 DWORD 1,2,3,4,5
    var1  DWORD $

.code
start PROC
    MOV EAX, OFFSET Array1
    MOV EBX, var1
    LEAVE
    RET
start ENDP
END

Program Code - 1 Output

EAX = 001E4000
EBX = 001E4000

Why var1 holds the base address of the .data segment was already answered in a previous question.

Why this behavior?

The location counter is set to offset zero when the assembler encounters a segment such as .code, .data, .bss, .const, etc.

The offset value of the location counter only increases when the assembler encounters any instruction or pseudo-opcode that emits object code.

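For example, MASM increments the location counter for each byte written to the object code file. After encountering MOV AX, BX in 16-bit code, MASM increments the location counter by two because the instruction is two bytes long there (in a 32-bit code segment the same instruction needs an operand-size prefix and takes three bytes).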
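The same rule applies to data definitions. Here is a minimal sketch of how the counter should advance through a data segment, assuming MASM's default behavior of packing data definitions with no automatic alignment. The names b1, w1, d1 and len1 are hypothetical and are not part of the programs in this question.

```asm
.386
.model flat

.data
    ; $ is 0 at the start of the segment
    b1    BYTE  1          ; occupies offset 0,    $ becomes 1
    w1    WORD  2          ; occupies offsets 1-2, $ becomes 3
    d1    DWORD 3          ; occupies offsets 3-6, $ becomes 7
    len1  EQU   $ - b1     ; evaluates to 7; EQU emits no bytes, so $ stays 7
END
```

Only the lines that emit bytes move the counter; the EQU line reads it without changing it.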

Now, take this example program:

Program Code - 2

.386
.model flat

.data
    Array1 DWORD 1,2,3,4,5
    var1  DWORD ($ - Array1)

.code
start PROC
    MOV EAX, OFFSET Array1
    MOV EBX, var1
    LEAVE
    RET
start ENDP
END

Program Code - 2 Output

EAX = 000B4000
EBX = 00000014

In Program Code - 2 the output changed: var1 now holds 00000014h (20 decimal), which is the size of Array1 in bytes. I think this is because I performed an address calculation with the location counter ($ - Array1). Other test cases gave the same kind of result, but I'm not entirely sure this is the reason.

According to the answer to the previous question, the location counter $ is only incremented when the assembler encounters something that emits object code.

Now the question is: what is happening in Program Code - 2? The line looks like a plain variable declaration, so I expected the location counter not to have moved. The result says otherwise: the counter has clearly advanced, as if object code had been emitted, though I'm not entirely sure.

Now, see Program Code - 3:

Program Code - 3

.386
.model flat

.data
    Array1 DWORD 1,2,3,4,5
    var1  DWORD ($ - Array1)
    var2  DWORD $

.code
start PROC
    MOV EAX, OFFSET Array1
    MOV EBX, var1
    MOV ECX, var2
    LEAVE
    RET
start ENDP
END

Program Code - 3 Output

EAX = 00DE4000
EBX = 00000014
ECX = 00DE4000

In Program Code - 2, the value of var1 suggests that $ pointed to the end of Array1. As I mentioned, I don't know the exact reason behind this behavior.

However, Program Code - 3 makes it even harder to predict what's happening. For var1, the location counter appears to have advanced past Array1, but for var2 it seems to have been decremented or set back to the base address of Array1.

Why this behavior?

Please feel free to correct me. After these test cases I'm not sure how to use the $ location counter correctly, so any guidance on working with it would be appreciated.


Solution

  • It seems that there is a bug with

    var2    DWORD $
    

    I get the same bug with 64-bit MASM (ML64.EXE):

    var2    QWORD $
    

    Microsoft documentation states that the symbol $ is "the current value of the location counter", without any exceptions.

    https://learn.microsoft.com/en-us/cpp/assembler/masm/dollar?view=msvc-170

    Making this change (DWORD -> EQU) gives expected results.

    var2    EQU   $
    

    ECX = 00DE4018
    

    Note that

            MOV     ECX,var2
    

    is now an immediate (constant) load.

    One way to have DWORD work properly is to use an equate.

    ofs2    EQU   $
    var2    DWORD ofs2
    

    or as suggested by Nate Eldredge:

    var2    DWORD var2
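
As general guidance, $ is most often used inside an equate placed right after the data it measures, which avoids the DWORD $ case entirely. Here is a minimal sketch based on Program Code - 2. The names Array1Bytes and Array1Count are hypothetical, and the sizes in the comments assume the five-element DWORD array from the question.

```asm
.386
.model flat

.data
    Array1      DWORD 1,2,3,4,5
    Array1Bytes EQU   ($ - Array1)                ; 20 (14h): bytes emitted since Array1
    Array1Count EQU   Array1Bytes / TYPE Array1   ; 20 / 4 = 5 elements

.code
start PROC
    MOV EAX, OFFSET Array1      ; address of the array
    MOV EBX, Array1Bytes        ; immediate load of 14h
    MOV ECX, Array1Count        ; immediate load of 5
    RET
start ENDP
END
```

The equate has to come immediately after the array. Any data defined in between would be counted as part of the size.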