delphi basm

Why is eax zero if it is supposed to contain Self?


According to "Using Assembler in Delphi", eax should contain Self. However, the code below reports that eax is 0. What is wrong?

procedure TForm1.FormCreate(Sender: TObject);
var
  X, Y: Pointer;
begin
  asm
    mov X, eax
    mov Y, edx
  end;
  ShowMessage(IntToStr(NativeInt(X)) + ' ; ' + IntToStr(NativeInt(Y)));
end;

Solution

  • The code generated when I compile this, under debug settings, is like so:

      begin
    005A9414 55               push ebp
    005A9415 8BEC             mov ebp,esp
    005A9417 83C4E4           add esp,-$1c
    005A941A 33C9             xor ecx,ecx
    005A941C 894DEC           mov [ebp-$14],ecx
    005A941F 894DE8           mov [ebp-$18],ecx
    005A9422 894DE4           mov [ebp-$1c],ecx
    005A9425 8955F0           mov [ebp-$10],edx
    005A9428 8945F4           mov [ebp-$0c],eax
    005A942B 33C0             xor eax,eax
    005A942D 55               push ebp
    005A942E 6890945A00       push $005a9490
    005A9433 64FF30           push dword ptr fs:[eax]
    005A9436 648920           mov fs:[eax],esp
      mov X, eax
    005A9439 8945FC           mov [ebp-$04],eax
      mov Y, edx
    005A943C 8955F8           mov [ebp-$08],edx
    

    When the code starts executing, eax is indeed the self pointer. But the compiler has chosen to save it away to ebp-$0c and then zeroise eax. That's really up to the compiler.

    The code under release settings is quite similar. The compiler still chooses to zeroise eax. Of course, you cannot rely on the compiler doing that.

      begin
    005A82A4 55               push ebp
    005A82A5 8BEC             mov ebp,esp
    005A82A7 33C9             xor ecx,ecx
    005A82A9 51               push ecx
    005A82AA 51               push ecx
    005A82AB 51               push ecx
    005A82AC 51               push ecx
    005A82AD 51               push ecx
    005A82AE 33C0             xor eax,eax
    005A82B0 55               push ebp
    005A82B1 6813835A00       push $005a8313
    005A82B6 64FF30           push dword ptr fs:[eax]
    005A82B9 648920           mov fs:[eax],esp
      mov X, eax
    005A82BC 8945FC           mov [ebp-$04],eax
      mov Y, edx
    005A82BF 8955F8           mov [ebp-$08],edx
    

    Remember that parameter passing defines the state of the registers and stack when the function starts executing. What happens next, and how the function decodes the parameters, is down to the compiler. It is under no obligation to leave untouched the registers and stack that were used for parameter passing.

    If you inject asm into the middle of a function, you cannot expect the volatile registers like eax to have particular values. They will hold whatever the compiler happened to put in them most recently.
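
    If all you need is the values rather than the registers themselves, a minimal sketch of the alternative is to read Self and Sender by name in Pascal and let the compiler work out where it stored them. This assumes the TForm1 from the question, with SysUtils and Dialogs in the uses clause:

    ```pascal
    procedure TForm1.FormCreate(Sender: TObject);
    var
      X, Y: Pointer;
    begin
      // No register assumptions: the compiler knows where it keeps
      // Self and Sender at this point in the function.
      X := Pointer(Self);
      Y := Pointer(Sender);
      ShowMessage(IntToStr(NativeInt(X)) + ' ; ' + IntToStr(NativeInt(Y)));
    end;
    ```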

    If you want to examine the registers at the very beginning of the execution of the function, you need to use a pure asm function to be sure to avoid having the compiler modify the registers that were used for parameter passing:

    procedure TForm1.FormCreate(Sender: TObject);
    var
      X, Y: Pointer;
    asm
      mov X, eax  // on entry, eax = Self
      mov Y, edx  // on entry, edx = Sender
      // .... do something with X and Y
    end;
    

    The compiler's choices depend heavily on the code in the rest of the function. For your code, the complexity of assembling the string to pass to ShowMessage causes quite a large preamble. Consider this code instead:

    type
      TForm1 = class(TForm)
        procedure FormCreate(Sender: TObject);
      private
        i: Integer;
        function Sum(j: Integer): Integer;
      end;
    ....
    procedure TForm1.FormCreate(Sender: TObject);
    begin
      i := 624;
      Caption := IntToStr(Sum(42));
    end;
    
    function TForm1.Sum(j: Integer): Integer;
    var
      X: Pointer;
    begin
      asm
        mov X, eax
      end;
      Result := TForm1(X).i + j;
    end;
    

    In this case the code is simple enough for the compiler to leave eax alone. The optimised release build code for Sum is:

      begin
    005A8298 55               push ebp
    005A8299 8BEC             mov ebp,esp
    005A829B 51               push ecx
      mov X, eax
    005A829C 8945FC           mov [ebp-$04],eax
      Result := TForm1(X).i + j;
    005A829F 8B45FC           mov eax,[ebp-$04]
    005A82A2 8B80A0030000     mov eax,[eax+$000003a0]
    005A82A8 03C2             add eax,edx
      end;
    005A82AA 59               pop ecx
    005A82AB 5D               pop ebp
    005A82AC C3               ret 
    

    And when you run the code, the form's caption is changed to the expected value.


    To be perfectly honest, inline assembly, placed as an asm block inside a Pascal function, is not very useful. The thing about writing assembly is that you need to fully understand the state of the registers and the stack. That state is well defined at the beginning and end of a function, where it is specified by the ABI.

    But in the middle of a function, that state depends entirely on the decisions made by the compiler. Injecting asm blocks in there requires you to know the decisions the compiler made. It also means that the compiler cannot understand the decisions that you made. This is usually impractical. Indeed, for the x64 compiler, Embarcadero banned such inline asm blocks. I personally have never used an inline asm block in my code. Whenever I write asm, I write pure asm functions.
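
    For illustration, here is a sketch of how the Sum method from above might look as a pure asm function. It assumes the 32-bit compiler and the default register calling convention, where eax holds Self, edx holds j, and an Integer result is returned in eax. The `[eax].TForm1.i` field reference relies on the built-in assembler's field offset syntax, so treat it as something to verify in your Delphi version rather than as a definitive implementation:

    ```pascal
    function TForm1.Sum(j: Integer): Integer;
    asm
      // On entry: eax = Self, edx = j (register calling convention).
      mov eax, [eax].TForm1.i  // load the field i from the instance
      add eax, edx             // eax = i + j, returned in eax
    end;
    ```

    Because the whole body is asm, the compiler does not reload or clear eax before your first instruction, so the ABI tells you exactly what each register holds.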