
Why do ndisasm (NASM) and dumpbin.exe produce different disassembly for the same executable file?


Here are the steps I followed.

1) I took the assembly language code for three different small programs from the book "Assembly Language for x86 Processors" by Kip Irvine.

2) I assembled and linked each one, producing a valid executable without errors in each case.

3) For each of the executable files, I generated a disassembly using NASM's disassembler, ndisasm:

ndisasm -u -p intel add3.exe > add3_ndisasm.txt

4) In each case, I also generated a disassembly using dumpbin.exe:

dumpbin /disasm add3.exe > add3_dumpbin_disasm.txt

Surprisingly, the disassembly I got in step 4 is totally different from that of step 3.

Here is the assembly code I used (in one of the 3 cases).

; This program adds and subtracts 32-bit integers.
.386
.model flat,stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD
DumpRegs PROTO
.code
main PROC
mov eax,10000h ; EAX = 10000h
add eax,40000h ; EAX = 50000h
sub eax,20000h ; EAX = 30000h
call DumpRegs
INVOKE ExitProcess,0
main ENDP
END main

Here is a sample of the disassembly from step 3 (ndisasm):

00000000  4D                dec ebp
00000001  5A                pop edx
00000002  90                nop
00000003  0003              add [ebx],al
00000005  0000              add [eax],al
00000007  000400            add [eax+eax],al
0000000A  0000              add [eax],al
0000000C  FF                db 0xff
0000000D  FF00              inc dword [eax]

And this is from step 4 (dumpbin.exe):

Microsoft (R) COFF/PE Dumper Version 14.11.25508.2
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file add3.exe

File Type: EXECUTABLE IMAGE

  00401000: 50                 push        eax
  00401001: E8 EF 0F 00 00     call        00401FF5
  00401006: C3                 ret
  00401007: 55                 push        ebp
  00401008: 8B EC              mov         ebp,esp
  0040100A: 83 C4 E8           add         esp,0FFFFFFE8h
  0040100D: 60                 pushad
  0040100E: 80 3D 00 40 40 00  cmp         byte ptr ds:[00404000h],0
            00
  00401015: 75 05              jne         0040101C

I took a few of the instruction encodings from the step 3 output and searched for them in the step 4 listing, but could not find them.

5) I then took a hex dump of the executable (using frhed) and compared the byte values in it with the outputs in both steps.

0000  4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 b8 00 00 00 00 00 00 00 40 00 00  MZ..........ÿÿ..¸.......@..
001b  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ...........................
0036  00 00 00 00 00 00 d8 00 00 00 0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68 73  ......Ø.....º..´.Í!¸.LÍ!Ths
0051  69 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f 74 20 62 65 20 72 75 6e 20 69 6e 20  i program cannot be run in 
006c  44 4f 53 20 6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00 5b 39 0b f3 1f 58 65  DOS mode....$.......[9.ó.Xe

The byte values I see in step 5 match those in step 3, but not step 4.

What explains these differences? I must be missing some simple detail somewhere; what is it?


Solution

  • Short answer: .exe ≠ .com

    Hint: notice the MZ signature in the first two bytes of the output of step 5 :-P

    Long answer:

    Microsoft's executable .exe format contains more than just code. It starts with a special signature, MZ (the initials of the format's creator), followed by quite a bit of header information that describes how the file is organized and where the code is.

    In contrast, a .com file is just code, meaning its very first byte is what gets executed once the file is loaded into memory.

    The first disassembly you get is the wrong one (yes, the first one is wrong, not the second!), because it starts decoding at the first byte of the file instead of jumping to the actual code. The dec ebp and pop edx at the top of the ndisasm listing are just the bytes 4D 5A, that is, the ASCII characters M and Z of the signature, decoded as if they were instructions.

    dumpbin is intelligent enough to properly parse the header of that .exe file and begins the disassembly of the actual code.
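To make the header parsing a little more concrete, here is a minimal Python sketch of the very first step such a tool takes. The bytes are the first 64 bytes of the hex dump from step 5; the function name pe_header_offset is made up for this illustration. In an MZ header, the 32-bit little-endian field at offset 0x3C holds the file offset of the PE header.

```python
import struct

# First 64 bytes of add3.exe, copied from the hex dump in step 5.
header = bytes.fromhex(
    "4d5a9000 03000000 04000000 ffff0000"
    "b8000000 00000000 40000000 00000000"
    "00000000 00000000 00000000 00000000"
    "00000000 00000000 00000000 d8000000"
)

def pe_header_offset(dos_header: bytes) -> int:
    """Return the file offset of the PE header stored in an MZ header."""
    if dos_header[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The 4-byte little-endian field at 0x3C points at the PE header.
    (offset,) = struct.unpack_from("<I", dos_header, 0x3C)
    return offset

print(hex(pe_header_offset(header)))  # 0xd8
```

So everything before offset 0xD8 in this file is the DOS header and stub (including the "cannot be run in DOS mode" message), which is exactly the region ndisasm was decoding as if it were code.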

    Solution

    If you'd like to compare the two disassemblies, you either have to make ndisasm start at the actual code or... simplify your life and convert the .exe into a .com, in which case both disassembling operations should produce the same output (barring potential bugs, of course). Note that ndisasm treats its input as raw bytes and does not parse executable headers itself; its -e option skips a given number of bytes at the start of the file, and its -o option sets the address shown for the first instruction.

    The last time I converted an .exe file into a .com was many years ago, with a utility called exe2bin. A quick search online suggests it was shipped up to the days of Windows XP and is no longer included with the OS. You can probably still download it from somewhere, but keep in mind that it was written for 16-bit DOS executables, so it may not cope with a 32-bit Windows executable like this one.
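If you would rather point ndisasm at the code than convert the file, the following Python sketch shows one way to find where the code lives. It assumes a 32-bit (PE32) image whose code section is named .text, which is typical for linker output but not guaranteed. The function names locate_text_section and ndisasm_command are made up for this illustration, and the -e/-o usage should be checked against your ndisasm version.

```python
import struct

def locate_text_section(image: bytes):
    """Return (file_offset, raw_size, load_address) of .text in a PE32 image."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)   # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = pe_offset + 4                                    # COFF file header
    (section_count,) = struct.unpack_from("<H", image, coff + 2)
    (optional_size,) = struct.unpack_from("<H", image, coff + 16)
    optional = coff + 20                                    # optional header
    (magic,) = struct.unpack_from("<H", image, optional)
    if magic != 0x10B:
        raise ValueError("this sketch handles PE32 only")
    (image_base,) = struct.unpack_from("<I", image, optional + 28)
    table = optional + optional_size                        # section table
    for index in range(section_count):
        entry = table + 40 * index                          # 40 bytes per entry
        name = image[entry:entry + 8].rstrip(b"\0")
        _vsize, rva, raw_size, raw_offset = struct.unpack_from(
            "<IIII", image, entry + 8)
        if name == b".text":
            return raw_offset, raw_size, image_base + rva
    raise ValueError("no .text section found")

def ndisasm_command(path: str) -> str:
    """Build an ndisasm command line that starts at the .text section."""
    with open(path, "rb") as handle:
        raw_offset, _size, load_address = locate_text_section(handle.read())
    # -e skips the bytes before the code, -o sets the displayed address.
    return f"ndisasm -u -p intel -e {raw_offset} -o {load_address:#x} {path}"
```

Calling print(ndisasm_command("add3.exe")) would give a command whose output should begin at the same address as the dumpbin listing (00401000 in the question). ndisasm will keep decoding past the end of the code section, so only the first raw_size bytes of its output are meaningful.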