My precompiled x86 code may be running in 16-bit mode (real mode or 16-bit protected mode) or in 32-bit mode (i386 protected mode). How do I detect which one it is at runtime, from within the code?
I was able to come up with this NASM source:
bits 16
cpu 386
pushf
test ax, strict word 0 ; In 32-bit mode this is `test eax, ...', +2 bytes.
jmp short found_16
; Fall through to found_32.
found_32:
bits 32
popf
int 32 ; Or whatever code.
found_16:
bits 16
popf
int 16 ; Or whatever code.
However, I don't like it, because it uses the stack. Is there a solution which doesn't modify any general-purpose registers, segment registers or flags, doesn't use the stack, and works on an 8086 (16-bit mode only) and on a 386 (both modes)?
I've tried `lea esi, [dword esi+0]`, which is a no-op in 32-bit mode, but the same bytes decode to something other than a nop in 16-bit mode.
Please note that I'm aware that for most programs the mode is decided at compile time (as part of the architecture and platform), and they don't have to be able to detect the mode at runtime. Also, for programs started normally, the operating system will choose the correct mode based on the file header, so there is almost no danger of accidentally running a full program file in the wrong mode. However, some program snippets, such as exploit shellcode, can benefit from runtime detection of all kinds (including the architecture and the operating system). I also have some other obscure use cases in mind.
I realized I can improve on my previous solution.
`JMP NEAR`, opcode 0xE9, takes a two-byte (16-bit) immediate displacement in 16-bit mode, and a four-byte (32-bit) displacement in 32-bit mode. Moreover, this displacement is relative to the start of the next instruction, and the 32-bit form of the instruction is two bytes longer. So if the upper 16 bits of the 32-bit displacement are zero, the jump target in 16-bit mode is two bytes below the jump target in 32-bit mode. That's just enough space for a short jump to the real 16-bit destination.
NASM example:
bits 16
jmp near found_16
dw 0x0
found_16:
bits 16
jmp short main_16 ; must be exactly 2 bytes
found_32:
bits 32
;; up to 127 total bytes of code can go here
;; jump elsewhere if you need more space
int 32
hlt
main_16:
bits 16
;; unlimited space here
int 16
hlt
Output of `ndisasm -b16 foo.bin`:
00000000 E90200 jmp 0x5
00000003 0000 add [bx+si],al ; not executed
00000005 EB03 jmp short 0xa
00000007 CD20 int 0x20 ; not executed
00000009 F4 hlt ; not executed
0000000A CD10 int 0x10
0000000C F4 hlt
Output of `ndisasm -b32 foo.bin`:
00000000 E902000000 jmp 0x7
00000005 EB03 jmp short 0xa ; not executed
00000007 CD20 int 0x20
00000009 F4 hlt
0000000A CD10 int 0x10 ; not executed
0000000C F4 hlt ; not executed
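The two jump targets in the listings above can be checked by decoding the displacement by hand. The following is a minimal Python sketch, not part of the original answer. The helper name `jmp_near_target` is made up for illustration, and it only models opcode 0xE9 with the default operand size of each mode (no operand-size prefix, no address wrap-around).

```python
import struct

def jmp_near_target(code: bytes, offset: int, bits: int) -> int:
    """Return the target offset of the JMP NEAR (opcode 0xE9) at `offset`.

    `bits` is the default operand size of the mode (16 or 32).
    Hypothetical helper for illustration only.
    """
    if code[offset] != 0xE9:
        raise ValueError("not a JMP NEAR rel16/rel32")
    if bits == 16:
        (disp,) = struct.unpack_from("<h", code, offset + 1)  # signed rel16
        length = 3  # opcode + 2 displacement bytes
    elif bits == 32:
        (disp,) = struct.unpack_from("<i", code, offset + 1)  # signed rel32
        length = 5  # opcode + 4 displacement bytes
    else:
        raise ValueError("bits must be 16 or 32")
    # The displacement is relative to the start of the next instruction.
    return offset + length + disp

# First five bytes of foo.bin, as shown in the listings above.
code = bytes.fromhex("E902000000")
target_16 = jmp_near_target(code, 0, 16)
target_32 = jmp_near_target(code, 0, 32)
print(hex(target_16), hex(target_32))  # prints: 0x5 0x7
```

The same five bytes yield offset 5 in 16-bit mode and offset 7 in 32-bit mode, which is exactly the two-byte gap that the `jmp short main_16` fills.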
My previous solution, included for reference, was to use 0x0001 as the upper 16 bits of the displacement, so that in 32-bit mode the jump target is 64K+2 bytes further along. This requires having a little over 64K of code space available.
bits 16
jmp near do_16
next_insn_16:
dw 0x1
next_insn_32:
do_16:
int 16
;; The space between next_insn_32 and do_32
;; should equal 0x10000 + (do_16 - next_insn_16)
db (0x10000 + (do_16 - next_insn_16) - ($ - next_insn_32)) dup 0x90
do_32:
bits 32
int 32
Output of `ndisasm -b16 foo.bin`:
00000000 E90200 jmp 0x5
00000003 0100 add [bx+si],ax
00000005 CD10 int 0x10
00000007 90 nop
; ...
Output of `ndisasm -b32 foo.bin`:
00000000 E902000100 jmp 0x10007
00000005 CD10 int 0x10
00000007 90 nop
; ...
00010006 90 nop
00010007 CD20 int 0x20
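The amount of padding in this older variant follows from the same displacement arithmetic. Here is a small Python sketch of that calculation, offered as an illustration rather than as part of the original answer. The function name `layout` and its parameters are hypothetical, and it assumes the `jmp near` sits at offset 0 and that the 16-bit displacement is non-negative.

```python
NEXT_INSN_16 = 3  # offset just after the 3-byte `jmp near` as seen in 16-bit mode
NEXT_INSN_32 = 5  # offset just after the 5-byte `jmp near` as seen in 32-bit mode

def layout(high_word: int, do_16: int, code_16_len: int):
    """Return (do_32 offset, number of filler bytes before do_32).

    high_word   -- value of the `dw` that follows the 16-bit displacement
    do_16       -- offset of the 16-bit entry point
    code_16_len -- bytes of 16-bit code placed at do_16
    Hypothetical helper; assumes 0 <= do_16 - NEXT_INSN_16 < 0x8000.
    """
    rel16 = do_16 - NEXT_INSN_16        # what 16-bit mode sees
    rel32 = (high_word << 16) | rel16   # what 32-bit mode sees
    do_32 = NEXT_INSN_32 + rel32
    padding = do_32 - (do_16 + code_16_len)
    return do_32, padding

old_do_32, old_padding = layout(high_word=1, do_16=5, code_16_len=2)
new_do_32, new_padding = layout(high_word=0, do_16=5, code_16_len=2)
print(hex(old_do_32), hex(old_padding))  # prints: 0x10007 0x10000
print(hex(new_do_32), new_padding)       # prints: 0x7 0
```

With a high word of 1, the 32-bit entry point lands at 0x10007 after 0x10000 filler bytes, matching the disassembly above. With a high word of 0 and a 2-byte `jmp short` as the 16-bit code, no padding is needed at all, which is the improved layout.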