I am learning to program an i386 system kernel by watching some videos. I have learned some of the steps for entering protected mode: in a .code16 file, I first need to enable the A20 address line and change the CR0 register, and then I need to ljmp into .code32 code.
Now I am wondering about the differences between .code16 machine code and .code32 machine code.
These are my questions:
1. Is it valid to use .code16 code in protect mode?
2. What's the difference between .code16 machine code and .code32 machine code generated by assembler?
3. I found it is valid to execute .code16 code after I set CR0 register and before ljmp, that's why?
4. .code16 means "generate code specified to 16-bit mode" and .code32 means "generate code specified for 32-bit mode", so what does that mean?
Sorry for my ignorance, I am a beginner in this field.
What's the difference between .code16 machine code and .code32 machine code generated by assembler?
In 16-bit modes (real mode and 16-bit protected mode) and in 32-bit protected mode, the CPU interprets the bytes of the code differently.
The main difference is that the meanings of the instruction prefixes 66 and 67 (hexadecimal) are reversed:
In 16-bit modes, the CPU uses 16-bit registers and constants and i8086-type addressing modes by default. The prefix 66 tells the CPU to use 32-bit registers and constants; the prefix 67 tells the CPU to use i80386-type addressing modes:
Program bytes    Instruction understood by the CPU
8b 08            mov cx,[bx+si]
66 8b 08         mov ecx,[bx+si]
67 8b 08         mov cx,[eax]
66 67 8b 08      mov ecx,[eax]
In 32-bit protected mode, it is the other way round:
8b 08            mov ecx,[eax]
...
66 67 8b 08      mov cx,[bx+si]
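The two tables can be reproduced with a small sketch. This is a toy decoder, not a real disassembler: it only understands opcode 8b with the ModRM byte 08, and the function name decode and its default_bits parameter are made up for illustration.

```python
def decode(code: bytes, default_bits: int) -> str:
    """Decode 'mov reg,[mem]' (opcode 8b, ModRM 08) for a 16- or 32-bit code segment."""
    if default_bits not in (16, 32):
        raise ValueError("default_bits must be 16 or 32")
    operand_bits = address_bits = default_bits
    i = 0
    # Consume prefixes: each one selects the size that is NOT the segment default.
    while i < len(code) and code[i] in (0x66, 0x67):
        if code[i] == 0x66:
            operand_bits = 48 - default_bits   # 16 -> 32, 32 -> 16
        else:
            address_bits = 48 - default_bits
        i += 1
    if code[i:i + 2] != bytes([0x8B, 0x08]):
        raise ValueError("this sketch only understands 8b 08")
    reg = "ecx" if operand_bits == 32 else "cx"
    mem = "[eax]" if address_bits == 32 else "[bx+si]"
    return f"mov {reg},{mem}"


for text in ("8b 08", "66 8b 08", "67 8b 08", "66 67 8b 08"):
    raw = bytes.fromhex(text)
    print(f"{text:<12} 16-bit: {decode(raw, 16):<16} 32-bit: {decode(raw, 32)}")
```

The same bytes give a different instruction depending only on default_bits, which stands in for the mode the CPU is currently running in.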
"generate code specified to 16/32-bit mode" ... so what does that mean?
If one line of your program is mov ecx,[eax], the assembler writes 8b 08 in .code32 mode and 66 67 8b 08 in .code16 mode.
... because the CPU interprets 8b 08 as mov ecx,[eax] when operating in 32-bit mode and it interprets 66 67 8b 08 as mov ecx,[eax] when operating in 16-bit mode.
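Seen from the assembler's side, the choice looks roughly like the sketch below. It is a deliberately tiny model covering just this one instruction; encode_mov_ecx_eax and mode_bits are hypothetical names, and a real assembler may emit the two prefixes in a different order (the CPU accepts either order).

```python
def encode_mov_ecx_eax(mode_bits: int) -> bytes:
    """Bytes for 'mov ecx,[eax]' when assembling for a 16- or 32-bit code segment."""
    if mode_bits not in (16, 32):
        raise ValueError("mode_bits must be 16 or 32")
    body = bytes([0x8B, 0x08])  # opcode 8b, ModRM 08
    # A 32-bit operand and 32-bit addressing are the defaults only in 32-bit mode;
    # in 16-bit mode both must be requested with the prefixes 66 and 67.
    prefixes = b"" if mode_bits == 32 else bytes([0x66, 0x67])
    return prefixes + body


print(encode_mov_ecx_eax(32).hex(" "))  # 8b 08
print(encode_mov_ecx_eax(16).hex(" "))  # 66 67 8b 08
```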
Is it valid to use .code16 code in protect mode?
I mentioned "16-bit protected mode" above.
Strictly speaking, there is no separate "16-bit protected mode" but only one single "protected mode". In protected mode, you can create 16- and 32-bit descriptors in the GDT (or LDT).
To execute 16-bit code in protected mode, you must create a 16-bit code descriptor (in the GDT or the LDT) and perform an ljmp to that code.
(Executing 16-bit code in protected mode is required to switch a 32-bit CPU from protected mode back to real mode.)
Note that the descriptors for 16-bit code (and the stack!) must have a size of 64 KiB or less. This means that you cannot create one single descriptor describing the whole 4 GiB of memory (as is done for 32-bit code); instead, it might be necessary to create multiple descriptors for code that is located in different parts of memory.
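As an illustration of the difference between the two kinds of descriptor, here is a sketch that packs the 8-byte descriptor layout. The helper name gdt_descriptor and the example values (base 0, access byte 0x9A for a present, ring 0, readable code segment) are assumptions chosen for the example, not something taken from the question.

```python
import struct


def gdt_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack an 8-byte segment descriptor.

    flags is the upper nibble of byte 6: G (0x8), D/B (0x4), L (0x2), AVL (0x1).
    """
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                                 # limit bits 0..15
        base & 0xFFFF,                                  # base bits 0..15
        (base >> 16) & 0xFF,                            # base bits 16..23
        access & 0xFF,                                  # present, DPL, type
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),   # flags + limit bits 16..19
        (base >> 24) & 0xFF,                            # base bits 24..31
    )


# 16-bit code: D = 0, byte granularity, limit 0xFFFF (64 KiB)
CODE16 = gdt_descriptor(base=0x0, limit=0xFFFF, access=0x9A, flags=0x0)
# 32-bit code: D = 1, 4 KiB granularity, limit 0xFFFFF (flat 4 GiB)
CODE32 = gdt_descriptor(base=0x0, limit=0xFFFFF, access=0x9A, flags=0xC)

print(CODE16.hex(" "))  # ff ff 00 00 00 9a 00 00
print(CODE32.hex(" "))  # ff ff 00 00 00 9a cf 00
```

The two descriptors differ only in byte 6: the D bit selects 16- or 32-bit code, and the G bit plus the upper limit bits stretch the segment from 64 KiB to 4 GiB.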
I found it is valid to execute .code16 code after I set CR0 register and before ljmp, that's why?
Internally, the segment registers (cs, ds ...) seem to be about 80 bits long but only 16 of these 80 bits are visible to the programmer.
One of the "hidden" bits of the cs register specifies whether the CPU executes 16- or 32-bit code. (In protected mode, this bit is read from the GDT or LDT.)
According to some information I have read when reading about the so-called "unreal mode", the main difference between "real mode" and "protected mode" inside the i80386 CPU seems to be that the "hidden" bits of the segment registers are modified differently in the two modes when changing the value of a segment register. (There are also differences in interrupt handling etc. ...)
If this is true, setting or clearing bit 0 of CR0 has (nearly) no effect at all until a segment register is changed (by performing ljmp, mov ds,ax ... or an interrupt).
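The behaviour described above can be pictured with a toy model. It only mirrors the explanation in this answer and is not a description of what the hardware really does internally; the class names, the fields and the dictionary standing in for the GDT are all invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class HiddenPart:
    base: int
    limit: int
    bits: int  # 16 or 32: default code size taken from the descriptor


class ToyCpu:
    def __init__(self, gdt):
        self.gdt = gdt    # selector -> HiddenPart, a stand-in for the real GDT
        self.pe = False   # bit 0 of CR0
        self.cs = 0x0000
        self.cs_hidden = HiddenPart(base=0, limit=0xFFFF, bits=16)

    def set_pe(self, value: bool) -> None:
        # Changing CR0 does not touch the hidden part of cs.
        self.pe = value

    def ljmp(self, selector: int) -> None:
        self.cs = selector
        if self.pe:
            # Protected mode: the hidden part is copied from the descriptor.
            d = self.gdt[selector]
            self.cs_hidden = HiddenPart(d.base, d.limit, d.bits)
        else:
            # Real mode: only the base changes (selector * 16).
            self.cs_hidden.base = selector << 4


cpu = ToyCpu({0x08: HiddenPart(base=0, limit=0xFFFFFFFF, bits=32)})
cpu.set_pe(True)
print(cpu.cs_hidden.bits)  # 16: still executing .code16 code
cpu.ljmp(0x08)
print(cpu.cs_hidden.bits)  # 32: now the 32-bit descriptor is in effect
```

In this model the code between set_pe(True) and ljmp keeps running as 16-bit code, which matches the observation in the question.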