In a 64-bit program the selector:offset pair used to read the stack protector is fs:0x28, where fs=0. This poses no problem, because in 64-bit mode the MSR fs_base (which is set to point to the TLS) supplies the segment base and the GDT is completely ignored.
But a 32-bit program reads the stack protector from gs:0x14. Running on a 64-bit system we have gs=0x63; on a 32-bit system, gs=0x33. As far as I understand, there are no such MSRs here because they were introduced with x86_64, so the GDT plays an important role.
Dissecting these values, we get RPL=3 in both cases (as expected), the table indicator bit selects the GDT (the LDT is not used in Linux), and the selector points to the entry with index 12 on the 64-bit system and index 6 on the 32-bit one.
Using a kernel module I was able to check that this entry is NULL on 64-bit Linux! So I don't understand how the address of the TLS is resolved.
The relevant part of the kernel module is the following:
void gdtread()
{
    struct desc_ptr gdtr;
    seg_descriptor* gdt_entry = NULL;
    uint16_t tr;
    int i;
    asm("str %0" : "=m"(tr));
    native_store_gdt(&gdtr); // equiv. to asm("sgdt %0" : "=m"(gdtr));
    printk("GDT address: 0x%px, GDT size: %d bytes = %i entries\n",
           (void*)gdtr.address, gdtr.size + 1, (gdtr.size + 1) / 8);
    gdt_entry = (seg_descriptor*)gdtr.address;
    for(i = 0; i < (gdtr.size + 1) / 8; i++)
    {
        if(tr >> 3 == i)
            printk("Entry #%i:\t<--- TSS (RPL = %i)", i, tr & 3);
        else
            printk("Entry #%i:", i);
        if(!((uint64_t*)gdt_entry)[i])
        {
            printk("\tNULL");
            continue;
        }
        if(gdt_entry[i].s)
            user_segment_desc(&gdt_entry[i]);
        else
            // a 64-bit system descriptor spans two 8-byte slots, so skip the next one
            system_segment_desc((sys_seg_descriptor*)&gdt_entry[i++]);
    }
}
Which outputs the following on a 64-bit system:
[ 3817.191065] GDT address: 0xfffffe0000001000, GDT size: 128 bytes = 16 entries
[ 3817.191073] Entry #0:
[ 3817.191075] NULL
[ 3817.191078] Entry #1:
[ 3817.191081] Raw: 0x00cf9b000000ffff
[ 3817.191084] Base: 0x00000000
[ 3817.191088] Limit: 0xfffff
[ 3817.191091] Flags: 0xc09b
[ 3817.191096] Type = 0xb (Code, non conforming, readable, accessed)
[ 3817.191100] S = 0 (user)
[ 3817.191103] DPL = 0
[ 3817.191105] P = 1 (present)
[ 3817.191109] AVL = 0
[ 3817.191112] L = 0 (legacy mode)
[ 3817.191115] D/B = 1
[ 3817.191118] G = 1 (KiB)
[ 3817.191121] Entry #2:
[ 3817.191124] Raw: 0x00af9b000000ffff
[ 3817.191127] Base: 0x00000000
[ 3817.191130] Limit: 0xfffff
[ 3817.191133] Flags: 0xa09b
[ 3817.191137] Type = 0xb (Code, non conforming, readable, accessed)
[ 3817.191141] S = 0 (user)
[ 3817.191144] DPL = 0
[ 3817.191146] P = 1 (present)
[ 3817.191149] AVL = 0
[ 3817.191152] L = 1 (long mode)
[ 3817.191155] D/B = 0
[ 3817.191157] G = 1 (KiB)
[ 3817.191160] Entry #3:
[ 3817.191163] Raw: 0x00cf93000000ffff
[ 3817.191166] Base: 0x00000000
[ 3817.191169] Limit: 0xfffff
[ 3817.191171] Flags: 0xc093
[ 3817.191175] Type = 0x3 (Data, expand down, writable, accessed)
[ 3817.191178] S = 0 (user)
[ 3817.191181] DPL = 0
[ 3817.191183] P = 1 (present)
[ 3817.191186] AVL = 0
[ 3817.191189] L = 0
[ 3817.191191] D/B = 1
[ 3817.191194] G = 1 (KiB)
[ 3817.191197] Entry #4:
[ 3817.191199] Raw: 0x00cffb000000ffff
[ 3817.191202] Base: 0x00000000
[ 3817.191205] Limit: 0xfffff
[ 3817.191207] Flags: 0xc0fb
[ 3817.191211] Type = 0xb (Code, non conforming, readable, accessed)
[ 3817.191214] S = 0 (user)
[ 3817.191217] DPL = 3
[ 3817.191219] P = 1 (present)
[ 3817.191222] AVL = 0
[ 3817.191224] L = 0 (legacy mode)
[ 3817.191227] D/B = 1
[ 3817.191230] G = 1 (KiB)
[ 3817.191233] Entry #5:
[ 3817.191235] Raw: 0x00cff3000000ffff
[ 3817.191238] Base: 0x00000000
[ 3817.191241] Limit: 0xfffff
[ 3817.191243] Flags: 0xc0f3
[ 3817.191246] Type = 0x3 (Data, expand down, writable, accessed)
[ 3817.191250] S = 0 (user)
[ 3817.191252] DPL = 3
[ 3817.191255] P = 1 (present)
[ 3817.191258] AVL = 0
[ 3817.191260] L = 0
[ 3817.191262] D/B = 1
[ 3817.191265] G = 1 (KiB)
[ 3817.191268] Entry #6:
[ 3817.191270] Raw: 0x00affb000000ffff
[ 3817.191273] Base: 0x00000000
[ 3817.191276] Limit: 0xfffff
[ 3817.191278] Flags: 0xa0fb
[ 3817.191281] Type = 0xb (Code, non conforming, readable, accessed)
[ 3817.191284] S = 0 (user)
[ 3817.191287] DPL = 3
[ 3817.191289] P = 1 (present)
[ 3817.191292] AVL = 0
[ 3817.191295] L = 1 (long mode)
[ 3817.191298] D/B = 0
[ 3817.191300] G = 1 (KiB)
[ 3817.191303] Entry #7:
[ 3817.191306] NULL
[ 3817.191308] Entry #8: <--- TSS (RPL = 0)
[ 3817.191312] Raw: 0x00000000fffffe0000008b0030004087
[ 3817.191316] Base: 0xfffffe0000003000
[ 3817.191321] Limit: 0x04087
[ 3817.191324] Flags: 0x008b
[ 3817.191327] Type = 0xb (Busy 64-bit TSS)
[ 3817.191331] S = 1 (system)
[ 3817.191333] DPL = 0
[ 3817.191336] P = 1 (present)
[ 3817.191339] AVL = 0
[ 3817.191341] L = 0
[ 3817.191344] D/B = 0
[ 3817.191347] G = 0 (B)
[ 3817.191349] Entry #10:
[ 3817.191352] NULL
[ 3817.191355] Entry #11:
[ 3817.191358] NULL
[ 3817.191360] Entry #12:
[ 3817.191362] NULL
[ 3817.191365] Entry #13:
[ 3817.191367] NULL
[ 3817.191369] Entry #14:
[ 3817.191372] NULL
[ 3817.191374] Entry #15:
[ 3817.191377] Raw: 0x0040f50000000000
[ 3817.191380] Base: 0x00000000
[ 3817.191382] Limit: 0x00000
[ 3817.191385] Flags: 0x40f5
[ 3817.191389] Type = 0x5 (Data, expand up, read only, accessed)
[ 3817.191392] S = 0 (user)
[ 3817.191395] DPL = 3
[ 3817.191397] P = 1 (present)
[ 3817.191400] AVL = 0
[ 3817.191403] L = 0
[ 3817.191405] D/B = 1
[ 3817.191408] G = 0 (B)
I haven't tried this module on a 32 bit system yet, but I'm on my way.
So, to make the question clear: how does the gs segment selector work in a 32-bit program running on a 64-bit linux kernel?
After @PeterCordes's comment I looked in the "AMD64 Architecture Programmer's Manual, vol. 2", which says on page 27:
Compatibility mode ignores the high 32 bits of base address in the FS and GS segment descriptors when calculating an effective address.
This implies that a 64-bit kernel managing a 32-bit process uses the MSR_*S_BASE registers as it would for a 64-bit process. The kernel can set the segment bases normally while in 64-bit long mode, so it doesn't matter whether or not those MSRs are available in 32-bit compatibility sub-mode of long mode, or in pure 32-bit protected mode (legacy mode, 32-bit kernel). A 64-bit Linux kernel only uses compat mode for ring 3 (user-space) where wrmsr and rdmsr aren't available because of privileges. As always, segment-base settings persist across changes to privilege level, like returning to user-space with sysret or iret.
Another thing that made me think that these registers weren't used for compatibility-mode processes was GDB. This is what happens when trying to print this register while debugging a 32-bit program:
(gdb) i r $gs_base
Invalid register `gs_base'
When debugging a 64-bit program it works fine:
(gdb) i r $fs_base
fs_base 0x7ffff7d00c00 0x7ffff7d00c00
Since rdgsbase is a 64-bit-only instruction (trying to execute that opcode in a 32-bit program yields a SIGILL signal), it is a bit tricky to obtain the value of these registers from within a 32-bit program.
The first solution I thought of was to read it from a kernel module:
unsigned long gs_base = 0xdeadbeefc0ffee13;
asm("swapgs;"
    "rdgsbase %0;" // maybe unsafe if an interrupt happens here
                   // be careful if using this for anything more than toy experiments.
    "swapgs;"
    : "=r"(gs_base));
printk("gs_base: 0x%016lx", gs_base);
So I created a driver for a device in /dev, so that when a program open()s that file the code above is executed. After compiling and running a 32-bit program that opens this file I got this:
[10793.682033] gs_base: 0x00000000f7f9f040
And using gdb to inspect 0xf7f9f040+0x14 I saw the canary, meaning that it was the TLS.
(gdb) x/wx 0xf7f9f040+0x14
0xf7f9f054: 0x21f03c00
(gdb) x/wx $ebp-0xc
0xbffff60c: 0x21f03c00
The other way I can think of is to perform a far call to switch from 32-bit to 64-bit mode, execute rdgsbase, and then return to 32-bit. This is probably a better solution since it doesn't need a kernel module. (As long as you can assume you're running on a CPU that supports the FSGSBASE extension, and a new enough kernel to enable it.)
Something like this:
#include <stdio.h>
__attribute__((naked)) // or define the function in an asm statement at global scope
extern void rdgsbase()
{
    asm("rdgsbase %eax; retf");
}
int main()
{
    unsigned int* gs_base = NULL;
    unsigned int canary;
    // would be unsafe in a leaf function: clobbers the red zone
    asm("lcall $0x33, $rdgsbase; mov %%eax, %0" : "=m"(gs_base) : : "eax");
    asm("mov %%gs:0x14, %%eax ; mov %%eax, %0" : "=m"(canary) : : "eax");
    printf("gs_base = %p\n", gs_base);
    printf("canary: 0x%08x\n", canary);
    printf("canary: 0x%08x\n", gs_base[5]);
}
I know it is very very dirty and ugly, but it works.
$ gcc gs_base.c -o gs_base -m32
/usr/bin/ld: /tmp/ccAPoxwj.o: warning: relocation against `rdgsbase' in read-only section `.text'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
$ ./gs_base
gs_base = 0xf7f80040
canary: 0x59511d00
canary: 0x59511d00
On a 32-bit system the gs segment selector has the value 0x33, which points to the 7th entry in the GDT (index 6). So let's see what is in there.
Using the same module I showed in the OP (with only minor modifications), I printed the GDT in use during the execution of a specific process. This is the entry with index 6:
[ 3579.535005] Entry #6:
[ 3579.535007] Raw: 0xd100ffff
[ 3579.535009] Base: 0xb7fcd100
[ 3579.535011] Limit: 0xfffff
[ 3579.535013] Flags: 0xd0f3
[ 3579.535018] Type = 0x3 (Data, expand down, writable, accessed)
[ 3579.535019] S = 0 (user)
[ 3579.535021] DPL = 3
[ 3579.535023] P = 1 (present)
[ 3579.535025] AVL = 1
[ 3579.535027] L = 0
[ 3579.535028] D/B = 1
[ 3579.535030] G = 1 (KiB)
In gdb we can verify that it coincides with the TLS of said process:
(gdb) x/wx $ebp-0xc
0xbffff60c: 0xa6e29800
(gdb) x/wx 0xb7fcd100+0x14
0xb7fcd114: 0xa6e29800
Using strace we can see how the 32-bit glibc sets up gs on a 64-bit system:
set_thread_area({entry_number=-1, base_addr=0xf7ebb040, limit=0x0fffff, seg_32bit=1, contents=0, read_exec_only=0, limit_in_pages=1, seg_not_present=0, useable=1}) = 0 (entry_number=12)
In the kernel, this syscall sets MSR_GS_BASE to the value given in the base_addr argument. The kernel also places the value 0x63 in the gs register, which points to the entry with index 12, a NULL entry.
On a 32-bit system the syscall is exactly the same:
set_thread_area({entry_number=-1, base_addr=0xb7f66100, limit=0x0fffff, seg_32bit=1, contents=0, read_exec_only=0, limit_in_pages=1, seg_not_present=0, useable=1}) = 0 (entry_number=6)
But here, on a 32-bit kernel (which doesn't know anything about MSR_GS_BASE), the gs register gets the value 0x33, pointing to index 6 in the GDT. Since there is no MSR_GS_BASE, it is now the GDT entry that gets set up, with its base address, limit, and remaining fields equal to the ones specified in the arguments.
On the other hand, the 64-bit glibc uses the syscall arch_prctl(ARCH_SET_FS, 0x...) to set the value of MSR_FS_BASE. This syscall is only available to 64-bit programs.
The only thing that I don't quite understand yet is why set gs=0x63 instead of 0 or 0x2b (the value of ss, ds and es)...