My question is based on this reddit post. The example there shows how to change an integer in memory using the cast function from the ctypes module:
>>> import ctypes
>>> ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[3] = 100
>>> 29
100
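For context, index 3 lands on the stored value because of how CPython lays out an int on a typical 64-bit build: a reference count, a type pointer, a size (or tag) field, and then the digit array. Below is a minimal read-only sketch of that lookup. It assumes CPython, where id() returns the object's address, and it assumes int.__basicsize__ is the offset of the digit array, so 24 is not hard-coded. The helper name first_digit is hypothetical.

```python
import ctypes
import sys

def first_digit(n):
    """Return the lowest internal digit of a CPython int, read straight from memory."""
    # Assumption: CPython, where id(n) is the address of the object and
    # int.__basicsize__ is the offset of the digit array inside it.
    digit_type = ctypes.c_uint32 if int.__itemsize__ == 4 else ctypes.c_uint16
    return digit_type.from_address(id(n) + int.__basicsize__).value

base = 1 << sys.int_info.bits_per_digit  # 2**30 on most 64-bit builds

print(first_digit(29))        # lowest digit of 29
print(first_digit(base + 5))  # lowest digit of an int that needs two digits
```

Reading the memory this way leaves the cached small-int object untouched, unlike the write in the example above.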
I'm interested in the low-level internals here, so I checked this in a GDB session by setting a breakpoint on the cast function in CPython:
(gdb) break cast
Function "cast" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (cast) pending.
(gdb) run test.py
Starting program: /root/.pyenv/versions/3.8.0-debug/bin/python test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x7ffff00e7b40
Breakpoint 1, cast (ptr=0x9e6e40 <small_ints+1088>, src=10382912, ctype=<_ctypes.PyCPointerType at remote 0xa812a0>) at /root/.pyenv/sources/3.8.0-debug/Python-3.8.0/Modules/_ctypes/_ctypes.c:5540
5540 if (0 == cast_check_pointertype(ctype))
(gdb) p *(PyLongObject *) ptr
$38 = {
ob_base = {
ob_base = {
ob_refcnt = 12,
ob_type = 0x9b8060 <PyLong_Type>
},
ob_size = 1
},
ob_digit = {100}
}
(gdb) p *((long *) ptr + 3)
$39 = 100
(gdb) p ((long *) ptr + 3)
$40 = (long *) 0x9e6e58 <small_ints+1112>
(gdb) p *((char *) ptr + 3 * 8)
$41 = 100 'd'
(gdb) p ((char *) ptr + 3 * 8)
$42 = 0x9e6e58 <small_ints+1112> "d"
(gdb) set *((long *) ptr + 3) = 29
(gdb) p *((long *) ptr + 3)
$46 = 29
(gdb) p *((char *) ptr + 3 * 8)
$47 = 29 '\035'
I would like to know if it's possible to get the memory address using Python in the GDB session because I couldn't access the returned addresses:
(gdb) python print("{:#x}".format(ctypes.addressof(ctypes.c_int(29))))
0x7f1053c947f0
(gdb) python print("{:#x}".format(id(29)))
0x22699d8
(gdb) p *0x7f1053c947f0
Cannot access memory at address 0x7f1053c947f0
(gdb) p *0x22699d8
Cannot access memory at address 0x22699d8
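One likely explanation for the failed reads: the python command runs in GDB's own embedded interpreter, so id(29) and ctypes.addressof(...) return addresses inside the gdb process, not inside the debugged program. Below is a sketch of reading the debugged process instead, using GDB's Python API (gdb.parse_and_eval and gdb.selected_inferior().read_memory). The helper names are hypothetical, and the field layout is assumed to match the 64-bit little-endian PyLongObject printed in the session above.

```python
import struct

try:
    import gdb  # only importable inside GDB's embedded Python
except ImportError:
    gdb = None

# Assumed layout (64-bit little-endian, as in the GDB output above):
# ob_refcnt (8 bytes), ob_type (8), ob_size (8), first ob_digit (4).
HEADER = struct.Struct("<qQqI")

def decode_long_header(raw):
    """Decode the leading fields of a PyLongObject from raw bytes."""
    refcnt, type_ptr, size, digit0 = HEADER.unpack(bytes(raw)[:HEADER.size])
    return {"ob_refcnt": refcnt, "ob_type": type_ptr,
            "ob_size": size, "ob_digit0": digit0}

def dump_inferior_long(expr="ptr"):
    """Evaluate expr in the debugged process and decode the object it points to."""
    addr = int(gdb.parse_and_eval(expr))
    raw = gdb.selected_inferior().read_memory(addr, HEADER.size)
    return decode_long_header(raw)

# Outside GDB, the decoder can still be exercised with hand-built bytes
# mirroring the values printed in the session above.
sample = HEADER.pack(12, 0x9b8060, 1, 100)
print(decode_long_header(sample))
```

Inside the session, once the breakpoint on cast is hit, the file could be loaded with GDB's source command and called as python print(dump_inferior_long("ptr")).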
The indexing is also different compared to the Python REPL. I guess this is related to endianness?
(gdb) python print(ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[3])
9
(gdb) python print (ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[2])
29
Questions:
[...] info proc mappings)?
[...] src parameter in the CPython cast function holds the address of the object, but it seems to be ptr instead, and after memcpy result->b_ptr points to a different value than &ptr? Is this where the actual casting happens?
(gdb) python
>import ctypes
>print(ctypes.cast(id(29), ctypes.POINTER(ctypes.c_long))[3])
>end
29
I can't think of any reason this behaviour would happen (least of all endianness, which is the same across your entire system*).
The src parameter appears to be used as the origin type, rather than the origin object. For reference, see ctypes.h and ctypes/__init__.py (_SimpleCData is just CDataObject with some helpers like indexing and repr). And yes, the memcpy is what does the actual casting in this case, although if you are casting between two data types, there is additional work beforehand.
* Except on ARM, where you can change endianness with an instruction.
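To make the memcpy point concrete: as far as I can tell, cast creates a new pointer object and copies the pointer value into that object's own buffer. That would explain why result->b_ptr differs from &ptr while the value stored there is still the original address. The sketch below checks this from Python, assuming CPython (where id() returns the object's address); the variable names are illustrative.

```python
import ctypes

value = 29
obj_addr = id(value)  # CPython detail: address of the int object

# cast() builds a new POINTER(c_long) instance holding a copy of the address.
p = ctypes.cast(obj_addr, ctypes.POINTER(ctypes.c_long))

buf_addr = ctypes.addressof(p)  # where the pointer object's own buffer lives
stored = ctypes.c_void_p.from_address(buf_addr).value  # what that buffer holds

print(hex(obj_addr), hex(buf_addr), hex(stored))
print(stored == obj_addr)    # the buffer holds the original address
print(buf_addr != obj_addr)  # but the buffer itself lives elsewhere
```

Incidentally, src=10382912 in the breakpoint output is 0x9e6e40 in decimal, the same number as ptr, which suggests GDB is printing the Python int that was passed in.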