python, c, ctypes, cc, python-3.10

Any way to pass a string containing compiled code instead of a file path to ctypes.CDLL?


Background

I am trying to call C functions from Python and discovered the ctypes library (I'm fairly new to both C and Python's ctypes). The motive, however stupid, is to bring my Python code's speed on par with C++, or close enough, on a competitive programming website. I have written the C code, built a shared library with the command cc -fPIC -shared -o lib.so test.c, and loaded it into Python with ctypes using the following code:

import ctypes


def main():
    clib = ctypes.CDLL('./lib.so')
    # ... irrelevant code below ...

main()

Problem

The problem is that this code needs to run in an environment that I don't have control over, that is:

  1. permission denied when trying to create files
  2. no access to internet

What I already tried

  1. I tried putting my lib.so on GitHub and downloading it at runtime, but this fails for the reasons above (no internet access and no permission to create files).
  2. I tried to pickle the clib variable on my machine, hoping to store the serialized object in a string inside the program and unpickle it in the restricted environment. This doesn't work because the clib object cannot be serialized by pickle.

The last idea I have is to store the contents of lib.so in a bytes string inside the program, but then this problem arises:

#...
def main():
    lib_contents = b"contents of the lib.so file"
    clib = ctypes.CDLL(lib_contents) # passing the contents of the file instead of the file path
# ...

How can I achieve this, or is there an alternative solution?

Edit: the suggested answer is not working (I don't know what I'm doing here, so quite likely something is wrong on my part). Here is the code that I am running, copied from the question plus the accepted answer:

import ctypes
from ctypes import *

# Initialise ctypes prototype for mprotect().
# According to the manpage:
#     int mprotect(const void *addr, size_t len, int prot);
libc = CDLL("libc.so.6")
mprotect = libc.mprotect
mprotect.restype = c_int
mprotect.argtypes = [c_void_p, c_size_t, c_int]

# PROT_xxxx constants
# Output of gcc -E -dM -x c /usr/include/sys/mman.h | grep PROT_
#     #define PROT_NONE 0x0
#     #define PROT_READ 0x1
#     #define PROT_WRITE 0x2
#     #define PROT_EXEC 0x4
#     #define PROT_GROWSDOWN 0x01000000
#     #define PROT_GROWSUP 0x02000000
PROT_NONE = 0x0
PROT_READ = 0x1
PROT_WRITE = 0x2
PROT_EXEC = 0x4

# Machine code of an empty C function, generated with gcc
# Disassembly:
#     55        push   %ebp
#     89 e5     mov    %esp,%ebp
#     5d        pop    %ebp
#     c3        ret
with open("./libsum.so", "rb") as file:
    raw = file.read()
code = ctypes.create_string_buffer(raw)

# Get the address of the code
address = addressof(c_char_p(code))

# Get the start of the page containing the code and set the permissions
pagesize = 0x1000
pagestart = address & ~(pagesize - 1)
if mprotect(pagestart, pagesize, PROT_READ | PROT_WRITE | PROT_EXEC):
    raise RuntimeError("Failed to set permissions using mprotect()")

# Generate ctypes function object from code
functype = CFUNCTYPE(None)
f = functype(address)

# Call the function
print("Calling f()")
f()

I get the following error:

Traceback (most recent call last):
  File "/home/user/main.py", line 36, in <module>
    address = addressof(c_char_p(code))
TypeError: bytes or integer address expected instead of c_char_Array_15697 instance
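
The TypeError comes from wrapping the buffer in c_char_p(), which accepts only bytes or an integer address; addressof() can be applied directly to the array returned by create_string_buffer(). Separately, a 15697-byte buffer spans several pages, so a single-page mprotect() call would not cover it. A minimal sketch of both points follows. Here page_span is a hypothetical helper name, the payload is a placeholder, and no memory protection is actually changed:

```python
import ctypes
import mmap


def page_span(address, length, pagesize=mmap.PAGESIZE):
    """Return (start, size) of the whole pages covering [address, address + length)."""
    start = address & ~(pagesize - 1)                          # round down
    end = (address + length + pagesize - 1) & ~(pagesize - 1)  # round up
    return start, end - start


# Placeholder payload (hypothetical); real use would hold raw machine code.
raw = b"\x90" * 15696

code = ctypes.create_string_buffer(raw)

# addressof() takes the ctypes array itself, not a c_char_p wrapper.
address = ctypes.addressof(code)

# Page-aligned region that an mprotect() call would have to cover.
start, size = page_span(address, len(raw))
print(hex(address), hex(start), hex(size))
```

The values start and size are what would be passed as the first two arguments of mprotect() in place of a fixed single page.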

Solution

  • from ctypes import *
    
    # int add(int x, int y)
    # {
    #   return (x+y);
    # }
    code = b'\x55\x48\x89\xe5\x89\x7d\xfc\x89\x75\xf8\x8b\x55\xfc\x8b\x45' \
           b'\xf8\x01\xd0\x5d\xc3'
    
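    copy = create_string_buffer(code)  # mutable copy whose address can be taken
    address = addressof(copy)
    aligned = address & ~0xfff         # round down to a 4 KiB page boundary
    size = 0x2000                      # two pages, in case the buffer straddles a boundary
    prototype = CFUNCTYPE(c_int, c_int, c_int)
    add = prototype(address)
    # 7 == PROT_READ | PROT_WRITE | PROT_EXEC
    pythonapi.mprotect(c_void_p(aligned), size, 7)
    print(add(20, 30))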
    

    Explanation: the code was compiled as a shared library with cc -shared -o libadd.so add.c, and the machine code of add was extracted with objdump -S and placed in a bytes object. create_string_buffer() makes a copy whose address can be retrieved with addressof(). mprotect() is then called on the 2 virtual pages covering the region where the buffer is allocated, with the protection value 7 (== read + write + execute). At that point the function is ready for use; calling add(20, 30) prints the result, 50.
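
The snippet above works because the bytes are the raw machine code of one function. A complete .so file is an ELF image that the dynamic loader has to parse and relocate, so its contents cannot be called directly. On Linux, one possible way to load such bytes without a file on disk is an anonymous in-memory file from os.memfd_create() (Python 3.8+), opened by ctypes.CDLL through its /proc/self/fd path. The sketch below assumes /proc is mounted and that the sandbox permits memfd_create, which a restricted environment may not. load_library_from_bytes is a hypothetical helper name, not part of ctypes:

```python
import ctypes
import os


def load_library_from_bytes(data, name="inmem"):
    """Load shared-library bytes via an anonymous in-memory file (Linux only)."""
    fd = os.memfd_create(name)  # lives in RAM, never appears on disk
    try:
        # Copy the library image into the in-memory file.
        with os.fdopen(fd, "wb", closefd=False) as f:
            f.write(data)
        # The dynamic loader opens the memfd through its /proc path.
        return ctypes.CDLL(f"/proc/self/fd/{fd}")
    finally:
        # Mappings made by the loader remain valid after the descriptor is closed.
        os.close(fd)
```

With the contents of lib.so embedded in the program (for example as a base64 literal decoded with base64.b64decode), a call might look like clib = load_library_from_bytes(lib_contents).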