pythonmallocctypeslibclibpq

How to force/test malloc failure in shared library when called via Python ctypes


I have a Python program that calls a shared library (libpq in this case) that itself calls malloc under the hood.

I want to be able to test (i.e. in unit tests) what happens when those calls to malloc fail (e.g. when there isn't enough memory).

How can I force that?

Note: I don't think setting a resource limit on the process using ulimit -d would work. It would need to be be precise and robust enough to, say, make a single malloc call inside libpq, for example one inside PQconnectdbParams, to fail, but all others to work fine, across different versions of Python, and even different resource usages in the same version of Python.


Solution

  • It's possible, but it's tricky. In summary

    In more detail:

    The shared library code

    // In test_override_malloc.c
    // Some of this code is inspired by https://stackoverflow.com/a/10008252/1319998
    
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <execinfo.h>
    #include <stddef.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    
    // Fails malloc at the fail_in-th call when search_string is in the backtrade
    // -1 means never fail
    static int fail_in = -1;
    static char search_string[1024];
    
    // To find the original address of malloc during malloc, we might
    // dlsym will be called which might allocate memory via malloc
    static char initialising_buffer[10240];
    static int initialising_buffer_pos = 0;
    
    // The pointers to original memory management functions to call
    // when we don't want to fail
    static void *(*original_malloc)(size_t) = NULL;
    static void (*original_free)(void *ptr) = NULL;
    
    void set_fail_in(int _fail_in, char *_search_string) {
        fail_in = _fail_in;
        strncpy(search_string, _search_string, sizeof(search_string));
    }
    
    void *
    malloc(size_t size) {
        void *memory = NULL;
        int trace_size = 100;
        void *stack[trace_size];
    
        static int initialising = 0;
        static int level = 0;
    
        // Save original
        if (!original_malloc) {
            if (initialising) {
                if (size + initialising_buffer_pos >= sizeof(initialising_buffer)) {
                    exit(1);
                }
                void *ptr = initialising_buffer + initialising_buffer_pos;
                initialising_buffer_pos += size;
                return ptr;
            }
    
            initialising = 1;
            original_malloc = dlsym(RTLD_NEXT, "malloc");
            original_free = dlsym(RTLD_NEXT, "free");
            initialising = 0;
        }
    
        // If we're in a nested malloc call (the backtrace functions below can call malloc)
        // then call the original malloc
        if (level) {
            return original_malloc(size);
        }
        ++level;
    
        if (fail_in == -1) {
            memory = original_malloc(size);
        } else {
             // Find if we're in the stack
            backtrace(stack, trace_size);
            char **symbols = backtrace_symbols(stack, trace_size);
            int found = 0;
            for (int i = 0; i < trace_size; ++i) {
                if (strstr(symbols[i], search_string) != NULL) {
                    found = 1;
                    break;
                }
            }
            free(symbols);
    
            if (!found) {
                memory = original_malloc(size);
            } else {
                if (fail_in > 0) {
                    memory = original_malloc(size);
                }
                --fail_in;
            }
        }
    
        --level;
        return memory;
    }
    
    void free(void *ptr) {
        if (ptr < (void*) initialising_buffer || ptr > (void*)(initialising_buffer + sizeof(initialising_buffer))) {
            original_free(ptr);
        }
    }
    
    

    Compiled with

    gcc -shared -fPIC test_override_malloc.c -o test_override_malloc.so -ldl
    

    Example Python code

    This could go inside the unit tests

    # Inside my_test.py
    
    from ctypes import cdll
    cdll.LoadLibrary('./test_override_malloc.so').set_fail_in(0, b'libpq.so')
    
    # ... then call a function in the shared library libpq.so
    # The `0` above means the very next call it makes to malloc will fail
    

    Run with

    LD_PRELOAD=$PWD/test_override_malloc.so python3 my_test.py
    

    (This might all not be worth it admittedly... if Python calls malloc a lot, I wonder if that in most situations it's unlikely that Python will be fine but just the one call in the library will fail)