
How to pass list of strings to a CFFI extension?


I would like to pass a list of strings to a CFFI extension which expects a char** as input parameter.

Example:

extension.h:

#include <stddef.h>

void sort_strings(char** arr, size_t string_count);

extension.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int cmpstrp(const void *p1, const void *p2) {
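    /* qsort expects a negative, zero, or positive result; longer strings sort first */
    size_t len1 = strlen(*(const char* const*)p1);
    size_t len2 = strlen(*(const char* const*)p2);
    return (len1 < len2) - (len1 > len2);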
}

void sort_strings(char** arr, size_t string_count) {
    qsort(arr, string_count, sizeof(char*), cmpstrp);
}

// `main` is defined only for testing if `sort_strings` works as expected
int main(void) {
    char* string_list[] = {"cat", "penguin", "mouse"};
    size_t string_count = sizeof(string_list)/sizeof(string_list[0]);
    sort_strings(string_list, string_count);
    for (size_t i = 0; i < string_count; i++) {
        printf("%s\n", string_list[i]);
    }
}

extension_build.py:

from cffi import FFI

ffibuilder = FFI()

ffibuilder.cdef('void sort_strings(char** arr, size_t string_count);')

ffibuilder.set_source('_extension',
    '#include "extension.h"',
    sources=['extension.c'],
    libraries=[])

if __name__ == "__main__":
    ffibuilder.compile(verbose=True)

demo.py:

from _extension.lib import sort_strings
from _extension import ffi


string_list = ["cat", "penguin", "mouse"]
bytes_list = [s.encode("latin1") for s in string_list]
cdata_list = ffi.new("char **", bytes_list)

sort_strings(cdata_list, len(string_list))

# print sorted list

How to test:

python extension_build.py
python demo.py

I have tried passing string_list, bytes_list, and cdata_list as the first argument to sort_strings. I get these error messages:

TypeError: initializer for ctype 'char *' must be a cdata pointer, not str
TypeError: initializer for ctype 'char *' must be a cdata pointer, not bytes
TypeError: initializer for ctype 'char *' must be a cdata pointer, not list

How can I pass my list of strings correctly (if possible without copying the list)?

(Just in case my intention is not clear: I'm not asking how to sort a list in Python.)

SOLUTION:

(Based on Armin Rigo's answer.)

It works when demo.py is updated like this:

from _extension.lib import sort_strings
from _extension import ffi


string_list = ["cat", "penguin", "mouse"]
bytes_list = [ffi.new("char[]", s.encode("latin1")) for s in string_list]
pointer = ffi.new("char*[]", bytes_list)

sort_strings(pointer, len(string_list))

for s in pointer:
    print(ffi.string(s).decode("latin1"))

Solution

    You need to replace

    cdata_list = ffi.new("char **", bytes_list)
    

    with

    cdata_list = ffi.new("char *[]", bytes_list)
    

    because ffi.new("T*") allocates a single T and returns a pointer to it, whereas ffi.new("T[]", length_or_list) allocates an array of multiple T.
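
To make the difference concrete, here is a minimal sketch that uses a plain ABI-mode FFI() object instead of the compiled _extension module (an assumption made so the snippet runs without building anything). The names single, array, kind_single, kind_array and array_bytes are only illustrative.

```python
from cffi import FFI

ffi = FFI()  # stand-in for _extension.ffi; ffi.new() behaves the same way

single = ffi.new("char **")      # room for exactly one char*, zero-initialized
array = ffi.new("char *[]", 3)   # room for three char*, all zero-initialized

kind_single = ffi.typeof(single).kind   # "pointer"
kind_array = ffi.typeof(array).kind     # "array"
array_bytes = ffi.sizeof(array)         # total size of the three pointers
```

Only the array form knows its own length, which is why len(array) and iteration work on it.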

    EDIT:

    Ah no, another problem is that you have to allocate each char * explicitly, so you need:

    bytes_list = [ffi.new("char[]", s.encode("latin1")) for s in string_list]
    

    and then this bytes_list can be passed directly as argument to sort_strings() (or you can pass cdata_list as above, but CFFI will do the same thing if you just pass bytes_list directly).
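
A small sketch of why the explicit allocation matters, again assuming a plain ABI-mode FFI() object rather than the compiled module. The variable names rejected, buffers, array and first are illustrative. It shows that plain bytes are refused as initializers for the char* elements, while owned char[] buffers are accepted.

```python
from cffi import FFI

ffi = FFI()  # stand-in for _extension.ffi

# Plain bytes objects are not accepted as initializers for char* elements
try:
    ffi.new("char *[]", [b"cat", b"penguin"])
    rejected = False
except TypeError:
    rejected = True

# Owned char[] buffers are accepted; each one is converted to a char*
buffers = [ffi.new("char[]", s) for s in (b"cat", b"penguin")]
array = ffi.new("char *[]", buffers)
first = ffi.string(array[0])
```

Note that array stores only the addresses, so the buffers list has to stay alive for as long as array is in use.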

    EDIT 2:

    Passing bytes_list or cdata_list is not 100% equivalent: if you make an explicit cdata_list and use it in the call, then you can read from it after the call to learn how the C function changed this C array (as it does in your example). If you pass bytes_list instead, then CFFI converts it to the same C array before the call, but frees the C array immediately after the call.
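
The read-back pattern could be wrapped roughly as follows. This is a sketch under stated assumptions, not part of CFFI: to_c_strings and from_c_strings are hypothetical helper names, a plain FFI() object stands in for _extension.ffi, and the pointer swap is a Python stand-in for whatever reordering the C function would do.

```python
from cffi import FFI

ffi = FFI()  # stand-in for _extension.ffi


def to_c_strings(strings, encoding="latin1"):
    """Return (keepalive, array): owned char[] buffers and a char*[] pointing at them."""
    keepalive = [ffi.new("char[]", s.encode(encoding)) for s in strings]
    array = ffi.new("char *[]", keepalive)
    return keepalive, array


def from_c_strings(array, encoding="latin1"):
    """Decode every NUL-terminated entry of a char*[] back into a Python str."""
    return [ffi.string(p).decode(encoding) for p in array]


keepalive, array = to_c_strings(["cat", "penguin", "mouse"])

# Stand-in for a C function that reorders the pointers in place,
# e.g. sort_strings(array, len(array)) with the compiled extension
array[0], array[1] = array[1], array[0]

result = from_c_strings(array)
```

Because array is an explicit cdata object that outlives the call, the reordered pointers are still readable afterwards. The keepalive list must be held for the same duration, since array itself does not own the string buffers.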