I am interested in creating a data structure to hold data/information that is passed around to different functions. The way we currently do it is the following code:
    # right now we use a C-struct to hold data that is passed
    # around in a separate class 'A'
    cdef struct Record:
        double threshold
        double improvement

    cdef class A:
        cpdef py_dostuff(self):
            cdef Record record
            record.threshold = new_threshold
            record.improvement = new_improvement
            # pass a pointer so the struct is not copied
            self.cy_dostuff(&record)

        cdef void cy_dostuff(self, Record* record) nogil:
            do_some_computation(record.threshold, record.improvement)
This uses a C-style struct, which unfortunately doesn't support inheritance, so if we wanted to subclass "A" with another class "B" that uses a "subclass" of the struct, it would not work. My attempt at using a class does not work either. Ideally, I would be able to do something like the following without sacrificing performance. My thinking is that it should be possible to replace the struct with a pure Cython extension type because I'm only using C-level stuff, but the extension type would enable me to subclass Record and A.
    # now, I would like to use a Cython extension type to hold data that is passed
    # around in a separate class 'A'
    cdef class Record:
        cdef double threshold
        cdef double improvement

    cdef class A:
        cpdef py_dostuff(self):
            cdef Record record
            record.threshold = new_threshold
            record.improvement = new_improvement
            cy_dostuff(&record)

        cdef void cy_dostuff(self, Record record) nogil:
            do_some_computation(record.threshold, record.improvement)

    # The reason I would like to use a Cython extension type is that it can then support clean inheritance of the data structure
    cdef class NewRecord(Record):
        cdef double threshold
        cdef double improvement
        cdef int new_attribute

    # E.g. a new subclass of 'A' would still work even if all we did was extend the logic to a "NewRecord"
    cdef class B(A):
        cpdef py_dostuff(self):
            cdef NewRecord record
            record.threshold = new_threshold
            record.improvement = new_improvement
            record.new_attribute = new_attribute
            cy_dostuff(&record)

        cdef void cy_dostuff(self, Record record) nogil:
            do_some_computation(record.threshold, record.improvement, record.new_attribute)
My questions are:
How can I substitute a pure-Cython class (with no Python objects to allow nogil operations) in place of the struct correctly?
There's nothing particularly tricky about using a Cython cdef class in place of the struct. You can pass instances to nogil functions and access their non-object cdef attributes without holding the GIL:
    cdef class A:
        cdef double threshold
        cdef double improvement

        def example_func(self):
            with nogil:
                self.threshold = do_something(self)

    cdef double do_something(A a) nogil:
        return a.threshold
Please consider whether you actually need to work without the GIL (i.e. you're doing multi-threading). A lot of people think "nogil == fast" and ask for nogil solutions for largely cargo-cult reasons.
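Note that you cannot take the address of a cdef class instance, so cy_dostuff(&record) would become cy_dostuff(record) in your (non-working) example. You would also have to create the object first (record = Record(), which needs the GIL), because cdef Record record on its own only declares a typed reference and does not allocate an instance.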
Would there be performance differences?
Probably not much. A cdef class is essentially a struct with a Python object header. Internally it is passed by pointer (rather than by value) and allocated on the heap, so that might make a small difference. Cython takes care of the details for you though.
If I cannot, why not, and what are workarounds for passing struct-like data structures around?
You can do "inheritance by composition":
    cdef struct Record:
        double threshold
        double improvement

    cdef struct NewRecord:
        Record base
        int new_attribute
The C standard explicitly allows casting between Record and NewRecord pointers to support this exact use, because a pointer to a struct can be converted to a pointer to its first member (here, base). So if you have a function that takes a Record pointer you can do f(<Record*>&my_new_record).
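For reference, here is roughly what that cast looks like in plain C, which is the level the Cython expression <Record*>&my_new_record operates at. This is only a minimal sketch: sum_fields and bump_threshold are hypothetical helpers made up for illustration, and it assumes base stays the first member of NewRecord.

```c
#include <stdio.h>

/* Same layout as the Cython structs above. */
typedef struct {
    double threshold;
    double improvement;
} Record;

typedef struct {
    Record base;        /* must remain the FIRST member for the cast to be valid */
    int new_attribute;
} NewRecord;

/* Hypothetical helper that only knows about the base struct. */
double sum_fields(const Record *r) {
    return r->threshold + r->improvement;
}

/* Hypothetical helper that modifies the base part in place. */
void bump_threshold(Record *r, double delta) {
    r->threshold += delta;
}

/* Prints the fields of a NewRecord after viewing it through a Record pointer. */
void show(NewRecord *nr) {
    Record *as_base = (Record *)nr;   /* same address as &nr->base */
    bump_threshold(as_base, 1.0);
    printf("threshold=%g improvement=%g new_attribute=%d sum=%g\n",
           nr->base.threshold, nr->base.improvement,
           nr->new_attribute, sum_fields(as_base));
}
```

A caller would fill in a NewRecord and hand its address to show (or cast it directly when calling sum_fields). Functions written against Record keep working unchanged, while code that needs new_attribute has to receive the NewRecord pointer itself, since nothing in a Record pointer says which "subclass" it really points into.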