Empty object instantiation in Cython

I am working on a Cython interface to a C library. I have this class:

cdef class CholeskyFactor:
    cdef cholmod_factor *_factor
    # ... etc.

    def __cinit__(self, A, *, order=None, **kwargs):
        # allocate and set factor to non-null value
        self._factor = create_cholmod_factor(A, order)

I'd like to create a .copy() method for the class. Since _factor is a pointer to an underlying C data structure, I can't just use import copy; copy.deepcopy(). In order to create a copy, I would like to instantiate a "blank" version of my class, then call a C function to copy over the memory appropriately:

def copy(self):
    cdef CholeskyFactor cf = CholeskyFactor.__new__(CholeskyFactor)
    cf._factor = cholmod_copy_factor(self._factor)
    # ... etc. copy other members
    return cf

In order for the object to be in a "safe" (and usable) state where self._factor != NULL, we need a non-null A argument to the constructor. To create the copy, we need a "blank" constructor with A=None that sets self._factor = NULL and returns. The issue, however, is that the "blank" constructor leaves the object in a non-safe state with self._factor = NULL:

# Blank constructor
def __cinit__(self, A=None, *, order=None, **kwargs):
    self._factor = NULL
    if A is None:
        return
    else:
        self._factor = create_cholmod_factor(A, order)
        # ... etc.

It seems like my only option is to allow A=None in __cinit__, and add an inline check to every property and method in the class that raises ValueError and tells the user to instantiate with CholeskyFactor(A) instead.

Is there some way I haven't found in the Cython documentation to instantiate a "blank" object without calling __cinit__? Or is there a more idiomatic way to prevent users from instantiating the class without the A argument (and thus leaving it in an unsafe state)?

Solution

There isn't really a perfect way to do this. __cinit__ is intended to be a single function that runs exactly once per object and sets up all your class invariants. That also means you can't skip it: even CholeskyFactor.__new__(CholeskyFactor) runs __cinit__ (it only skips __init__).

One option would be to use a keyword argument - something like:

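def __cinit__(self, A=None, **kwargs):
    if A is None:
        # Internal copy path; raises KeyError if _copy_from is missing
        copy_from = kwargs['_copy_from']
        ...
    else:
        # Normal path: build the factor from A
        ...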

So you essentially say that the caller must pass either a valid value of A or _copy_from as a keyword argument (I've used a leading underscore to indicate that it's somewhat "internal"). If they pass neither, __cinit__ raises an exception and the invariants are preserved.
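To make the control flow concrete, here is a minimal pure-Python sketch of option 1. It is an analogue, not Cython: __new__ stands in for __cinit__ (both run once at allocation), a dict stands in for the cholmod_factor pointer, and _create_factor / _copy_factor are hypothetical placeholders for create_cholmod_factor / cholmod_copy_factor. I've also chosen TypeError for the "neither argument" case; that is my assumption, not something the original code specifies.

```python
import copy


def _create_factor(A, order=None):
    # Hypothetical stand-in for create_cholmod_factor: snapshots the input.
    return {"matrix": copy.deepcopy(A), "order": order}


def _copy_factor(factor):
    # Hypothetical stand-in for cholmod_copy_factor.
    return copy.deepcopy(factor)


class CholeskyFactor:
    # __new__ plays the role of Cython's __cinit__ in this sketch.
    def __new__(cls, A=None, *, order=None, _copy_from=None):
        self = super().__new__(cls)
        if A is not None:
            # Normal path: build the factor from the matrix.
            self._factor = _create_factor(A, order)
        elif _copy_from is not None:
            # Internal path used by copy(): duplicate an existing factor.
            self._factor = _copy_factor(_copy_from._factor)
        else:
            # Neither argument given: refuse to create a "blank" object.
            raise TypeError("CholeskyFactor requires a matrix A")
        return self

    def copy(self):
        return CholeskyFactor.__new__(CholeskyFactor, _copy_from=self)


cf = CholeskyFactor([[4.0, 1.0], [1.0, 3.0]], order="amd")
cp = cf.copy()
```

With this shape there is no code path that yields an object without a factor, so no per-method checks are needed.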

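A second, softer option would be to define both __cinit__ and __init__. __cinit__ would leave the object in the unsafe NULL state, and __init__ would complete the initialization. Note that __cinit__ receives the same arguments as __init__, so it either has to accept them or be declared with no arguments other than self, in which case Cython ignores the extras. Because __new__ runs __cinit__ but not __init__, your copy() method can start from the blank object. Of course, this means your users can also end up with an object in the unsafe NULL state if they try hard enough and call __new__ directly. Depending on who they are, you could document it and make it their fault. That isn't perfectly safe, but it does make the class hard to misuse by accident.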
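The second option can be sketched in pure Python as well, under the same caveats as before: __new__ stands in for __cinit__, None stands in for NULL, and the dict is a hypothetical placeholder for the real factor. What carries over to Cython is that calling __new__ directly skips __init__.

```python
import copy


class CholeskyFactor:
    # Stand-in for __cinit__: always runs, leaves the object "blank".
    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls)
        self._factor = None  # plays the role of NULL
        return self

    # Completes initialization; skipped when __new__ is called directly.
    def __init__(self, A, *, order=None):
        # Hypothetical placeholder for create_cholmod_factor(A, order).
        self._factor = {"matrix": copy.deepcopy(A), "order": order}

    def copy(self):
        # Blank object first, then fill in the copied factor.
        cf = CholeskyFactor.__new__(CholeskyFactor)
        cf._factor = copy.deepcopy(self._factor)
        return cf


cf = CholeskyFactor([[2.0]])
cp = cf.copy()
blank = CholeskyFactor.__new__(CholeskyFactor)  # the documented escape hatch
```

Here blank._factor is None, which is exactly the unsafe state a determined user can still reach.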

Personally I'd probably choose option 1, but that isn't always easy to make work with every interface.