
Why does Cython consider cdef classes as Python objects?


I have been trying to optimize a code segment by making calls to functions inside it parallel using prange. This requires that all the functions inside the prange block run with nogil, so I am in the process of adapting them to not use the GIL. However, when trying to adapt one of the functions I'm running into a problem regarding Python Locals/Temporaries. The function is below:

cdef float output(Branch self):
    cdef float output = 0.0
    cdef Dendrite dendrite   # This line is considered a Python Local
    cdef Branch branch       # This line is considered a Python Local
    cdef int index = 0
    while index < self.length:
        if self.isDendrite:
            dendrite = self.inputs[index]
            output += dendrite.output()
        else:
            branch = self.inputs[index]
            output += branch.output()

        index += 1
    return self.activation.eval(output) * self.weight

When trying to convert the function to run nogil, the following error message is returned by Cython:

Function declared nogil has Python locals or temporaries

pointing to the function header.

For context, these are the fields that the Branch and Dendrite classes own (Node is another cdef class that is referenced by Dendrite):

cdef class Branch():
    cdef:
        np.ndarray inputs     # Holds either Dendrite or Branch objects (but never both)
        float weight
        floatFunc activation
        bint isDendrite       # Used to determine if Dendrites or Branches are held
        int length

cdef class Dendrite():
    cdef:
        float charge
        Node owner            # The class below
        float weight
        np.ndarray links      # Holds objects that rely on Node and Dendrite
        floatFunc activation  # C-optimized class
        int length

cdef class Node:
    cdef:
        float charge
        float SOMInhibition
        np.ndarray senders         # Holds objects that rely on Node and Dendrite
        int numSenders
        np.ndarray position        # Holds ints
        NodeType type              # This is just an enum
        np.ndarray dendrites       # This holds Dendrite objects
        int numDendrites
        np.ndarray inputDendrites  # Holds Dendrite objects
        int numInputDendrites
        np.ndarray inputBranches   # Holds Branch objects
        int numInputBranches
        int ID
        floatFunc activation       # C-optimized class

My guess is that this has something to do with the fact that the classes have NumPy arrays as fields, but NumPy is compatible with Cython and should not be making Python objects (if I understood correctly).

How can I make it so that those lines are not counted as Python objects?

It has been mentioned that untyped NumPy arrays do not provide much benefit in terms of performance. In the past I tried to type them during class declaration, but Cython threw a compile error when it saw the type identifiers in the class declaration. I do type the arrays in the initializer before assigning them to fields though, so does that still work, or is the typing during initialization not relevant?


Solution

  • The main problem is that your cdef classes have an attribute that is a container of cdef class instances (an np.ndarray holding objects), and reading an element out of such a container is not allowed without the GIL. Each element is a Python object, so assigning it to dendrite or branch involves reference counting, which is why Cython flags those two locals. Judging by the code you posted, you want to do object-oriented computation. Unfortunately, combining that with a high degree of parallelism is currently neither supported nor safe when using only Cython and Python code.

    References:

    cython: how do you create an array of cdef class

    https://groups.google.com/forum/#!topic/cython-users/G8zLWrA-lU0

    One of the safest solutions (though it requires more coding) is to implement the object-oriented part, i.e. the classes Branch, Dendrite and Node, in C++, and to declare them in Cython as cppclasses, following the Cython documentation on wrapping C++ classes. You can then manipulate them as nogil cppclasses instead of cdef classes. Inside the cppclass, the container can be declared as a pointer to the cppclass.

    Example:

    test_class.h

    #ifndef TEST_CLASS_H
    #define TEST_CLASS_H
    
    
    class test_class {
        public:
            test_class * container;
            int b;
            test_class();
            test_class(int, int);
            test_class(int);
            ~test_class();
            int get_b();
            test_class* get_member(int);
    };
    
    #endif
    

    test_class.cpp

    #include "test_class.h"
    #include <cstddef>
    
    test_class::test_class(){
        this->b = 0;
        this->container = NULL;
    }
    
    test_class::test_class(int b){
        this->b = b;
        this->container = NULL;
    }
    
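    test_class::test_class(int b, int n){
        this->b = b;
        // Initialize first: when n <= 0 the destructor would otherwise
        // call delete[] on an uninitialized pointer.
        this->container = NULL;
        if (n > 0){
            this->container = new test_class[n];
            for (int i = 0; i < n; i++) {
                this->container[i].b = i;
            }
        }
    }
    
    test_class::~test_class(){
        delete[] this->container;  // delete[] on NULL is a no-op
    }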
    int test_class::get_b(){
        return this->b;
    }
    test_class* test_class::get_member(int i){
        return &(this->container[i]);
    }
    

    Cython code:

    # distutils: language = c++
    
    cdef extern from "test_class.cpp":
        pass
    
    cdef extern from "test_class.h":
        cdef cppclass test_class nogil: #add nogil here
            test_class * container
            int b
            test_class() except +
            test_class(int) except +
            test_class(int, int) except +
            int get_b()
            test_class * get_member(int)
    
    cdef int inner_function(test_class * class_obj) nogil: #nogil is allowed
        cdef int i
        cdef int b = 0
        cdef test_class * inner_obj
    
        for i in range(10):
            inner_obj = class_obj.get_member(i)
            b += inner_obj.get_b()
        return b
    
    def outer_function():
        cdef test_class * class_obj = new test_class(999, 10)
        cdef int sum_b = inner_function(class_obj)
        del class_obj
        return sum_b #output: 45
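
The same idea could be applied to the classes from the question. The following is only a rough sketch, not the asker's actual code: Branch and Dendrite are rewritten as plain C++ types, with std::vector replacing the raw pointer so that no manual delete[] is needed. Several things here are assumptions made for illustration: the activation function is omitted (treated as identity), Dendrite::output() is taken to be charge * weight, and total_output is a hypothetical helper. A std::vector<Branch> member inside Branch relies on C++17 support for incomplete types in std::vector.

```cpp
#include <vector>

// Hypothetical C++ stand-in for the question's Dendrite.
// Assumption: output is charge * weight; the real formula is not shown.
struct Dendrite {
    float charge;
    float weight;
    float output() const { return charge * weight; }
};

// Hypothetical C++ stand-in for the question's Branch.
struct Branch {
    float weight = 1.0f;
    bool isDendrite = true;           // same role as in the question
    std::vector<Dendrite> dendrites;  // used when isDendrite is true
    std::vector<Branch> branches;     // used otherwise (needs C++17)

    float output() const {
        float total = 0.0f;
        if (isDendrite) {
            for (const Dendrite& d : dendrites) total += d.output();
        } else {
            for (const Branch& b : branches) total += b.output();
        }
        // Assumption: activation omitted, i.e. treated as identity.
        return total * weight;
    }
};

// Hypothetical helper: sums the outputs of independent branches.
// It only reads plain C++ data, so no Python object is involved.
inline float total_output(const std::vector<Branch>& all) {
    float total = 0.0f;
    for (const Branch& b : all) total += b.output();
    return total;
}
```

Declared to Cython in a cdef extern block marked nogil, as was done for test_class above, methods such as output() could then be called from inside a prange loop, because none of the locals involved is a Python object.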
    

    Finally, regarding the main question, why Cython considers cdef classes to be Python objects: a cdef class (Cython class) is intended as an intermediate between a Python class and a cppclass. Its instances are still reference-counted Python objects, so assigning one to a local variable requires the GIL. In the future, Cython will probably support defining cppclasses directly in Cython as a documented, standard feature.

    References:

    https://www.nexedi.com/NXD-Document.Blog.Cypclass