python pointers memory ctypes python-dataclasses

Issue regarding overlap of memory allocation for separate dataclasses


I am experimenting with defining dataclasses that contain ctypes objects. I have defined three classes, two of which are used as attributes of the third:

import ctypes
import dataclasses

@dataclasses.dataclass
class Point:
    x = ctypes.c_int(0)
    y = ctypes.c_int(0)

@dataclasses.dataclass
class Point2:
    x = ctypes.c_int(0)
    y = ctypes.c_int(0)

@dataclasses.dataclass
class TestObj:
    point1 = Point()
    point2 = Point()

The attributes of a TestObj instance are then passed to a C function inc:

#include <stdio.h>

void inc(int *x, int *y)
{
    *x = *x + 1;
    *y = *y + 1;
}

via a wrapper:

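def inc(test: TestObj):
    # lib is the shared library containing inc, already loaded via ctypes (loading not shown)
    _inc = lib.inc

    _inc.argtypes = [ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)]
    _inc.restype = None
    # c_int arguments are passed by reference because argtypes declares pointers
    _inc(test.point1.x, test.point1.y)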

To investigate the result of the call to inc, I run the following snippet:

test = TestObj()
print(f"addressof test.point1.x = {ctypes.addressof(test.point1.x)}")
print(f"test.point1.x.value = {test.point1.x.value}")
print(f"addressof test.point1.y = {ctypes.addressof(test.point1.y)}")
print(f"test.point1.y.value = {test.point1.y.value}")
print(f"addressof test.point2.x = {ctypes.addressof(test.point2.x)}")
print(f"test.point2.x.value = {test.point2.x.value}")
print(f"addressof test.point2.y = {ctypes.addressof(test.point2.y)}")
print(f"test.point2.y.value = {test.point2.y.value}")
inc(test)
print("after pass to inc")
print(f"addressof test.point1.x = {ctypes.addressof(test.point1.x)}")
print(f"test.point1.x.value = {test.point1.x.value}")
print(f"addressof test.point1.y = {ctypes.addressof(test.point1.y)}")
print(f"test.point1.y.value = {test.point1.y.value}")
print(f"addressof test.point2.x = {ctypes.addressof(test.point2.x)}")
print(f"test.point2.x.value = {test.point2.x.value}")
print(f"addressof test.point2.y = {ctypes.addressof(test.point2.y)}")
print(f"test.point2.y.value = {test.point2.y.value}")

which results in:

addressof test.point1.x = 124872721403536
test.point1.x.value = 0
addressof test.point1.y = 124872719995280
test.point1.y.value = 0
addressof test.point2.x = 124872721403536
test.point2.x.value = 0
addressof test.point2.y = 124872719995280
test.point2.y.value = 0
after pass to inc
addressof test.point1.x = 124872721403536
test.point1.x.value = 1
addressof test.point1.y = 124872719995280
test.point1.y.value = 1
addressof test.point2.x = 124872721403536
test.point2.x.value = 1
addressof test.point2.y = 124872719995280
test.point2.y.value = 1

The x attributes of the two Point instances share one memory location, and so do the y attributes. As a result, the x and y attributes of both Point attributes of the TestObj instance are incremented, instead of just those of point1.

If I redefine TestObj as:

@dataclasses.dataclass
class TestObj:
    point1 = Point()
    point2 = Point2()

and again pass it to the inc wrapper I observe the intended result of:

addressof test.point1.x = 133142487043728
test.point1.x.value = 0
addressof test.point1.y = 133142485635472
test.point1.y.value = 0
addressof test.point2.x = 133142485635728
test.point2.x.value = 0
addressof test.point2.y = 133142485635984
test.point2.y.value = 0
after pass to inc
addressof test.point1.x = 133142487043728
test.point1.x.value = 1
addressof test.point1.y = 133142485635472
test.point1.y.value = 1
addressof test.point2.x = 133142485635728
test.point2.x.value = 0
addressof test.point2.y = 133142485635984
test.point2.y.value = 0

While the x and y attributes of point1 and point2 share the same memory locations respectively, now only the x and y attributes of point1 are incremented, as intended.

I am puzzled as to how this is happening, given that in both scenarios the x and y attributes seemingly share the same memory location, but the change in name of the Point attribute provides the intended result. I appreciate any and all help in understanding what is happening.


Solution

  • This is not the intended use of dataclasses. You were just lucky (or unlucky) that the Python side of the code didn't raise an error to start with:

    Upon doing

    @dataclasses.dataclass
    class Point:
        x = ctypes.c_int(0)
        y = ctypes.c_int(0)
    

    You are not declaring that x and y should be instances of ctypes.c_int. You are creating two actual instances of c_int, each with a fixed memory address, as class attributes of Point.

    The dataclass decorator only treats annotated class attributes as fields. Since x and y have no annotations, it finds no fields at all: the generated __init__ takes no arguments, and x and y stay plain class attributes.

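    When you instantiate Point without arguments, nothing is stored on the instance, so attribute lookup just falls back to the existing, class-wide instances of c_int. Every Point instance therefore exposes the same two c_int objects, which is why point1 and point2 changed together when both were instances of Point. Point2 is a separate class with its own pair of c_int objects, which is why the second version appeared to work.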
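A quick way to see where x and y actually live is to inspect the class from Python, without involving the C library at all. The following is a minimal sketch that reuses the Point definition from the question; the variable names a and b are only for illustration.

```python
import ctypes
import dataclasses

@dataclasses.dataclass
class Point:
    # No annotations: these are plain class attributes, not dataclass fields
    x = ctypes.c_int(0)
    y = ctypes.c_int(0)

a = Point()
b = Point()

print(dataclasses.fields(Point))  # ()
print(a.x is b.x)                 # True: both names resolve to Point.x

a.x.value = 5                     # mutate the shared c_int through one instance
print(b.x.value)                  # 5
```

Because both instances resolve x to the same c_int object, any write through a pointer to that object is visible from every Point instance.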

    Even with annotations added, it would only work if each time you instantiated a Point (and therefore a TestObj) you passed in a new instance of c_int (four new c_ints for each instance of TestObj). In other words: it is just not meant to work this way.
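For comparison, here is a hedged sketch of what the dataclass route would need in order to give every instance its own c_int objects. It relies on annotations plus dataclasses.field(default_factory=...), which is an assumption on my part and not something the original answer proposes; the helper new_int is hypothetical.

```python
import ctypes
from dataclasses import dataclass, field

def new_int():
    # Hypothetical helper: build a fresh c_int for every new instance
    return ctypes.c_int(0)

@dataclass
class Point:
    # Annotated, so these are real dataclass fields with per-instance defaults
    x: ctypes.c_int = field(default_factory=new_int)
    y: ctypes.c_int = field(default_factory=new_int)

@dataclass
class TestObj:
    point1: Point = field(default_factory=Point)
    point2: Point = field(default_factory=Point)

test = TestObj()
test.point1.x.value += 1  # stands in for what the C function would do

print(test.point1.x.value, test.point2.x.value)  # 1 0
```

This removes the sharing, but every field still needs its own factory, which illustrates why the approach is awkward compared with ctypes structures.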

    What will create new C-level integers for you, convert automatically between Python ints and C values, and give you sensible defaults are ctypes.Structure classes. You really, as in really, should use those instead of dataclasses for interoperating with ctypes.
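As a rough illustration of the Structure approach, the sketch below rebuilds Point and TestObj as ctypes structures. To stay runnable without the compiled library, it uses a pure-Python stand-in for the C function (inc_py) and a hypothetical helper (field_ref) that returns a c_int view onto a structure field; neither name comes from the original post.

```python
import ctypes

class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]

class TestObj(ctypes.Structure):
    # Each TestObj owns one block of memory holding two separate Points
    _fields_ = [("point1", Point), ("point2", Point)]

def field_ref(struct, name):
    """Hypothetical helper: a c_int sharing memory with struct.<name>."""
    offset = getattr(type(struct), name).offset
    return ctypes.c_int.from_buffer(struct, offset)

def inc_py(x, y):
    """Pure-Python stand-in for the C function inc(int *x, int *y)."""
    x.value += 1
    y.value += 1

test = TestObj()
inc_py(field_ref(test.point1, "x"), field_ref(test.point1, "y"))

print(test.point1.x, test.point1.y)  # 1 1
print(test.point2.x, test.point2.y)  # 0 0
```

With the real library, the same c_int views could presumably be passed to lib.inc directly, since ctypes passes a c_int by reference when argtypes declares POINTER(c_int).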

    It is easy to build a @dataclass-like decorator that creates Structure classes with the proper _fields_ attribute set, if you prefer that syntax, but dataclass is not that decorator.

    This should work for simple cases instead:

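    import ctypes

    def datastructure(cls):
        # Collect the public class attributes as (name, ctypes type) pairs, in definition order
        _fields_ = [(k, v) for k, v in cls.__dict__.items() if not k.startswith("_")]
        # Build a new ctypes.Structure subclass with the same name and those fields
        new_cls = type(cls.__name__, (ctypes.Structure,), {"_fields_": _fields_})
        return new_cls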
    

    And on the REPL:

    
    In [5]: @datastructure
       ...: class Point:
       ...:     x = ctypes.c_int
       ...:     y = ctypes.c_int
       ...: 
    
    In [6]: p = Point(10, 20)
    
    In [7]: p.x
    Out[7]: 10
    
    In [8]: p.y
    Out[8]: 20