pythontuplesctypescreationself-reference

Building Self-Referencing Tuples


After seeing a conversation in a forum from many years ago that was never resolved, it caused me to wonder how one would correctly create a tuple that referenced itself. Technically, this is a very bad idea since tuples are supposed to be immutable. How could an immutable object possibly contain itself? However, this question is not about best practices but is a query regarding what is possible in Python.

import ctypes

def self_reference(array, index):
    if not isinstance(array, tuple):
        raise TypeError('array must be a tuple')
    if not isinstance(index, int):
        raise TypeError('index must be an int')
    if not 0 <= index < len(array):
        raise ValueError('index is out of range')
    address = id(array)
    obj_refcnt = ctypes.cast(address, ctypes.POINTER(ctypes.c_ssize_t))
    obj_refcnt.contents.value += 1
    if ctypes.cdll.python32.PyTuple_SetItem(ctypes.py_object(array),
                                            ctypes.c_ssize_t(index),
                                            ctypes.py_object(array)):
        raise RuntimeError('PyTuple_SetItem signaled an error')

The previous function was designed to access the C API of Python while keeping internal structures and datatypes in mind. However, the following error is usually generated when running the function. Through unknown processes, it has been possible to create a self-referencing tuple via similar techniques before.

Question: How should the function self_reference be modified to consistently work all of the time?

>>> import string
>>> a = tuple(string.ascii_lowercase)
>>> self_reference(a, 2)
Traceback (most recent call last):
  File "<pyshell#56>", line 1, in <module>
    self_reference(a, 2)
  File "C:/Users/schappell/Downloads/srt.py", line 15, in self_reference
    ctypes.py_object(array)):
WindowsError: exception: access violation reading 0x0000003C
>>> 

Edit: Here are two different conversations with the interpreter that are somewhat confusing. The code up above appears to be correct if I understand the documentation correctly. However, the conversations down below appear to both conflict with each other and the self_reference function up above.

Conversation 1:

Python 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]
on win32
Type "copyright", "credits" or "license()" for more information.
>>> from ctypes import *
>>> array = tuple(range(10))
>>> cast(id(array), POINTER(c_ssize_t)).contents.value
1
>>> cast(id(array), POINTER(c_ssize_t)).contents.value += 1
>>> cast(id(array), POINTER(c_ssize_t)).contents.value
2
>>> array
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> cdll.python32.PyTuple_SetItem(c_void_p(id(array)), 0,
                                  c_void_p(id(array)))
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    cdll.python32.PyTuple_SetItem(c_void_p(id(array)), 0,
                                  c_void_p(id(array)))
WindowsError: exception: access violation reading 0x0000003C
>>> cdll.python32.PyTuple_SetItem(c_void_p(id(array)), 0,
                                  c_void_p(id(array)))
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    cdll.python32.PyTuple_SetItem(c_void_p(id(array)), 0,
                                  c_void_p(id(array)))
WindowsError: exception: access violation reading 0x0000003C
>>> array
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> cdll.python32.PyTuple_SetItem(c_void_p(id(array)), 0,
                                  c_void_p(id(array)))
0
>>> array
((<NULL>, <code object __init__ at 0x02E68C50, file "C:\Python32\lib
kinter\simpledialog.py", line 121>, <code object destroy at 0x02E68CF0,
file "C:\Python32\lib   kinter\simpledialog.py", line 171>, <code object
body at 0x02E68D90, file "C:\Python32\lib      kinter\simpledialog.py",
line 179>, <code object buttonbox at 0x02E68E30, file "C:\Python32\lib
kinter\simpledialog.py", line 188>, <code object ok at 0x02E68ED0, file
"C:\Python32\lib        kinter\simpledialog.py", line 209>, <code object
cancel at 0x02E68F70, file "C:\Python32\lib    kinter\simpledialog.py",
line 223>, <code object validate at 0x02E6F070, file "C:\Python32\lib
kinter\simpledialog.py", line 233>, <code object apply at 0x02E6F110, file
"C:\Python32\lib     kinter\simpledialog.py", line 242>, None), 1, 2, 3, 4,
5, 6, 7, 8, 9)
>>>

Conversation 2:

Python 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]
on win32
Type "copyright", "credits" or "license()" for more information.
>>> from ctypes import *
>>> array = tuple(range(10))
>>> cdll.python32.PyTuple_SetItem(c_void_p(id(array)), c_ssize_t(1),
                                  c_void_p(id(array)))
0
>>> array
(0, (...), 2, 3, 4, 5, 6, 7, 8, 9)
>>> array[1] is array
True
>>>

Solution

  • Thanks to nneonneo's help, I settled on the following implementation of the self_reference method.

    import ctypes
    
    ob_refcnt_p = ctypes.POINTER(ctypes.c_ssize_t)
    
    class GIL:
        acquire = staticmethod(ctypes.pythonapi.PyGILState_Ensure)
        release = staticmethod(ctypes.pythonapi.PyGILState_Release)
    
    class Ref:
        dec = staticmethod(ctypes.pythonapi.Py_DecRef)
        inc = staticmethod(ctypes.pythonapi.Py_IncRef)
    
    class Tuple:
        setitem = staticmethod(ctypes.pythonapi.PyTuple_SetItem)
        @classmethod
        def self_reference(cls, array, index):
            if not isinstance(array, tuple):
                raise TypeError('array must be a tuple')
            if not isinstance(index, int):
                raise TypeError('index must be an int')
            if not 0 <= index < len(array):
                raise ValueError('index is out of range')
            GIL.acquire()
            try:
                obj = ctypes.py_object(array)
                ob_refcnt = ctypes.cast(id(array), ob_refcnt_p).contents.value
                for _ in range(ob_refcnt - 1):
                    Ref.dec(obj)
                if cls.setitem(obj, ctypes.c_ssize_t(index), obj):
                    raise SystemError('PyTuple_SetItem was not successful')
                for _ in range(ob_refcnt):
                    Ref.inc(obj)
            finally:
                GIL.release()
    

    To use the method, follow the example shown down below for creating your own self-referencing tuples.

    >>> array = tuple(range(5))
    >>> Tuple.self_reference(array, 1)
    >>> array
    (0, (...), 2, 3, 4)
    >>> Tuple.self_reference(array, 3)
    >>> array
    (0, (...), 2, (...), 4)
    >>>