I'm migrating a Python C extension to NumPy 2. The extension takes a list of 2D NumPy arrays and generates a new 2D array by combining them (average, median, etc.). The difficulty is that the input and output arrays are byteswapped, and I cannot convert the input arrays to machine byte order up front (there are too many of them to fit in memory). So I swap the values on the fly as I read them.
To achieve this (using the NumPy 1.x C API) I was using something like:
PyArray_Descr* descr_in = PyArray_DESCR((PyArrayObject*)input_frame_1);
PyArray_CopySwapFunc* swap_in = descr_in->f->copyswap;
PyArray_VectorUnaryFunc* cast_in = PyArray_GetCastFunc(descr_in, NPY_DOUBLE);
bool need_to_swap_in = PyArray_ISBYTESWAPPED((PyArrayObject*)input_frame_1);
And something slightly different but similar for the output. I use the function swap_in to read a value from the input array, byteswap it and write it into a buffer, and then cast_in to cast the contents of the buffer to a double.
In NumPy 2, the copyswap function is still accessible with a different syntax:
PyArray_CopySwapFunc* swap_in = PyDataType_GetArrFuncs(descr_in)->copyswap;
But the cast function is not. Although the member is still in the struct, most of its values are NULL. So this doesn't work:
PyArray_VectorUnaryFunc* cast_in = PyDataType_GetArrFuncs(descr_in)->cast[NPY_DOUBLE];
The documentation says
PyArray_GetCastFunc is removed. Note that custom legacy user dtypes can still provide a castfunc as their implementation, but any access to them is now removed. The reason for this is that NumPy never used these internally for many years. If you use simple numeric types, please just use C casts directly. In case you require an alternative, please let us know so we can create new API such as PyArray_CastBuffer() which could use old or new cast functions depending on the NumPy version.
So the function has been removed, but there isn't a clear path to substitute it with something else. What is the correct way to read and write values from/to byteswapped arrays?
Here is more detailed sample code. It just iterates over the input and stores each value in a double.
double d_val = 0;
char buffer[NPY_BUFSIZE];
PyObject* input_frame_1;
// input_frame_1 is initialized over here
// Conversion
PyArray_Descr* descr_in = PyArray_DESCR((PyArrayObject*)input_frame_1);
PyArray_CopySwapFunc* swap_in = descr_in->f->copyswap;
PyArray_VectorUnaryFunc* cast_in = PyArray_GetCastFunc(descr_in, NPY_DOUBLE);
bool need_to_swap_in = PyArray_ISBYTESWAPPED((PyArrayObject*)input_frame_1);
// Iterator
PyArrayIterObject* iter = (PyArrayIterObject*)PyArray_IterNew(input_frame_1);
// Just reads the value and casts it into a double d_val
while (iter->index < iter->size) {
d_val = 0;
// Swap the value if needed and store it in the buffer
swap_in(buffer, iter->dataptr, need_to_swap_in, NULL);
cast_in(buffer, &d_val, 1, NULL, NULL);
/* Advance the iterator to the next element */
PyArray_ITER_NEXT(iter);
}
I have found a solution to my problem using NpyIter iterators. This kind of iterator can be told to take care of the buffering and casting that I was previously doing by hand.
So my example would be something like:
PyObject* input_frame_1;
// input_frame_1 is initialized over here
/* This var will contain the output */
PyObject* out_res = NULL;
/* required to create the iterator */
PyArray_Descr* dtype_res = NULL;
npy_uint32 op_flags[2];
PyArray_Descr* op_dtypes[2];
PyArrayObject* ops[2];
NpyIter *iter = NULL;
NpyIter_IterNextFunc *iternext;
char** dataptr;
/* I have an input array, the output array
will be automatically allocated.
The input array will be cast to double, and the
output array will also be double
*/
ops[0] = (PyArrayObject*)input_frame_1; /* input operand */
ops[1] = NULL; /* output operand will be allocated */
op_flags[0] = NPY_ITER_READONLY | NPY_ITER_NBO;
op_flags[1] = NPY_ITER_WRITEONLY | NPY_ITER_ALLOCATE | NPY_ITER_NBO | NPY_ITER_ALIGNED;
dtype_res = PyArray_DescrFromType(NPY_DOUBLE);
op_dtypes[0] = dtype_res; /* input is converted to double */
op_dtypes[1] = dtype_res; /* output is allocated as double */
iter = NpyIter_MultiNew(2, ops,
NPY_ITER_BUFFERED, /* this must be enabled to allow byte swapping and casting */
NPY_KEEPORDER, NPY_UNSAFE_CASTING,
op_flags, op_dtypes);
Py_DECREF(dtype_res);
dtype_res = NULL;
if (iter == NULL) {
return NULL; /* you will get an error if the arrays are not compatible */
}
/* Specific methods to advance the loop and get the data */
iternext = NpyIter_GetIterNext(iter, NULL);
dataptr = NpyIter_GetDataPtrArray(iter);
do {
double *dbl_ptr;
double value;
/* Now dataptr contains correctly formatted data */
/* the input */
dbl_ptr = (double*) dataptr[0];
value = *dbl_ptr;
/* lets say our operation is b = 2 * a + 1 */
value = 2 * value + 1;
/* and the output, stored in the other pointer */
memcpy(dataptr[1], &value, sizeof(double));
} while(iternext(iter));
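/* The output array can be recovered with the call below. The reference is
   borrowed from the iterator, so take ownership before deallocating it. */
out_res = (PyObject*)NpyIter_GetOperandArray(iter)[1];
Py_INCREF(out_res);
NpyIter_Deallocate(iter);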
There are lots of different flags, per operand and per loop. For example, you can use NPY_ITER_COPY instead of NPY_ITER_BUFFERED, or different rules for casting (or no casting), disallow broadcasting, get larger chunks of data for external loops, etc.
Full documentation is here: https://numpy.org/doc/stable/reference/c-api/iterator.html
Update: Although this solution works well, NpyIter_MultiNew is limited to NPY_MAXARGS operands (inputs plus outputs), which is 32 in NumPy 1.x and 64 in NumPy 2.x. If you need more inputs (as I do), this solution is not enough.
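One possible way around the NPY_MAXARGS limit, at least for simple numeric types, is to follow the hint in the release note quoted above and do the byte swap and the C cast by hand. The following is only a minimal sketch under stated assumptions: every input frame is float32, C-contiguous and of the same shape, so the same byte offset addresses the same pixel in each frame. The helper names (swap_bytes, read_f32_as_double, write_double_as_f32, mean_at_offset) are made up for illustration and are not part of the NumPy API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not NumPy API): reverse the n bytes at p in place. */
void swap_bytes(unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n / 2; ++i) {
        unsigned char tmp = p[i];
        p[i] = p[n - 1 - i];
        p[n - 1 - i] = tmp;
    }
}

/* Read one float32 from src (which may be unaligned), swap it if
   requested, and widen it to double with a plain C cast. */
double read_f32_as_double(const char *src, int need_swap)
{
    unsigned char buf[sizeof(float)];
    float v;
    memcpy(buf, src, sizeof buf);
    if (need_swap) {
        swap_bytes(buf, sizeof buf);
    }
    memcpy(&v, buf, sizeof v);
    return (double)v;
}

/* Narrow value to float32 and store it at dst, swapped if requested. */
void write_double_as_f32(char *dst, double value, int need_swap)
{
    unsigned char buf[sizeof(float)];
    float v = (float)value;
    memcpy(buf, &v, sizeof v);
    if (need_swap) {
        swap_bytes(buf, sizeof buf);
    }
    memcpy(dst, buf, sizeof buf);
}

/* Mean of the element found at the same byte offset in every frame.
   Assumes all frames are C-contiguous float32 with identical shape. */
double mean_at_offset(const char *const *frames, size_t nframes,
                      size_t offset, int need_swap)
{
    double sum = 0.0;
    assert(nframes > 0);
    for (size_t k = 0; k < nframes; ++k) {
        sum += read_f32_as_double(frames[k] + offset, need_swap);
    }
    return sum / (double)nframes;
}
```

In an extension, each frames[k] could be taken from PyArray_BYTES() of the corresponding input and need_swap from PyArray_ISBYTESWAPPED(), so the number of frames is bounded only by the length of the list. Supporting other dtypes this way would need one reader per element type, which is the price of not having a generic cast function.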