
Is there a function to check if a polars dataframe is a view or owns its array?


I'm working with Polars dataframes in Python and need to determine whether a dataframe is a view (referencing another dataframe's memory) or if it owns its own data.

In NumPy, we can use array.flags.owndata to check if an array owns its memory:

import numpy as np
a = np.array([1, 2, 3])
b = a[:]  # view
print(a.flags.owndata)  # True
print(b.flags.owndata)  # False

Is there a built-in method to check ownership status in Polars? If not, is there a recommended workaround?


Solution

  • Polars has no such function, and ownership is not a simple property of a DataFrame either.

    Internally, a DataFrame consists of Columns. A Column may be either:

    1. A single value repeated a bunch of times, or

    2. a ChunkedArray.

    A ChunkedArray consists of one or more Arrays. Logically, its data is what you would get if those Arrays were concatenated.

    An Array is backed by a reference-counted, immutable memory buffer (typically containing data in an Arrow format). That buffer may be shared with other Arrays within the same DataFrame, with Arrays in another DataFrame, or even with foreign code like NumPy.

    So within one DataFrame there may be a wide variety of sharing going on, both internally and externally, and there certainly isn't some simple answer to whether the DataFrame has its "own data".


    > If not, is there a recommended workaround?

    I have no idea what you're actually trying to accomplish.