python, pyarrow

pyarrow: find diff for chunkedarray


If I have a ChunkedArray, how do I compute its diff (similar to pandas.Series.diff or polars.Series.diff)?

e.g. if I start with

import pyarrow as pa
ca = pa.chunked_array([[1, 3, 2], [5, 2, 1]])

I'd like to end up with an array (or chunked array) with values: [null, 2, -1, 3, -3, -1]


Solution

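Combine the chunks into a single contiguous array first, then call pyarrow.compute.pairwise_diff on it:

import pyarrow.compute as pc

pc.pairwise_diff(ca.combine_chunks())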