python pyarrow

How to append a string to each element of a chunked array?


Say I have

In [22]: import pyarrow as pa

In [23]: t = pa.table({'a': ['one', 'two', 'three']})

and I'd like to append '_frobenius' to each element of column 'a'.

Expected output:

pyarrow.Table
a: string
----
a: [["one_frobenius","two_frobenius","three_frobenius"]]

Solution

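  • You can use pyarrow.compute.binary_join_element_wise to append a fixed string to each element. The function treats its last positional argument as the separator, so passing an empty string there concatenates each element and the suffix directly: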

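    import pyarrow as pa
    import pyarrow.compute as pc
    
    t = pa.table({"a": ["one", "two", "three"]})
    
    # Arguments: the values to join, with the separator last (here "").
    # set_column replaces column 0 with the joined result, keeping the name "a".
    t = t.set_column(0, "a", pc.binary_join_element_wise(t["a"], "_frobenius", ""))
    
    print(t)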
    

    Output:

    pyarrow.Table
    a: string
    ----
    a: [["one_frobenius","two_frobenius","three_frobenius"]]