python pandas py-datatable

Convert python datatable column into a list column


How can I convert this python datatable

import datatable as dt
have = dt.Frame(id=[1, 1, 2, 2], val=["a", "b", "c", "d"])

into this python datatable?

want = dt.Frame(id=[1, 2], val=[["a", "b"], ["c", "d"]], types=[dt.Type.int32, dt.Type.obj64])

Specifically, I am trying to find the datatable equivalent of this operation in pandas:

import pandas as pd
have = pd.DataFrame({"id": [1, 1, 2, 2], "val": ["a", "b", "c", "d"]})
have.groupby("id")["val"].apply(list).reset_index()

   id     val
0   1  [a, b]
1   2  [c, d]

Solution

  • Here is one option, but I'm hoping someone can help with a better solution:

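    import numpy as np
    import datatable as dt
    from datatable import f

    # Build a one-row frame per unique id, then stack the rows with rbind
    dt.rbind([dt.Frame(
        id = i,
        val=have[f.id==i,f.val].to_list(),
        types=[dt.Type.int32, dt.Type.obj64]
    ) for i in np.unique(have["id"])])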
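
  • Another possible approach, offered as a rough sketch rather than a proper datatable idiom: pull the columns out as plain Python lists, group them in a single pass, and rebuild the frame afterwards. This avoids filtering the frame once per id, but it assumes the data fits comfortably in memory. The helper name collect_lists is hypothetical, not part of datatable.

```python
from collections import defaultdict


def collect_lists(ids, vals):
    """Group vals by id in one pass (hypothetical helper, not a datatable API).

    Returns (unique_ids, grouped_vals), with ids in first-seen order.
    """
    groups = defaultdict(list)
    for i, v in zip(ids, vals):
        groups[i].append(v)
    return list(groups.keys()), list(groups.values())


# Same data as the `have` frame in the question, as plain lists
ids, vals = collect_lists([1, 1, 2, 2], ["a", "b", "c", "d"])
print(ids, vals)
```

    With datatable, the two input lists could come from have.to_list(), which returns one list per column, and the result could then be passed to the same constructor the question uses for want: dt.Frame(id=ids, val=vals, types=[dt.Type.int32, dt.Type.obj64]).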