Tags: json, apache-spark, pyspark, explode, convertfrom-json

Explode JSON array into rows


I have a DataFrame with two columns, "ID" and "input_array" (the values are JSON arrays).

ID   input_array
1    [ {"A": 300, "B": 400}, {"A": 500, "B": 600} ]
2    [ {"A": 800, "B": 900} ]

Output that I need:

ID A      B
1  300    400
1  500    600
2  800    900

I tried the from_json and explode functions, but I get a data type mismatch error on the array column.


Solution

  • I see two possible interpretations of the data type of your input column "input_array".