I have a dataframe and am building a nested JSON object from it to represent hierarchical data. I am stuck at the point where the nested JSON sub-column is added: it comes through as a string rather than as JSON.
**Code:**
from pyspark.sql.functions import *
#sample data
df=spark.createDataFrame([('1234567','123 Main St','10SjtT','idk@gmail.com','ecom','direct')],['cust_id','address','store_id','email','sales_channel','category'])
I want to represent this dataframe in the following format:
{
"store_id": "10SjtT",
"category": "direct",
"sales_channel": "ecom",
"email": "idk@gmail.com",
"c_email": {
"category": "direct",
"email": "idk@gmail.com"
}
}
I tried adding a column, but my sample code adds the nested JSON as a quoted string:
{
"store_id":"10SjtT","category":"direct","sales_channel":"ecom"
,"c_email":"{\"category\":\"direct\",\"email\":\"idk@gmail.com\"}"
}
**Code used to build this:**
dff = df.select("cust_id","address",to_json(struct("store_id","category","sales_channel","email",to_json(struct( "category" ,"email")).alias("c_email"))).alias("metadata"))
dff.select("metadata").show(10,False)
Please let me know if anyone has faced a similar issue and was able to build nested JSON that keeps its JSON structure throughout.
Thanks in advance.
Manoj.
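Remove the inner nested to_json() call. to_json() returns a string column, so the outer to_json() serializes that value as an escaped string. Pass the inner struct() directly and alias it as c_email instead.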
from pyspark.sql.functions import *
#sample data
df=spark.createDataFrame([('1234567','123 Main St','10SjtT','idk@gmail.com','ecom','direct')],['cust_id','address','store_id','email','sales_channel','category'])
dff = df.select("cust_id","address",to_json(struct("store_id","category","sales_channel","email",struct( "category" ,"email").alias("c_email"))).alias("metadata"))
dff.select("metadata").show(10,False)
Output:
+------------------------------------------------------------------------------------------------------------------------------------------------+
|metadata |
+------------------------------------------------------------------------------------------------------------------------------------------------+
|{"store_id":"10SjtT","category":"direct","sales_channel":"ecom","email":"idk@gmail.com","c_email":{"category":"direct","email":"idk@gmail.com"}}|
+------------------------------------------------------------------------------------------------------------------------------------------------+
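Why the nested call produced an escaped string can be illustrated without Spark. The sketch below is only an analogy in plain Python: json.dumps stands in for to_json and a dict stands in for struct. The variable names (row, c_email, double_encoded, nested) are made up for illustration.

```python
import json

# Plain-Python analogy (not Spark): json.dumps plays the role of to_json,
# and a dict plays the role of struct.
row = {"store_id": "10SjtT", "category": "direct",
       "sales_channel": "ecom", "email": "idk@gmail.com"}
c_email = {"category": row["category"], "email": row["email"]}

# Serializing the inner object first turns it into a string, so the
# outer serialization has to escape its quotes.
double_encoded = json.dumps({**row, "c_email": json.dumps(c_email)})

# Serializing only once, at the outermost level, keeps c_email as a
# nested object.
nested = json.dumps({**row, "c_email": c_email})

print(double_encoded)
print(nested)
```

Parsing double_encoded back gives a string under c_email, while parsing nested gives an object, which mirrors the difference between the two Spark queries.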