I have a data set like :
{
"markers":{
"marker":[
{
"id":"3540A",
"lt":"28.46880448",
"ln":"77.09402561",
"speed":"25.7657263185792",
"brng":"359.3928885",
"time":"1392975226"
},
{
"id":"3549B",
"lt":"28.46207019",
"ln":"77.08945513",
"speed":"16.7443583120476",
"brng":"254.4740405",
"time":"1392975227"
},
{
"id":"3554B",
"lt":"28.51104306",
"ln":"77.09399343",
"speed":"46.8510996103777",
"brng":"161.7287907",
"time":"1392975205"
},
{
"id":"3564A",
"lt":"28.4964115",
"ln":"77.08691239",
"speed":"42.9285245904275",
"brng":"39.96777172",
"time":"1392975204"
}
]
}
}
i am using the query :
CREATE TABLE test AS
SELECT id,speed
FROM raw_data
LATERAL VIEW explode (markers.marker.id)id as id
LATERAL VIEW explode (markers.marker.speed)speed as speed;
This gives me a cartesian product of id and speed. Can someone suggest how should i get rid of this.
I think this should work:
SELECT s.id, s.speed FROM raw_data LATERAL VIEW explode(markers.marker) s as s;
Instead of doing multiple lateral views, just do one to get the whole struct, and then select out the individual fields of that struct.