joinviewhivehiveqllateral

lateral view explode gives cartesian product


I have a data set like :

{
    "markers":{
        "marker":[
            {
                "id":"3540A",
                "lt":"28.46880448",
                "ln":"77.09402561",
                "speed":"25.7657263185792",
                "brng":"359.3928885",
                "time":"1392975226"
            },
            {
                "id":"3549B",
                "lt":"28.46207019",
                "ln":"77.08945513",
                "speed":"16.7443583120476",
                "brng":"254.4740405",
                "time":"1392975227"
            },
            {
                "id":"3554B",
                "lt":"28.51104306",
                "ln":"77.09399343",
                "speed":"46.8510996103777",
                "brng":"161.7287907",
                "time":"1392975205"
            },
            {
                "id":"3564A",
                "lt":"28.4964115",
                "ln":"77.08691239",
                "speed":"42.9285245904275",
                "brng":"39.96777172",
                "time":"1392975204"
            }
        ]
    }
}

i am using the query :

CREATE TABLE test AS 
SELECT id,speed
FROM raw_data 
LATERAL VIEW explode (markers.marker.id)id as id
LATERAL VIEW explode (markers.marker.speed)speed as speed;

This gives me a cartesian product of id and speed. Can someone suggest how should i get rid of this.


Solution

  • I think this should work:

    SELECT s.id, s.speed FROM raw_data LATERAL VIEW explode(markers.marker) s as s;
    

    Instead of doing multiple lateral views, just do one to get the whole struct, and then select out the individual fields of that struct.