sqlapache-sparkapache-zeppelin

how to extract data JSON from zeppelin sql


I'd like to see data with JSON type using a query.

The test_tbl table details are here and tags column contains data as JSON type below

%sql
desc stg.test_tbl
col_name | data_type | comment
id       |  string   |
title    |  string   |
tags     |  string   |

Record from tag column: {"name":[{"family": null, "first": "nelson"}, {"pos_code":{"house":"tlv", "id":"A12YR"}}]}

I queried as below and an error that cannot extract value from tags column.

select *, tag.*
from stg.test_tbl as t
lateral view explode(t.tags.name) name as name
lateral view explode(name.pos_code) pos_code as pos_code
Can't extract value from tags#3423: need struct type but got string; line 3 pos 21
set zeppelin.spark.sql.stacktrace = true to see full stacktrace

should I use string in where statement?


Solution

  • Answered myself. I could use get_json_object in string type in JSON.

    In case it's in array as shown below

    {"name":[{"family": null, "first": "nelson"}, {"pos_code":{"house":"tlv", "id":"A12YR"}}]}
    

    Then I could query using the key

    select * from stg.test_tbl as t
    where t.pos_code[0].house = "tlv"