I have a table similar to this, where I need to pull the value out of each JSON field that has an id equal to 2. I've tried multiple examples, but cannot manage to wrap my head around how to manipulate json_tuples / lateral views / explodes into my scenario.
value1,value2,value3,"[{""id"":1,""value"":""x""},{"id"":2,""value"":""y""}, {"id"":3,""value"":""blah""}]"
value4,value5,value6,"[{""id"":1,""value"":""a""},{"id"":2,""value"":""b""}, {"id"":3,""value"":""blahblah""}]"
col1 | col2 | col3 | id2value |
---|---|---|---|
value1 | value2 | value3 | y |
value4 | value5 | value6 | b |
Examples Ive tried to manipulate: How to extract selected values from json string in Hive
Hive Sql Query To get Json Object from Json Array -dont know how to combine this with selecting the normal columns
Your example is a little unclear, so I'm making some guesses and assumptions (the value* strings are in separate columns, your json string is stored in an array column,... And your quotes in your json array are kind of a mess, that's not valid json.)
You can use lateral view on your json array (technically an array of structs), and then you can filter on that.
select
col1,
col2,
col3,
exp.id,
exp.value
from
<your table>
lateral view inline(<array column>) exp as id,value
where
exp.id = 1
EDIT: If your json is just stored as a string array(of valid json), you have to go through some extra hoops. You convert it to an array with your regex and explode. Then you can pull the individual elements out using json_tuple. If there's a better/less ugly way to do this, I don't know it.
select
col1,
col2,
col3,
t2.*,
jsonstring,
t.col
from
<your table>
--first strip the [] off so hive can see this as an array
LATERAL VIEW explode(
split(
regexp_replace(yourtable.jsonstring,'^\\[|\\]$',''), '(?<=\\}),(?=\\{)'
)
) t as col
--and then use json_tuple to get the elements from the array
lateral view json_tuple(t.col,'id','value') t2 as id,value
where
t2.id = 2