I have a JSON value
{
"test": {
"userId": 77777,
"sectionScores": [
{
"id": 2,
"score": 244
},
{
"id": 1,
"score": 212
}
]
}
}
NOTE: the order of sectionScores varies.
id 1 represents sectionname1, id 2 represents sectionname2, and id 3 represents sectionname3.
It may return at most 4 sections; it may also return only 2.
I have a Redshift table:
create table stage.poc1(
user_id bigint ,
json_data super
)
From there I want to write the flatten query and insert into the target table:
select user_id, json_data.test[0] ... I don't know how to write the rest.
create table target.poc(
user_id bigint,
sectionname1_score int,
sectionname2_score int,
sectionname3_score int,
sectionname4_score int
)
insert into target.poc
select user_id, json_data.test[0].score as sectionname1_score
... I don't know how to map each score to the right column, since the array order varies.
thanks in advance
Made a few assumptions, but this should get you started.
First, note that when working with SUPER values containing uppercase letters, you need to enable case sensitivity in Redshift and double-quote any names that have uppercase letters.
The easiest way to work with this is to directly convert your JSON strings to SUPER with json_parse(), then unnest the arrays into additional rows - see: https://docs.aws.amazon.com/redshift/latest/dg/query-super.html
Below is a sample SQL that creates some test data in a CTE (WITH clause), unnests it, and splits the values out into the columns you want. GROUP BY and SUM() then aggregate the rows down to the desired result.
SET enable_case_sensitive_identifier TO true;
with data as (
select
json_parse('{
"test": {
"userId": 55555,
"sectionScores": [
{
"id": 2,
"score": 544
},
{
"id": 1,
"score": 512
},
{
"id": 3,
"score": 544
},
{
"id": 4,
"score": 412
}
]
}
}') sp
union all
select
json_parse('{
"test": {
"userId": 77777,
"sectionScores": [
{
"id": 2,
"score": 244
},
{
"id": 1,
"score": 212
}
]
}
}') sp
union all
select
json_parse('{
"test": {
"userId": 66666,
"sectionScores": [
{
"id": 2,
"score": 644
},
{
"id": 1,
"score": 612
},
{
"id": 3,
"score": 612
}
]
}
}') sp
)
select d.sp.test."userId" as user_id,
sum(decode(ss.id,1,ss.score)) as sectionname1_score,
sum(decode(ss.id,2,ss.score)) as sectionname2_score,
sum(decode(ss.id,3,ss.score)) as sectionname3_score,
sum(decode(ss.id,4,ss.score)) as sectionname4_score
from data d, d.sp.test."sectionScores" ss
group by user_id
order by user_id;
The exact table definitions you have aren't completely clear, but from here you should be able to make the output look the way you need.
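Applied to your actual tables, the insert would look something like the sketch below. This assumes stage.poc1 is already loaded and json_data has the same shape as your sample; missing sections simply come out as NULL, since DECODE with no default returns NULL.

```sql
SET enable_case_sensitive_identifier TO true;

INSERT INTO target.poc
SELECT d.user_id,
       SUM(DECODE(ss.id, 1, ss.score))::int AS sectionname1_score,
       SUM(DECODE(ss.id, 2, ss.score))::int AS sectionname2_score,
       SUM(DECODE(ss.id, 3, ss.score))::int AS sectionname3_score,
       SUM(DECODE(ss.id, 4, ss.score))::int AS sectionname4_score
-- the comma join against the SUPER path unnests sectionScores into rows
FROM stage.poc1 d, d.json_data.test."sectionScores" ss
GROUP BY d.user_id;
```

Since stage.poc1 already has a user_id column, grouping on it avoids pulling "userId" back out of the SUPER value.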