I am working with a large JSON that is stored in a redshift SUPER
type.
This issue is near identical to the question posted here for TSQL. My schema:
chainId BIGINT properties SUPER
Sample data:
{ "chainId": 5, "$browser": "Chrome", "token": "123x5" }
I have this as a column in my table called properties.
I want to be able to retrieve the value 5
from the chainId
key and store it in a BIGINT column.
I have referenced the following aws docs:
https://docs.aws.amazon.com/redshift/latest/dg/JSON_EXTRACT_PATH_TEXT.html
https://docs.aws.amazon.com/redshift/latest/dg/r_SUPER_type.html
https://docs.aws.amazon.com/redshift/latest/dg/super-overview.html
I have tried the following which haven't worked for me:
SELECT
properties.chainId::varchar as test1
, properties.chainId as test2
, properties.chainid as test3
, properties."chainId" as test4
, properties."chainid" as test5
, json_extract_path_text(json_serialize(properties), 'chainId') serial_then_extract
, properties[0].chainId as testval1
, properties[0]."chainId" as testval2
, properties[0].chainid as testval3
, properties[0]."chainid" as testval4
, properties[1].chainId as testval5
, properties[1]."chainId" as testval6
FROM clean
Of these, the attempt, serial_then_extract
returned a not null, correct value, but not all of the values in my properties
field are short enough to serialize, so this only works on some of the rows.
All others return null
.
Referencing the following docs: https://docs.aws.amazon.com/redshift/latest/dg/query-super.html#unnest I have also attempted to iterate over the super type using partisql:
SELECT ps.*
, p.chainId
from clean ps, ps.properties p
where 1=1
But this returns no rows.
I also tried the following:
select properties , properties.token , properties."$os" from base
And this returned rows with values. I know that there is a chainId
value as I've checked the corresponding key and am working with sample data.
What am I missing? What else should I be trying?
Does anyone know if this has to do with the way that the JSON key is formatted? [camelcase]
In your case, the best approach is to enable enable_case_sensitive_super_attribute
like so:
SET enable_case_sensitive_super_attribute to TRUE;
-- Accessing JSON attribute names with uppercase and mixedcase names
For your case, this is better than enable_case_sensitive_identifier
, since that flag affects databases, tables, columns, etc.
https://docs.aws.amazon.com/redshift/latest/dg/super-configurations.html