This is valid JSON (I've run it against two JSON validators and also parsed it with PowerShell):
{
"actionCD": "error",
"NotesTXT": "\"Exception call timeout\""
}
This is not valid JSON:
{
"actionCD": "error",
"NotesTXT": "\\"Exception call timeout\\""
}
However, the parse_json function yields a failure with the first example:
SELECT '{ "actionCD": "error", "NotesTXT": "\"Exception call timeout\"" }' as json_str
,PARSE_JSON(json_str) as json;
Error parsing JSON: missing comma, pos 38
And unexpectedly, the Snowflake PARSE_JSON function works with the invalid JSON:
SELECT '{ "actionCD": "error", "NotesTXT": "\\"Exception call timeout\\"" }' as json_str
,PARSE_JSON(json_str) as json;
<No Errors>
This is leaving me thoroughly flummoxed and uncertain how to proceed. I'm using PowerShell to programmatically create valid JSON, and then trying to insert that valid JSON into Snowflake using INSERT INTO ()...SELECT ...
Here is the insert statement I'm trying to build in powershell:
INSERT INTO DBNAME.SCHEMANAME.TABLENAME(
RunID
,jsonLogTXT
) SELECT
'$RunID'
,parse_json('$($mylogdata | ConvertTo-Json)')
;
# where $($mylogdata | ConvertTo-Json) outputs valid JSON, and from time to time includes \" to escape the double quotes.
# But Snowflake fails, because Snowflake wants \\" to escape the double quotes.
Is this expected? (Obviously I find it unexpected :-).) What is the recommendation here? (Should I search my JSON-stored-as-a-string in PowerShell for " and replace it with \" before sending it on to Snowflake? That feels really hacky, though.)
The code you posted shows the answer:
SELECT '{ "actionCD": "error", "NotesTXT": "\\"Exception call timeout\\"" }' as json_str
,PARSE_JSON(json_str) as json;
JSON_STR | JSON |
---|---|
{ "actionCD": "error", "NotesTXT": ""Exception call timeout"" } | { "NotesTXT": ""Exception call timeout"", "actionCD": "error" } |
What you see is not what you entered; by the time PARSE_JSON runs, it is parsing exactly what you noted is "valid JSON".
The answer is common to many computing environments: the environment reads your input and acts on some of it. Here, the SQL parser reads your SQL, sees the single \ in the valid JSON, thinks you are starting an escape sequence, and then complains about commas being in the wrong place.
Bash (or PowerShell), Python, even Java require you to understand the difference between the content of a string (i.e., your valid JSON) and how you have to represent it so it gets past the language parser.
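To make that two-layer escaping concrete, here is a small sketch (in Python purely for illustration; the same replace logic works in PowerShell). The helper name `sql_literal_escape` is hypothetical, but the rule it applies matches the working example above: double every backslash so the SQL parser's escape pass restores the original JSON, and double single quotes per standard SQL:

```python
import json

# The JSON we actually want Snowflake to receive (its content includes \" escapes).
valid_json = '{ "actionCD": "error", "NotesTXT": "\\"Exception call timeout\\"" }'
json.loads(valid_json)  # proves this string is valid JSON as-is

def sql_literal_escape(s: str) -> str:
    """Escape a string for embedding inside a single-quoted SQL string literal.
    The SQL parser consumes one level of backslash escapes, so each backslash
    is doubled; single quotes are doubled per standard SQL."""
    return s.replace("\\", "\\\\").replace("'", "''")

escaped = sql_literal_escape(valid_json)
statement = f"SELECT PARSE_JSON('{escaped}') AS json;"
# 'escaped' now contains \\" wherever the JSON had \", which is exactly the
# form PARSE_JSON accepted in the "invalid JSON" example earlier.
```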
So how should you "INSERT JSON into Snowflake"? One general answer: not via INSERT commands at all if it's high volume (bulk loading via COPY INTO is preferred). Or, if you don't want to make the string parser-safe, you can Base64-encode the data (in PowerShell) and insert BASE64_DECODE_STRING(&lt;the encoded string&gt;),
which looks like eyAiYWN0aW9uQ0QiOiAiZXJyb3IiLCAiTm90ZXNUWFQiOiAiXCJFeGNlcHRpb24gY2FsbCB0aW1lb3V0XCIiIH0=
thus
SELECT PARSE_JSON(base64_decode_string('eyAiYWN0aW9uQ0QiOiAiZXJyb3IiLCAiTm90ZXNUWFQiOiAiXCJFeGNlcHRpb24gY2FsbCB0aW1lb3V0XCIiIH0=')) as json_from_B64;
gives:
JSON_FROM_B64 |
---|
{ "NotesTXT": ""Exception call timeout"", "actionCD": "error" } |
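To show where that Base64 string comes from, here is an illustrative sketch (in Python rather than PowerShell, just to demonstrate the encoding step); encoding the JSON sidesteps SQL string escaping entirely, since no quotes or backslashes survive into the SQL text:

```python
import base64

# The valid JSON payload, as PowerShell's ConvertTo-Json might emit it.
payload = '{ "actionCD": "error", "NotesTXT": "\\"Exception call timeout\\"" }'

# Encode to Base64; the result is the same string used in the SELECT above.
encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")

# Embed the encoded string in the SQL; only [A-Za-z0-9+/=] appears in the literal.
sql = f"SELECT PARSE_JSON(BASE64_DECODE_STRING('{encoded}')) AS json_from_b64;"
```

In PowerShell the equivalent encoding step is `[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($jsonString))`.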