I've got a large JSON field that I'm trying to parse out using JSON_TABLE. About half the data is straightforward, like:
"run_summary":{"num_records":6,"num_runtime_errors":0,"num_checkpoint_failed":4,"num_files":4}
But the other half contains escaped quotes, such as:
"current_review_id":"{\"value\": \"b13d2f43-7abb-40f8-9867-6349c0c84c47\", \"recorded_timestamp\": 1696865721868}"
I have no trouble breaking out the columns from the run_summary, i.e.
num_records int PATH '$.run_summary.num_records'
However, I can't figure out how to get at the ones inside the escaped quotes...this does not work, for example:
value varchar(40) PATH '$.flow_job_metrics.current_review_id.value'
...which just returns null.
I have tried putting additional quotes inside the path, like:
'$.flow_job_metrics.current_review_id."value"'
and other crazy things like:
'$.flow_job_metrics.current_review_id.\"value\"'
but these either result in syntax errors or more null values.
Can anyone help me figure out how to get at these?
Since the JSON is actually double-encoded, that is, part of your JSON is just a string, which is part of the "outer" JSON document, you'll have to double-decode it:
create table mytable (
id serial primary key,
data json
);
insert into mytable set data = '{"current_review_id":"{\\"value\\": \\"b13d2f43-7abb-40f8-9867-6349c0c84c47\\", \\"recorded_timestamp\\": 1696865721868}"}';
select id, j1.current_review_id, j2.value, j2.recorded_timestamp from mytable
cross join json_table(data, '$' columns (
current_review_id json path '$.current_review_id'
)) as j1
cross join json_table(json_unquote(j1.current_review_id), '$' columns (
value varchar(36) path '$.value',
recorded_timestamp bigint path '$.recorded_timestamp'
)) as j2;
Output, tested on MySQL 8.4.3:
*************************** 1. row ***************************
id: 1
current_review_id: "{\"value\": \"b13d2f43-7abb-40f8-9867-6349c0c84c47\", \"recorded_timestamp\": 1696865721868}"
value: b13d2f43-7abb-40f8-9867-6349c0c84c47
recorded_timestamp: 1696865721868