The DuckDB documentation states that you can create a table starting from a JSON file, with DuckDB automatically constructing the table schema, for instance:
CREATE TABLE todos AS
SELECT * FROM 'todos.json';
But how can I pass JSON as a string?
I have tried:
CREATE TABLE todos AS SELECT * FROM json('[{"familiy": "Mark"}, {"family": "Nej"}]');
But it says:
Catalog Error:
Table Function with name "json" is not in the catalog, a function by this name exists in the json extension, but it's of a different type, namely Macro Function
The use case is that I would like to save data from OpenAlex, for example the list of works of Claudia Goldin, and it would be good to have DuckDB create the schema itself without recurring to manually define a SQLAlchemy ORM table.
json_structure
as structure for from_json
(working in DuckDB 1.2.2)
Thanks to this article from SeaChess, it appeared that asking json_structure
to analyze the items of the JSON array, then passing it to from_json
, will generate a struct with all fields present:
create table todos as
select j.* from (select unnest(from_json('[{"familiy": "Mark"}, {"family": "Nej"}]', json_structure('[{"familiy": "Mark"}, {"family": "Nej"}]'))) j);
select * from todos;
Note that "JSON structure must be a constant!", thus the necessity to pass the JSON string (you cannot call it with a column of an existing table as parameter).
A possibility (used in the aforementioned article) is to put the structure in a variable, which in the final query will be considered as a constant:
create temp table d as select json('[{"familiy": "Mark"}, {"family": "Nej"}]') d;
set variable json_schema = (select json_structure(d) from d);
create table todos as
select j.* from (select unnest(from_json(d, getvariable('json_schema'))) j from d);
select * from todos;
familiy (varchar ) |
family (varchar ) |
---|---|
Mark | NULL |
NULL | Nej |