I have an Sqlite table triples
that contains triple information in { id, rel, tgt }
format [1]. I would like to create a view that exposes this triple-format data to "object format", which is more easily consumed by applications reading from this database. In theory sqlite's JSON1 extension would allow me to construct such objects, but I'm struggling.
My current query
select distinct json_object(
'id', id,
rel, json_group_array(distinct tgt)
) as entity from content
group by src, rel, tgt
order by src, rel, tgt
does not work correctly. It produces objects like
{ id: 'a', 'is': ['b'] }
{ id: 'a', 'is': ['c'] }
Rather than
{ id: 'a', 'is': ['b', 'c'] }
It also produces duplicate keys like
{ id: 'a', id: ['a'] }
This is closer, but does not handle IDs correctly. It constructs an array, not a string
create view if not exists entity as
select distinct json_group_object(
rel, json_array(distinct tgt)
) as entity from content
group by src
I think iif
might help
Can you help me adjust my query to produce correct output (see below)? Please comment if anything needs disambiguation or clarification
Input:
Triple Format:
id | rel | tgt
-----------------------
Bob | is | Bob
Bob | is | Person
Bob | age | 20
Bob | likes | cake
Bob | likes | chocolate
Alice | id | Alice
Alice | is | Person
Alice | hates | chocolate
Output:
Object Format [2]:
{
id: Bob,
is: [ Person ],
age: [ 20 ],
likes: [ cake, chocolate ]
}
{
id: Alice,
is: [ Person ],
hates: [ chocolate ]
}
[1] This dataset has unpredictable structure; I can assume no prior knowledge of what 'rel' keys exist beyond id
. A triple <src> id <src>
will exist for every src
parameter.
[2] The objects should have the following format. id
must not be overwritten.
{
id: <id>
<distinct rel>: [
< tgt >
]
}
CREATE TABLE content (
id VARCHAR(32),
rel VARCHAR(32),
tgt VARCHAR(32)
);
INSERT INTO
content
VALUES
('Bob' , 'id' , 'Bob'),
('Bob' , 'is' , 'Person'),
('Bob' , 'age' , '20'),
('Bob' , 'likes', 'cake'),
('Bob' , 'likes', 'chocolate'),
('Alice', 'id' , 'Alice'),
('Alice', 'is' , 'Person'),
('Alice', 'hates', 'chocolate')
WITH
id_rel AS
(
SELECT
id,
rel,
JSON_GROUP_ARRAY(tgt) AS tgt
FROM
content
GROUP BY
id,
rel
)
SELECT
JSON_GROUP_OBJECT(
rel,
CASE WHEN rel='id'
THEN JSON(tgt)->0
ELSE JSON(tgt)
END
)
AS entity
FROM
id_rel
GROUP BY
id
ORDER BY
id
entity |
---|
{"hates":["chocolate"],"id":"Alice","is":["Person"]} |
{"age":["20"],"id":"Bob","is":["Person"],"likes":["cake","chocolate"]} |
You must aggregate in two steps, as your edited code doesn't combine cake
and chocolate
in to a single array of two elements...