In Snowflake, I'm trying to save a large amount of code size and maintenance by creating a UDTF to show flags for each row v
based on presence of rows in x
or y
meeting some pattern.
The error I get is SQL Error [2031] [42601]: SQL compilation error: Unsupported subquery type cannot be evaluated
Here's some minimal example code:
DROP TABLE IF EXISTS v;
CREATE TABLE v (CHARTGUID NUMBER(16));
INSERT INTO v VALUES(1410522400170);
INSERT INTO v VALUES(1548089000170);
DROP TABLE IF EXISTS x;
CREATE TABLE x (CHARTGUID NUMBER(16), FOO INT, BAR VARCHAR(1));
INSERT INTO x(CHARTGUID, FOO, BAR) VALUES (1410522400170, 1, 'a');
INSERT INTO x(CHARTGUID, FOO, BAR) VALUES (1548089000170, 0, 'a');
DROP TABLE IF EXISTS y;
CREATE TABLE y (CHARTGUID NUMBER(16), FOO INT, BAR VARCHAR(1));
INSERT INTO y(CHARTGUID, FOO, BAR) VALUES (1410522400170, 0, 'a');
INSERT INTO y(CHARTGUID, FOO, BAR) VALUES (1548089000170, 1, 'a');
INSERT INTO y(CHARTGUID, FOO, BAR) VALUES (1548089000170, 1, 'b');
CREATE OR REPLACE FUNCTION TEST_C(CHART_GUID NUMBER, p VARCHAR)
RETURNS TABLE(A INT, B INT)
AS
$$
SELECT
CASE WHEN EXISTS (
SELECT x.CHARTGUID
FROM x
WHERE x.CHARTGUID = CHART_GUID
AND x.FOO = 1
AND x.BAR = p
) THEN 1 ELSE 0 END,
CASE WHEN EXISTS (
SELECT y.CHARTGUID
FROM y
WHERE y.CHARTGUID = CHART_GUID
AND y.FOO = 1
AND y.BAR = p
) THEN 1 ELSE 0 END,
$$
;
SELECT
v.CHARTGUID,
x.*
FROM v,
TABLE(TEST_C(v.CHARTGUID, 'a')) x
The true use case has many TABLE(TEST_C(v.CHARTGUID, 'a'))
, where 'a'
is a regex pattern and its presence in the two tables x
and y
needs to be checked for each pattern.
I want to avoid this sort of pattern:
SELECT
CASE WHEN EXISTS (
SELECT x.CHARTGUID
FROM x
WHERE x.CHARTGUID = v.CHARTGUID
AND x.BAR LIKE 'a%'
) THEN 1 ELSE 0 END AS "Ax",
CASE WHEN EXISTS (
SELECT y.CHARTGUID
FROM y
WHERE y.CHARTGUID = v.CHARTGUID
AND y.BAR LIKE 'a%'
) THEN 1 ELSE 0 END AS "Ay",
CASE WHEN EXISTS (
SELECT x.CHARTGUID
FROM x
WHERE x.CHARTGUID = v.CHARTGUID
AND x.BAR LIKE 'b%'
) THEN 1 ELSE 0 END AS "Bx",
CASE WHEN EXISTS (
SELECT y.CHARTGUID
FROM y
WHERE y.CHARTGUID = v.CHARTGUID
AND y.BAR LIKE 'b%'
) THEN 1 ELSE 0 END AS "By",
CASE WHEN EXISTS (
SELECT x.CHARTGUID
FROM x
WHERE x.CHARTGUID = v.CHARTGUID
AND x.BAR LIKE 'c%'
) THEN 1 ELSE 0 END AS "Cx",
CASE WHEN EXISTS (
SELECT y.CHARTGUID
FROM y
WHERE y.CHARTGUID = v.CHARTGUID
AND y.BAR LIKE 'c%'
) THEN 1 ELSE 0 END AS "Cy",
-- etc...
FROM v
What's the workaround for snowflakes 'subquery limitation'?
So the tables can be written differently, that I feel is cleaner as there are less characters, thus it's more readable, and they are atomic operations:
CREATE or replace TABLE v (CHARTGUID number)
as select * from values
(1410522400170),
(1548089000170);
CREATE or replace TABLE x (CHARTGUID number, FOO number, BAR text)
as select * from values
(1410522400170, 1, 'a'),
(1548089000170, 0, 'a');
CREATE or replace TABLE y (CHARTGUID number, FOO number, BAR text)
as select * from values
(1410522400170, 0, 'a'),
(1548089000170, 1, 'a'),
(1548089000170, 1, 'b');
I would be inclined to pre-process the data so it can be equi join base (left or normal) with CTE's
WITH xx as (
select distinct chartguid, bar
from x
where foo = 1
), yy as (
select distinct chartguid, bar
from y
where foo = 1
)
SELECT
v.chartguid,
NVL2(xx.chartguid, 1,0) as x1,
NVL2(yy.chartguid, 1,0) as x2
FROM v
LEFT JOIN xx
on xx.chartguid = v.chartguid
and xx.bar = 'a'
LEFT JOIN yy
on yy.chartguid = v.chartguid
and xx.bar = 'a'
This does not give you "one clever function to hide all the workings" which is generally a gross trick, but if you need clever, you could make a UDTF that joins those two+ CTE's together before join the main table, but this might do way too much work.
With the extended case I would more likely write it like:
WITH xx as (
select
chartguid,
count_if(bar LIKE 'a%') as "Ax",
count_if(bar LIKE 'b%') as "Bx",
count_if(bar LIKE 'c%') as "Cx",
from x
group by 1
), yy as (
select
chartguid,
count_if(bar LIKE 'a%') as "Ay",
count_if(bar LIKE 'b%') as "By",
count_if(bar LIKE 'c%') as "Cy",
from y
group by 1
)
SELECT
v.chartguid,
xx.* exclude(chartguid),
yy.* exclude(chartguid)
FROM v
LEFT JOIN xx
on xx.chartguid = v.chartguid
LEFT JOIN yy
on yy.chartguid = v.chartguid
Depending on how your tables are shaped, it might be faster to pre-join to avoid agregating unneed rows from x and y, or it might be way slower.