sql google-bigquery common-table-expression

SQL recursive CTEs to perm array

I'm having some trouble formulating the correct recursive behaviour to get what I want in SQL. I'm limited to the BigQuery environment and I want to avoid using any JavaScript if I can, so I wanted to try and recreate this behaviour just using CTEs if possible. I can do exactly what I want in Python, but have been unable to do it in SQL. Yes I know SQL is not really designed for this, but I do have a reason for it so would like to try and get it to work. I want to be able to provide any length array and return a list of permutations. For example:

INPUT [1, 1, 1] OUTPUT [0, 0, 0]
INPUT [2, 1, 1] OUTPUT [[0, 0, 0], [1, 0, 0]]
INPUT [2, 1, 2] OUTPUT [[0, 0, 0], [1, 0, 0], [0, 0, 1], [1, 0, 1]]
INPUT [1, 5] OUTPUT [[0,0], [0,1], [0,2], [0,3], [0,4]]

I don't mind how the inputs and outputs are formatted as long as I can parse them one way or another. E.g. returning rows of strings are fine, each element in a separate row is fine as I can use array_agg, etc.

In Python doing this recursively or using for-loops is pretty trivial. I start with a list of zeroes and work my way along each index, incrementing the value at that index until it reaches input value-1 and store each permutation before moving on.

In SQL this is what I have so far, which is only really the first step:

WITH RECURSIVE
  OrigSeq AS (SELECT [2, 1, 2] as some_numbers), -- input sequence
  BaseSeq AS (SELECT ARRAY(SELECT 0 as a FROM UNNEST(OrigSeq.some_numbers)) AS base FROM OrigSeq), -- get array of 0s of same size as input sequence
  Sequences AS (
    (SELECT 0 AS perm_id, 0 as idx, BaseSeq.base[0] as iter, OrigSeq.some_numbers[0]-1 AS orig, BaseSeq.base as base, OrigSeq.some_numbers as num FROM OrigSeq, BaseSeq)
    UNION ALL
    -- increment index
    SELECT perm_id, idx+1 as idx, base[idx+1] as iter, num[idx+1]-1 as orig, base, num FROM Sequences WHERE idx < ARRAY_LENGTH(num)-1
  ),
  Seq2 AS (
    SELECT * FROM Sequences
    UNION ALL
    -- parse iters - not doing this quite right and my brain stops functioning around here
    SELECT perm_id+1, idx, base[idx]+1 as iter, orig, base, num FROM Sequences where iter <= orig
  ),
  Seq3 AS (
    Select * From Seq2
    UNION ALL
    SELECT perm_id, idx, iter, orig, base, num from Sequences
  )
SELECT perm_id, idx, orig, iter
FROM Seq3
ORDER BY perm_id, idx

Any guidance appreciated.

Solution

Recursivity to append

A first recursive loop can compute all possible values at each position,
then a second recursive one will append to an array starting from the empty array.

WITH RECURSIVE
  OrigSeq AS (SELECT ARRAY[2, 1, 2] AS a),
  -- List the indices of our array (which will give us the number of cells too):
  Pos AS (SELECT n, pos FROM OrigSeq, UNNEST(a) AS n WITH OFFSET AS pos),
  -- For each position list its possible values:
  Poss AS
  (
    SELECT n - 1 AS n, Pos.pos FROM Pos -- Per specification, "2" should give a range from 0 to 1 (2 is the count, not the max), so start one step beyond.
    UNION ALL
    SELECT n - 1, pos FROM Poss WHERE n > 0
  ),
  -- Start from an empty array, generate as many arrays of 1 element as there are possibilities for cell 1, and so on:
  Combi AS
  (
    SELECT [] a, 0 next
    UNION ALL
    SELECT ARRAY_CONCAT(a, [ n ]), next + 1
    FROM Combi, Poss
    WHERE Poss.pos = next
  )
-- Keep only the completed arrays:
SELECT * from Combi WHERE next = (SELECT COUNT(1) FROM Pos);

a	lvl
{1,0,1}	3
{0,0,1}	3
{1,0,0}	3
{0,0,0}	3

See the PostgreSQL equivalent in a fiddle:

Recursivity to decrement

There is a way to first generate all positions as posToDecrement,
then recursively starting from the initial numbers array as a, joining unnest(a) to posToDecrement to generate the different combinations, and array_agg(a.value) over (order by pos rows between unbounded preceding and unbounded following), discarding rows where any value falls under 0 (min() < 0).

You'll find a fiddle proof-of-concept for PostgreSQL.

However my intuition tells there is room for improvement:

given a start array of 3 elements, on each iteration I'd like to generate 3 triplets per previous triplet (thus 1 triplet on first iteration, 3 on the second one, 9 on the third, etc.) where each new triplet is the previous, minus 1 on 1 of its fields (thus if iteration 0 was [3,2,4], iteration 1 should produce [2,2,4],[3,1,4],[3,2,3]); relying on a UNION non-ALL to discard duplicates (at iteration 2 two paths lead to [2,1,4]: [2,2-1,4] and [3-1,1,4]).
as UNNEST generates 3 rows per triplet, and a windowed ARRAY_AGG can generate a new array of 3 elements for each of those rows, there should be no need to join anything else than the UNNEST peudo-table.
however I'm stuck trying to ARRAY_AGG(prev.n - CASE WHEN v.pos = CURRENT_ROW THEN 1 ELSE 0 END) FROM UNNEST(…) v
(which already made questions on SO)
thus for now I have only the suboptimal DISTINCT after having joined Pos (listing all possible indices)
another way could be to ARRAY_CONCAT(ARRAY_AGG(v.n) OVER (… TO 1 PRECEDING), v.n - 1, ARRAY_AGG(v.n) OVER (… FROM 1 FOLLOWING))