I have a table with line numbers and either a "define" or an "undefine" event of an identifier. Example:
line_no | def  | undef
----------------------
1       | 'a'  | NULL
2       | 'b'  | NULL
3       | NULL | 'a'
...
42      | NULL | 'b'
I want to compute the "live variables" information on every line. Iteratively, I'd just write code like the following pseudocode:
live = []
for each r in table:
    if r.def:
        live.append(r.def)
    else:
        live.remove(r.undef)
    store(r.line_no, live)
The expected result is a table like:
line_no | live
--------------
1       | ['a']
2       | ['a', 'b']
3       | ['b']
...
42      | []
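For reference, the pseudocode above can be turned into a small runnable Python script (the table is represented here as a list of tuples, and `def_` is used because `def` is a Python keyword). One subtlety: a copy of the list has to be stored on every line, since the list keeps mutating on later iterations.

```python
rows = [
    (1, 'a', None),
    (2, 'b', None),
    (3, None, 'a'),
    (42, None, 'b'),
]

def live_variables(rows):
    """Compute the list of live variables after each line."""
    live = []
    result = {}
    for line_no, def_, undef in rows:
        if def_ is not None:
            live.append(def_)
        else:
            live.remove(undef)
        # store a snapshot, not the list object itself, since it mutates
        result[line_no] = list(live)
    return result

# live_variables(rows) == {1: ['a'], 2: ['a', 'b'], 3: ['b'], 42: []}
```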
I managed to write the equivalent sequential loop as a PL/pgSQL function. However, I was wondering whether there is a (perhaps more elegant) way to express this as a single SQL query. I tried various things using lag(), but that never gave me the running "reduction" I am looking for.
(bonus points if the query can also partition over a field, such as 'filename')
If you want a pure SQL solution, use a recursive query.
with recursive cte as (
    select line_no, array[def] as list
    from my_table
    where line_no = 1
    union all
    select
        t.line_no,
        case when t.def is null
            then array_remove(c.list, t.undef)
            else array_append(c.list, t.def)
        end
    from my_table t
    join cte c on c.line_no = t.line_no - 1
)
select *
from cte;
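For the bonus question, the same reduction simply restarts for every file: in SQL, the anchor branch of the recursion would select the first line of each filename, and the recursive join would match on both filename and line_no - 1 (untested sketch). The intended per-file semantics, with a hypothetical filename column, can be illustrated in Python:

```python
from itertools import groupby

# (filename, line_no, def, undef) rows, assumed sorted by
# (filename, line_no) -- mirroring a partition by filename
rows = [
    ('f1', 1, 'a', None),
    ('f1', 2, 'b', None),
    ('f1', 3, None, 'a'),
    ('f2', 1, 'x', None),
    ('f2', 2, None, 'x'),
]

def live_per_file(rows):
    out = {}
    for fname, group in groupby(rows, key=lambda r: r[0]):
        live = []  # the accumulator resets for every file
        for _, line_no, def_, undef in group:
            if def_ is not None:
                live.append(def_)
            else:
                live.remove(undef)
            out[(fname, line_no)] = list(live)
    return out
```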
However, a more efficient and flexible solution in this case may be to create a function. (Note that the recursive query also relies on line_no values being consecutive without gaps, since each step joins on line_no - 1.)
create or replace function list_of_vars()
returns table (line_no int, list text[])
language plpgsql as $$
declare
    arr text[];
    rec record;
begin
    for rec in
        select *
        from my_table
        order by line_no
    loop
        if rec.def is null then
            arr := array_remove(arr, rec.undef);
        else
            arr := array_append(arr, rec.def);
        end if;
        line_no := rec.line_no;
        list := arr;
        return next;
    end loop;
end
$$;
Use:
select *
from list_of_vars()
order by line_no;