My task:
The variables x
and def
are a counter and a boolean respectively. Both start at zero.
x > 2
then def = 1
from that point onward.def = 1
then in order for it to become zero again, x
has to equal zero for 3 consecutive periods otherwise def
remains 1
.I have a background in sas and am now starting to work more extensively with Postgresql:
data have;
input n x;
datalines;
1 0
2 1
3 2
4 3
5 4
6 2
7 1
8 0
9 1
10 0
11 0
12 0
13 0
14 1
15 2
16 3
17 4
;
run;
data want(drop=prev_: cur_m def rename=def_new=def);
set have;
retain prev_def prev_cur_m;
if x > 2 then def = 1;
else if prev_def = 1 and x > 0 then def = 1;
else def = 0;
if _n_ = 1 then cur_m = -1;
else if def = 1 then cur_m = -1;
else if prev_def = 1 and def = 0 then cur_m = 0;
else if -1 < prev_cur_m < 3 and x > 0 then cur_m = 0;
else if -1 < prev_cur_m < 3 and x = 0 then cur_m + 1;
else cur_m = -1;
if _n_ = 1 then def_new = 0;
else if def = 1 then def_new = 1;
else if -1 < cur_m < 3 then def_new = 1;
else def_new = 0;
output;
prev_def = def;
prev_cur_m = cur_m;
run;
As far as I understood this can be achieved with cursors and attempted to solve the problem using this post. However, I got stuck on the step when I needed to use the values from the previous row in order to calculate the current ones. Here is the code:
create table have (
n int,
x int
) ;
insert into have (n,x)
values (1,0),(2,1),(3,2),(4,3),(5,4),(6,2),
(7,1),(8,0),(9,1),(10,0),(11,0),(12,0),
(13,0),(14,1),(15,2),(16,3),(17,4);
create or replace function calc_dpd()
returns table (
n int,
x int,
def int,
cure_m int,
def_new int )
language plpgsql as $f$
declare rec record;
begin
for rec in (select * from have order by n) loop
if rec.n = 1 then
def = 0;
cure_m = 0;
def_new = 0;
end if;
if x > 2 then
def = 1;
end if;
-- here I need to test if def from previous row = 1 and current def = 0;
select rec.n, rec.x
into n,x;
return next;
end loop;
end $f$;
PL/pgSQL can do it but you don't need its complexity and overhead to achieve what you need: it's all doable with window functions in plain, declarative SQL: demo
0=all(array[])
checks if all elements in the array are 0
lag(x,that_many_rows_back,0)
gets the value x
from that many rows back. Or 0
if there aren't that many rows prior to this one.every()
or bool_and()
as a window function, and combine that with a frame start clause so that the window frame only reaches 2 rows back.over()
clause, or separately in window
clause. The latter is preferable if it's non-trivial or you need to re-use it.bool_or()
checks if any of the values up to this one met the criteria. Here it's limited to a window frame, so it only concerns the rows since the last time def
switched off after getting 3 zeroes in x
.boolean::int
cast maps false
to 0
, true
to 1
.with cte as (
select *
,(x > 2) exceeded_2
,(every(x=0)over w1) as should_switch
from have
window w1 as (order by n rows 2 preceding) )
,cte2 as (
select *
,sum(should_switch::int)over(order by n) def_on_period
from cte
)
select n,x,(bool_or(exceeded_2) over w2)::int as def
from cte2
window w2 as (partition by def_on_period
order by n);
n | x | def |
---|---|---|
1 | 0 | 0 |
2 | 1 | 0 |
3 | 2 | 0 |
4 | 3 | 1 first x>2 |
5 | 4 | 1 |
6 | 2 | 1 |
7 | 1 | 1 |
8 | 0 | 1 |
9 | 1 | 1 |
10 | 0 | 1 |
11 | 0 | 1 |
12 | 0 | 0 third zero in x |
13 | 0 | 0 |
14 | 1 | 0 |
15 | 2 | 0 |
16 | 3 | 1 x>2 |
17 | 4 | 1 |