I'am looking to increment the current calculated value as a field in Athena AWS, that is possible in Qlik with peek() function, the goal is to create and increment the state field in each investor or roject activationp state, here is my DATA:
PROJECT | INVESTOR | DATE | IS_ACTIF |
---|---|---|---|
P1 | I1 | 010124 | 1 |
P1 | I1 | 020124 | 1 |
P1 | I1 | 030124 | 0 |
P1 | I1 | 040124 | 1 |
P1 | I2 | 050124 | 1 |
P1 | I2 | 060124 | 0 |
Here is the desired result:
PROJECT | INVESTOR | DATE | IS_ACTIF | STATE |
---|---|---|---|---|
P1 | I1 | 010124 | 1 | 1 |
P1 | I1 | 020124 | 1 | 1 |
P1 | I1 | 030124 | 0 | 2 |
P1 | I1 | 030124 | 1 | 3 |
P1 | I2 | 050124 | 1 | 1 |
P1 | I2 | 060124 | 0 | 2 |
How cana'I achieve this with Athena SQL please?
WITH QLIK:
STATE:
NoConcatenate
LOAD DISTINCT
PROJECT,
INVESTOR,
DATE,
IS_ACTIF,
IF(RowNo()=1 OR Previous(PROJECT&INVESTOR)<>PROJECT&INVESTOR,1,
IF( IS_ACTIF=Previous(IS_ACTIF)
,peek(STATE),peek(STATE)+1) ) AS STATE
RESIDENT DATA ORDER BY PROJECT, INVESTOR,DATE;
It's not an actual alternative to peek()
from Qlik, but it can be done with a combination of window-functions in Athena:
SELECT PROJECT,
INVESTOR,
"DATE",
IS_ACTIF,
SUM(ABS(is_actif-lagged_actif)) OVER (PARTITION BY PROJECT, INVESTOR ORDER BY row_id) AS "STATE"
FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY PROJECT, INVESTOR) as row_id, LAG(is_actif,1, 0) OVER(PARTITION BY PROJECT, INVESTOR) as lagged_actif
FROM TEST_SO
) a
Basically it does a cumulative sum on the absolute difference between each state and it's previous state.