mysqlsqldeprecation-warninguser-variables

MySQL 8 - user variables within expressions is deprecated (complex calculation)


// EDIT //

Below is the original question, but my problem can be really simplified with one question... how can I get the below output without using user variables?!

SELECT ID, @cumu_return:=IF(id = 1, 1, @cumu_return + (@cumu_return * ret)) AS cumulative_return 
FROM (
    SELECT 1 AS ID, 1 AS num, 0.1 AS ret UNION ALL
    SELECT 2 AS ID, 1 AS num, 0.1 AS ret UNION ALL
    SELECT 3 AS ID, 1 AS num, 0.1 AS ret UNION ALL
    SELECT 4 AS ID, 1 AS num, 0.1 AS ret UNION ALL
    SELECT 5 AS ID, 1 AS num, 0.1 AS ret
) t

// END EDIT //

I have a table as follows...

CREATE TABLE `daily_return` (
  `id` int(11) NOT NULL,
  `list_id` int(11) NOT NULL,
  `last_updated` datetime NOT NULL DEFAULT '2000-01-01 00:00:00',
  `daily_return` float NOT NULL,
  `last_return` float NOT NULL, KEY (`id`)
)

'daily return' is a percentage change in the price for a list_id, calculated every day and the last_return is the first/starting return number.

I need to calculate a cumulative return metric using the following logic:

'Previous Day Return' + ('Previous Day Return' * 'Daily Return')

Hence, I have the following query...

INSERT INTO cmc_cumulative_return (list_id, last_updated, cumulative_return)
SELECT list_id, last_updated, cumulative_return FROM (
    SELECT id, list_id, last_updated, daily_return, last_return,
        @cumu_return:=IF(id = 1, last_return, @cumu_return + (@cumu_return * daily_return)) AS cumulative_return 
    FROM daily_return c
) t WHERE id <> 1;

When I run this inside a procedure, it throws a warning:

Setting user variables within expressions is deprecated and will be removed in a future release. Please set variables in separate statements instead.

I have seen some other similar questions on stackoverflow, but they were all simple increment calculations that can be replaced by ROW_NUMBER() OVER or SUM() OVER, but I am unable to figure out how to remove variables in my query above.

Originally, I was using 3-4 variables, but now I have got it down to one and request your help removing the last one.

EDIT 2

Query with actual data: (Gordon's calculation included)

SELECT id, list_id, last_updated, daily_return, last_return,
    @cumu_return:=IF(id = 1, last_return, @cumu_return + (@cumu_return * daily_return)) AS cumulative_return,
    (
        EXP(SUM(LN(1 + daily_return)) OVER (ORDER BY id)) / (1 + daily_return)
    ) as cumulative_return2
FROM (
    SELECT 1 AS id, 2 AS list_id, '2019-02-20' AS last_updated, 0 AS daily_return, 1.15 AS last_return UNION ALL
    SELECT 2 AS id, 2 AS list_id, '2019-02-21' AS last_updated, 0.0145999858 AS daily_return, 1.15 AS last_return UNION ALL
    SELECT 3 AS id, 2 AS list_id, '2019-02-22' AS last_updated, -0.0503679203 AS daily_return, 1.15 AS last_return UNION ALL
    SELECT 4 AS id, 2 AS list_id, '2019-02-23' AS last_updated, 0.0111594238 AS daily_return, 1.15 AS last_return
) t

Solution

  • You can use natural logs and exponentiation for the cumulative product calculation.

    INSERT INTO cmc_cumulative_return (list_id, last_updated, cumulative_return)
        SELECT list_id, last_updated, 
               (MAX(CASE WHEN id = 1 THEN last_return END) OVER () *
                EXP(SUM(LN(1 + daily_return)) OVER (ORDER BY id)
                   ) / (1 + daily_return)
               ) as cumulative_return
        FROM daily_return c;
    

    If you really want to exclude where id = 1, then that requires an additional level of subqueries.

    Here is a db<>fiddle.