My query returns many (thousands of) rows.
Column l
has certain value for very small amount of rows (up to 10).
For each such row I want to output aggregated comma-separated values of very short (up to 5 chars) varchar column v
over all of these rows.
For rows not having the special value of l
I want to simply output the v
value for that row.
Synthetized example of same problem: from first 10000 integers, I want to output 1,2,3,4,5,6,7,8,9
for each single-digit number; that number for multiple-digit number. (Yes, silly example but real case makes sense.)
with x (v,l) as (
select to_char(level), length(to_char(level)) from dual connect by level <= 10000
)
select case l
when 1 then listagg(v,',') within group (order by v) over (partition by l)
else v
end
from x
order by 1;
The problem is, listagg
function fails on ORA-01489: result of string concatenation is too long
error.
I am aware of 4000 char limit of listagg
function as well as xmlagg-based workaround. I just don't get the limit is enough for data I want to concatenate even though not enough for all data. In example above, the partition of 9 single-digit numbers fits into 4000 chars, the partition of 9000 four-digit numbers not. I expected the case
expression would prevent execution of window for unrelated rows but, for some reason, it seems the db engine evaluates window for all rows. (Also note that order by
clause causes query to fail-fast - without it some rows are returned before failure.)
Can you please explain some reasoning for this behaviour? I suspect the window computation is logically before select
clause but without any evidence. Reproduced on Oracle 11g, 18c and 19 (livesql).
Well you are using SQL
which is not procedural, so you can't expect that some parts of the code path will not be executed, only because they are not used. (So filling a bug as other suggested will have no success).
Anyway you can do the often used trick based on the fact that listagg
ignores null
values.
So this formulation works fine:
with x (v,l) as (
select to_char(level), length(to_char(level)) from dual connect by level <= 10000
)
select nvl(listagg(case when l = 1 then v end,',') within group (order by v) over (partition by l),v) lst
from x
order by 1;
giving
LST
------------------
1,2,3,4,5,6,7,8,9
1,2,3,4,5,6,7,8,9
..
10
100
1000
10000
The explanation of the problem can be found in the execution plan (showing only the relevant part)
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 35 | 4 (50)| 00:00:01 |
| 1 | SORT ORDER BY | | 1 | 35 | 4 (50)| 00:00:01 |
| 2 | WINDOW SORT | | 1 | 35 | 4 (50)| 00:00:01 |
| 3 | VIEW | | 1 | 35 | 2 (0)| 00:00:01 |
|* 4 | CONNECT BY WITHOUT FILTERING| | | | | |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
...
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) CASE "L" WHEN 1 THEN LISTAGG("V",',') WITHIN GROUP ( ORDER BY
"V") OVER ( PARTITION BY "L") ELSE "V" END [4000]
2 - (#keys=2) "L"[NUMBER,22], "V"[VARCHAR2,40], LISTAGG("V",',') WITHIN
GROUP ( ORDER BY "V") OVER ( PARTITION BY "L")[4000]
3 - "V"[VARCHAR2,40], "L"[NUMBER,22]
4 - LEVEL[4]
So in the line 2 the listagg
is calculated (for all rows) only to be filtered in the line 1.