oracle-databasefunctionplsqlmerge

PL SQL: inner Function within Merge is called more often than expected


I got a setup where I would like to call a function once per group and apply the result to multiple rows (multiple eans). However, the function is called as many times as I have eans. This might be some internal statement rewriting, but I would like to avoid this. Can you help out?

drop table dummy_data;
drop table dummy_log;
drop function write_dummy_log;

create table dummy_data(ean VARCHAR2(10), my_group VARCHAR2(10), ean_list VARCHAR2(1000), return_val int);
create table dummy_log(log_entry VARCHAR2(1000));

insert into dummy_data (ean, my_group) values ('ean11', 'groupA');
insert into dummy_data (ean, my_group) values ('ean22', 'groupA');
commit ;
select *  from dummy_data;
select *  from dummy_log;



create function write_dummy_log(p_ean_list VARCHAR2, p_group VARCHAR2) return int
as
pragma autonomous_transaction;
begin
    insert into dummy_log (log_entry) values ( 'I use ' || p_ean_list || ' and ' || p_group);
    commit ;
    return 0;
end;

merge into dummy_data t1
using ( select /*+ no_merge */ my_group
                      , ean_list
                      , write_dummy_log(ean_list, my_group) as ret_val
                   from ( select my_group
                               , listagg(ean, ',') as ean_list
                            from dummy_data
                           where 1 = 1
                           group by my_group
               )
      ) t2
on ( t1.my_group = t2.my_group )
when matched then
    update
       set
           t1.return_val = t2.ret_val
         , t1.ean_list   = t2.ean_list
     where 1 = 1;

commit ;
select *  from dummy_data; -- will show expected values anyways
select *  from dummy_log;  -- will show expected values because of /*+ no_merge */ hint

Running the above will end up in two entries within the logging table dummy_log. I would like to have only one entry per group. Can you explain why this happens? How can I rewrite this to call the function only once per group? Solution: with no_merge hint this is okay.

Best, Peter

Edit 4/22/2025: added an executable minimal example. Added the solution within the code.


Solution

  • It's unclear what your issue is, because you haven't told us what data is in the table. Your SQL:

    merge into table1 t1
    using 
    ( select * from
        (select group, my_func(ean_list) as result from
            ( Select group, listagg(ean,',') as ean_list from table1 group by group )
        )
     ) t2
    on (t1.group = t2.group)
    when matched then update set t1.res = t2.result;
    

    does a group by group, so if you have more than one group value (of course you would, otherwise you wouldn't be aggregating at all), then you are going to end up with multiple rows feeding into the parent block, and you can expect my_func to be called as many times.

    If your issue is that ean_list is the same for multiple rows and you want to call the function only once per unique argument, then you need to declare your function with the deterministic option:

    create or replace function my_func(ean IN varchar2) return varchar2 deterministic as...

    Lastly, SQL is a declarative syntax, not a programming language. We tell the database what we want, and it figures out on its own exactly how to go about getting it. That means it is free to reorder, rewrite, merge, split up, etc... the pieces of our query as it pleases to accomplish the task in what it believes to be the optimal manner. Hence, SQL gives us no promises about when a function (for example) is going to be called, and that means we cannot control exactly how many times it's called, either.

    But, there are ways to force it with careful manipulation. The main concern here is "view merging," where Oracle will collapse nested query blocks into a single block. It often postpones aggregation until later than where we've declared it, and that can cause your function to fire more than it needs to. If Oracle is merging your using clause with the top merge block, for example, that may result in the function being called for every row in table1 (after the join) rather than once for each row in t2 (before the join). You can check this by reading your query's explain plan. If view merging is your issue, you can prevent that with a /*+ no_merge */ hint or adding a dummy column like rownum that forces an immediate sort and disallows view merging:

    merge into table1 t1
    using 
    ( select /*+ no_merge */ * from
        (select group, my_func(ean_list) as result from
            ( Select group, listagg(ean,',') as ean_list from table1 group by group )
        )
     ) t2
    on (t1.group = t2.group)
    when matched then update set t1.res = t2.result;
    

    The same thing can happen with nested blocks (query blocks beneath your function), but in your case, since you are using the result of listagg (which can only be computed during aggregation) from the child block in your function call, you can be assured that the group by will happen prior to the function call without needing to add further hints. So if there is view merging happening that's multiplying calls to the function, it'd have to be a merge with the parent.