postgresqlplpgsql

why functions that returns tables are so much slower then running the actual query?


I'm pretty new to PostgreSQL so I guess i'm missing some basic information, information that I didn't quite find while googling, guess I didn't really know the right keywords, hopefully here I'll get the missing information :)

I'm using PostgreSQL 11.4.

I've encountered many issues when I create a function that returns a query result as a table, and it executes it about 50 times slower then running the actual query, sometimes even more then that.

I understand that IMMUTABLE can be used when there is no table scans, just when I manipulate and return data based on the function parameters and STABLE when if the query with same parameters do a table scan and always returns the same results.

so the format of my function creation is this:

CREATE FUNCTION fnc_name(columns...)
RETURNS TABLE ( columns..) STABLE AS $func$
BEGIN
 select ...
END $func$ LANGUAGE pgplsql;

I can't show the query here since it's work related, but still... there is something that I didn't quite understand about creating functions why is it so slow ? I need to fully understand this issue cause I need to create many more functions and it seems right now that I need to run the actual query to get proper performance instead of using functions and I still don't really have a clue as to why!

any information regarding this issue would be greatly appreciated.


Solution

  • All depends on usage of this function, and size of returned relation.

    First I have to say - don't write these functions. It is known antipattern. I'll try to explain why. Use views instead.

    1. Result of table functions written in higher PL languages like Perl, Python or PLpgSQL is materialized. When table is small (to work_mem) it is stored in memory. Bigger tables are stored in temp file. It can have significant overhead.

    2. Function is a black box for optimizer - is not possible to push down predicates, there are not correct statistics, there is not possible to play with form of joins or order of joins. So some not trivial queries can be slower (little bit or significantly) due impossible optimizations.

    There is a exception from these rules - simple SQL functions. SQL functions (functions with single SQL statement) can be inlined (when some prerequisites are true). Due inlining the body of function is merged to body of outer SQL query, and the result is same like you will write subquery directly. So result is not materialized and it is not a barrier for optimization.

    There is a basic rule - use functions only when you cannot to calculate some data by SQL. Don't try to hide SQL or encapsulate SQL (elsewhere - for simplification some complex queries use views not functions). Same rules are valid for all SQL databases (Oracle, DB2, MSSQL). Postgres is not a exception.

    This note is not against stored procedures (functions). It is great technology. But it requires specific style of programming. Wrapping queries into functions (when there is not any other) is bad.