postgresqlmulti-tenantrow-level-security

Is RLS with policy in PostgreSQL using index on the filter


I have noticed that implicitly specifying WHERE tenant_id = ? clause drastically improves performance over USING (tenant_id = get_tenant_id()) and I don't really understand why.

I have two tenants - 1 and 2.

Tenant 1 has around 100000 rows.

Tenant 2 has around 6000 rows.

SET app.tenant_id = 2; SET ROLE user;
EXPLAIN ANALYZE SELECT count(*) FROM people;
RESET ROLE; RESET app.tenant_id;

Aggregate  (cost=32570.08..32570.09 rows=1 width=8) (actual time=62.855..62.856 rows=1 loops=1)
  ->  Index Only Scan using index_people_on_tenant_id on people  (cost=0.29..32429.13 rows=56381 width=0) (actual time=58.639..62.642 rows=6000 loops=1)
        Filter: (tenant_id = get_tenant_id())
        Rows Removed by Filter: 101125
        Heap Fetches: 57059
Planning Time: 1.979 ms
Execution Time: 65.704 ms

As you can see, the filter is being applied and it is correctly removing 101125 rows. But the execution time is super slow.

Now, when executing a statement with an implicit where clause.

SET app.tenant_id = 2; SET ROLE user;
EXPLAIN ANALYZE SELECT count(*) FROM people where tenant_id = 2;
RESET ROLE; RESET app.tenant_id;

Aggregate  (cost=1818.55..1818.56 rows=1 width=8) (actual time=13.079..13.080 rows=1 loops=1)
  ->  Index Only Scan using index_people_on_tenant_id on people  (cost=0.29..1810.77 rows=3112 width=0) (actual time=0.467..12.348 rows=6000 loops=1)
        Index Cond: (tenant_id = 2)
        Filter: (tenant_id = get_tenant_id())
        Heap Fetches: 11826
Planning Time: 1.867 ms
Execution Time: 15.641 ms

Why is the filter not using index for the filtering?

The policy definition

ALTER TABLE people_policy ENABLE ROW LEVEL SECURITY;

CREATE POLICY people_policy ON people FOR ALL TO user USING (tenant_id = get_tenant_id());
CREATE OR REPLACE FUNCTION get_tenant_id() RETURNS BIGINT AS $$
BEGIN
  RETURN NULLIF(current_setting('app.tenant_id', TRUE), '')::BIGINT;
END;
$$ LANGUAGE plpgsql;

Solution

  • Include STABLE in the function's declaration to inform PostgreSQL that its value will not change during the execution of a single statement, e.g.:

    CREATE OR REPLACE FUNCTION get_tenant_id() RETURNS BIGINT STABLE AS $$
    BEGIN
      RETURN NULLIF(current_setting('app.tenant_id', TRUE), '')::BIGINT;
    END;
    $$ LANGUAGE plpgsql;
    

    Using STABLE allows PostgreSQL to evaluate the function once per statement instead of calling it for each row. This can have a significant impact on performance when querying a large number of rows.