[SOLVED] Row Level security and partitioning in PowerBI

Row Level security and partitioning in PowerBI

I have a table FactUsers. with fields for Date, CompanyID, GrossCash and 20 other fields. and a measure named TotalGrossCash which is the sum of all the grosscash.

this table is partitioned by year-month (202401, etc) based on the Date field

i have row level security set up on CompanyID.
i have a dimdate table. this is linked to FactUsers based on Date field
I have a dimcompany table
i have a role - this has an RLS filter on dimcompany
Import mode (not directquery)

Q - when a user belonging to companyID 1 queries the table for April 2024 data, what exactly happens behind the scene?

does the engine have to look at the entire table to first filter for records belonging to companyID 1 (the RLS section) or does it immediately start looking at the Apr 2024 partition and then filter based on the RLS filter (companyid = 1)? Basically, is it capable of taking advantage of partitions by quickly going to the single partition even though RLS needs to be applied?

Solution

is it capable of taking advantage of partitions by quickly going to the single partition even though RLS needs to be applied?

First, terminology. In SSAS Tabular/Power BI, all rows are stored in a partition. A table has one or more partitions. Each partition has a separate source query.

So, sort of. The tabular engine (aka Vertipaq) uses compressed, columnar storage. In this storage format the partition is first broken up into segments, or "row groups" of 8M rows each.

Each of these segments is then broken apart into compressed columnar storage structures, and each segment has a header that indicates the min and max value for each column in that segment.

So while the storage engine does not maintain any partition-level metadata that indicates what values are in what partitions, it has segment-level metadata that it can use to skip whole segments in query processing, regardless of whether the table is partitioned.

The order in which your source query returns rows will be the "Z-order" of the rows grouped into segments. So if your table or partition query returns the rows in date order, then queries will be able to efficiently skip all the segments that don't contain rows matching a date filter. Similarly for the CompanyId.

So effectively you get to choose which access paths to optimize by controlling table partitioning and/or the row ordering within each partition.