sqlsql-serverdatabaseindexingcolumnstore

What are 'CLOSED state' and 'delta rowgroup'


In the guide to creating columnstore indexes there is a paragraph about the COMPRESSION_DELAY option.

COMPRESSION_DELAY = 0 | delay [ Minutes ]

For a disk-based table, delay specifies the minimum number of minutes a delta rowgroup in the CLOSED state must remain in the delta rowgroup before SQL Server can compress it into the compressed rowgroup. Since disk-based tables don't track insert and update times on individual rows, SQL Server applies the delay to delta rowgroups in the CLOSED state. The default is 0 minutes.

I don't understand what are the CLOSED state, rowgroup and delta rowgroup mean exactly. Could someone explain this to me?


Solution

  • SQL Server's columnstore format is horizontally organized into groups of rows (rowgroups). Data can be in COMPRESSED form or in BTree-s.

    COMPRESSED rowgroups are immutable and additional structures are used to keep track of the updates (delete bitmap, etc.). Compressed rowgroups take less space and aggregations can be ran on them faster, this way your throughput is drastically increased. The compressed form is fast only above a certain number of rows, otherwise you're better off in rowstore format - aka. delta stores. Once this threshold is crossed these rows are migrated into the compressed form in the background.

    CLOSED state means that this rowgroup reached the threshold to be compressed and it's waiting for compression. No new rows are added to this rowgroup, deletes however can remove rows from it. If the rowcount falls below a min bar the rowgroup can be OPEN-ed again, so it accepts new incoming rows.

    For more details take a look at the official documentation, at Niko's blog or at Remus' blog.