sqlsql-servercolumnstore

Retrieve the correct disk space usage of a columnstore index in MSSQL


I was trying to retrieve the exact amount of bytes required by a clustered columnstore index in MSSQL. Unfortunately I get different result from sp_spaceused compared to a simple query on sys.allocation_units.

Here the simple script:

-- create table
CREATE TABLE TestColumnStore (Id INT IDENTITY(1,1) NOT NULL, Val VARCHAR(128));
CREATE CLUSTERED COLUMNSTORE INDEX [IX_TestColumnStore] ON [TestColumnStore];
GO
-- append 1000 rows
DECLARE @i INT = 0;
WHILE @i < 1000
    BEGIN
    INSERT INTO dbo.TestColumnStore (Val) VALUES ('Jhon');
    SET @i = @i+1;
    END

Here the sp_spaceused result:

name             rows       reserved    data          index_size         unused
---------------- ---------- ----------- ------------- ------------------ ------------------
TestColumnStore  1000       144 KB      48 KB         16 KB              80 KB

Then I try to get the amount of occupied pages querying directly sys.allocation_units using this query:

SELECT 
    al.total_pages * 8                       AS TotalSpaceKB,
    al.used_pages * 8                        AS UsedSpaceKB,
    (al.total_pages - al.used_pages) * 8 AS UnusedSpaceKB,
    al.allocation_unit_id,
    p.partition_id,
    o.[name],
    al.type_desc,
    p.data_compression_desc
FROM
    sys.allocation_units al
    JOIN sys.partitions p ON al.container_id = p.partition_id
    LEFT JOIN sys.objects o ON o.object_id = p.object_id
WHERE
    o.name = 'TestColumnStore'

And of course the result is apparently exactly the half returned by sp_spaceused.

TotalSpaceKB         UsedSpaceKB          UnusedSpaceKB        allocation_unit_id   partition_id         name                  type_desc          data_compression_desc
-------------------- -------------------- -------------------- -------------------- -------------------- --------------------- ------------------ ----------------------
0                    0                    0                    72057594050183168    72057594043629568    TestColumnStore       IN_ROW_DATA        COLUMNSTORE
72                   16                   56                   72057594050248704    72057594043629568    TestColumnStore       LOB_DATA           COLUMNSTORE

Can anyone help me to figure out what am I missing? Which is the correct way to get the exact table size in bytes?


Solution

  • This is the final query that returns the exact same size as the sp_spaceused do:

    SELECT 
        SUM(TotalSpaceKB) TotalSpaceKB,
        SUM(UsedSpaceKB) UsedSpaceKB,
        SUM(UnusedSpaceKB) UnusedSpaceKB
    FROM (
        SELECT 
            al.total_pages * 8                       AS TotalSpaceKB,
            al.used_pages * 8                        AS UsedSpaceKB,
            (al.total_pages - al.used_pages) * 8 AS UnusedSpaceKB,
            al.allocation_unit_id,
            p.partition_id,
            o.[name],
            al.type_desc,
            p.data_compression_desc
        FROM
            sys.allocation_units al
            JOIN sys.partitions p ON al.container_id = p.partition_id
            JOIN sys.objects o ON o.object_id = p.object_id
        WHERE
            o.name = 'TestColumnStore'
        UNION
        SELECT 
            al.total_pages * 8                       AS TotalSpaceKB,
            al.used_pages * 8                        AS UsedSpaceKB,
            (al.total_pages - al.used_pages) * 8 AS UnusedSpaceKB,
            al.allocation_unit_id,
            NULL AS partition_id,
            o.[name],
            al.type_desc,
            NULL AS data_compression_desc
        FROM
            sys.column_store_row_groups csrg 
            JOIN sys.allocation_units al ON al.container_id = csrg.delta_store_hobt_id
            JOIN sys.objects o ON csrg.object_id = o.object_id
        WHERE
            o.name = 'TestColumnStore'
    ) a
    GROUP BY a.name