sql-serverblobindexed-view

How is BLOB stored in an indexed view?


The Question

Assuming I make an indexed view on a table containing a varbinary(max) column, will the binary content be physically copied into the indexed view's B-Tree, or will the original fields just be "referenced" somehow, without physically duplicating their content?

In other words, if I make an indexed view on a table containing BLOBs, will that duplicate the storage needed for BLOBs?

More Details

When using a full-text index on binary data, such as varbinary(max), we need an additional "filter type" column to specify how to extract text from that binary data so it can be indexed, something like this:

CREATE FULLTEXT INDEX ON <table or indexed view> (
    <data column> TYPE COLUMN <type column>
)
...

In my particular case, these fields are in different tables, and I'm trying to use indexed view to join them together, so they can be used in a full-text index.

Sure, I could copy the type field into the BLOB table and maintain it manually (keeping it synchronized with the original), but I'm wondering if I can make the DBMS do it for me automatically, which would be preferable unless there is a steep price to pay in terms of storage.

Also, merging these two tables into one would have negative consequences of its own, not to go into too much details here...


Solution

  • will that duplicate the storage needed for BLOBs?

    Yes. The indexed view will have its own copy.

    You can see this from

    CREATE TABLE dbo.T1
      (
         ID   INT IDENTITY PRIMARY KEY,
         Blob VARBINARY(MAX)
      );
    
    DECLARE @vb VARBINARY(MAX) = CAST(REPLICATE(CAST('ABC' AS VARCHAR(MAX)), 1000000) AS VARBINARY(MAX));
    
    INSERT INTO dbo.T1
    VALUES      (@vb),
                (@vb),
                (@vb);
    
    GO
    
    CREATE VIEW dbo.V1
    WITH SCHEMABINDING
    AS
      SELECT ID,
             Blob
      FROM   dbo.T1
    
    GO
    
    CREATE UNIQUE CLUSTERED INDEX IX
      ON dbo.V1(ID)
    
    SELECT o.NAME       AS object_name,
           p.index_id,
           au.type_desc AS allocation_type,
           au.data_pages,
           partition_number,
           au.total_pages,
           au.used_pages
    FROM   sys.allocation_units AS au
           JOIN sys.partitions AS p
             ON au.container_id = p.partition_id
           JOIN sys.objects AS o
             ON p.object_id = o.object_id
    WHERE  o.object_id IN ( OBJECT_ID('dbo.V1'), OBJECT_ID('dbo.T1') ) 
    

    Which returns

    +-------------+----------+-----------------+------------+------------------+-------------+------------+
    | object_name | index_id | allocation_type | data_pages | partition_number | total_pages | used_pages |
    +-------------+----------+-----------------+------------+------------------+-------------+------------+
    | T1          |        1 | IN_ROW_DATA     |          1 |                1 |           2 |          2 |
    | T1          |        1 | LOB_DATA        |          0 |                1 |        1129 |       1124 |
    | V1          |        1 | IN_ROW_DATA     |          1 |                1 |           2 |          2 |
    | V1          |        1 | LOB_DATA        |          0 |                1 |        1129 |       1124 |
    +-------------+----------+-----------------+------------+------------------+-------------+------------+