snowflake-cloud-data-platformcost-management

Snowflake - how to track historical storage costs


For Snowflake, primarily, there are compute costs and storage costs (and then smaller miscellaneous items).

For compute costs -- snowflake has a the "warehouse metering" table that indicates cost per warehouse cluster running.

You can combine this with query history (and various sophisticated algorithms depending) to essentially track root cause of costs (and changes) over time.

I'm wondering if there's an analog for "storage" for this --- for the purposes of tracking storage stats over time.

I think snowflake has a detailed "current state" storage information in various places/ options.

I think in terms of historical, you can only see "total storage cost" over time, and that's it. So for instance if you want to track database or schema "size" over time ... there's no built-in table for this in snowflake -- you need to roll your own using snapshots effectively. Is that right, or am I missing a key table here?


Solution

  • Snowsight can show you the storage used per day for databases and stages - and if Snowsight can, then there is a query that does the job.

    Inspecting with Chrome, the query happens to be:

    select   usage_date
    ,   database_name as object_name
    ,   'DATABASE' as object_type
    ,   max(AVERAGE_DATABASE_BYTES) as database_bytes
    ,   max(AVERAGE_FAILSAFE_BYTES) as failsafe_bytes
    ,   0 as stage_bytes
    ,   max(database_id) as object_id
    ,   0 as hybrid_bytes 
    from SNOWFLAKE.ACCOUNT_USAGE.DATABASE_STORAGE_USAGE_HISTORY 
    where usage_date < date_trunc('DAY', ?::timestamp_ntz)   
    and usage_date >= date_trunc('DAY', ?::timestamp_ntz) 
    group by 1, 2, 3 
    union all 
    select   usage_date
    ,   'Stages' as object_name
    ,   'STAGE' as object_type
    ,   0 as database_bytes
    ,   0 as failsafe_bytes
    ,   max(AVERAGE_STAGE_BYTES) as stage_bytes
    ,   0 as object_id
    ,   0 as hybrid_bytes 
    from SNOWFLAKE.ACCOUNT_USAGE.STAGE_STORAGE_USAGE_HISTORY 
    where usage_date < date_trunc('DAY', ?::timestamp_ntz)   
    and usage_date >= date_trunc('DAY', ?::timestamp_ntz) 
    group by 1, 2, 3 ;
    

    Docs for this: