sqlsnowflake-cloud-data-platformsnowflake-schema

Restore the data from the table recreated multiple times in snowflake


I had an issue when someone from my team recreated the table more than once using CREATE OR REPLACE as there were some new fields that had to be added. However, instead of adding a column using ALTER TABLE, the entire table was recreated.

I tried getting the data back using time travel. But since CREATE OR REPLACE drops and recreates the table, the time travel was not working.

SELECT * FROM {MY_TABLE} AT (OFFSET => -60*60*23);

Time travel data is not available for table {MY_TABLE}. The requested time is either beyond the allowed time travel period or before the object creation time.

I tried unropping the table using UNDROP {MY_TABLE} after renaming the current version of the table to something else. But no luck! I still don't see the data. I am wondering if there is any other way to get the data back as it is important since we follow incremental load instead of kill and fill.

ALTER TABLE {MY_TABLE} RENAME TO {MY_TABLE_BLANK};

UNDROP TABLE {MY_TABLE};

Solution

  • Never mind, I figured it out from the Snowflake community. Posting my answer here as it may help someone in the future.

    The solution is to restore the table N times by using UNDROP; and it only works if there is no table with the same name.

    N is a number of times the table is recreated using CREATE OR REPLACE; and time travel doesn't work as CREATE OR REPLACE drops the table and recreates it.

    I have created a table with some dummy data to test it.

    --------------------------------------------------------
    -- Creating Dummy Table with Data
    --------------------------------------------------------
    USE ROLE SYSADMIN;
    
    CREATE TABLE DW_STAGE.PUBLIC.DUMMY_DATA_TABLE (ID INT, NAME STRING);
    
    INSERT INTO DW_STAGE.PUBLIC.DUMMY_DATA_TABLE (ID, NAME)
    SELECT SEQ4()+1, CONCAT('DATA-' , TO_CHAR(SEQ4(),'0000'),  UNIFORM(100000, 990000, RANDOM())) C 
    FROM TABLE(GENERATOR(ROWCOUNT => 100));
    
    SELECT * FROM DW_STAGE.PUBLIC.DUMMY_DATA_TABLE;
    

    Now I have to recreate the same table multiple times by changing DDL (can go with the same DDL as well). This will create multiple versions of tables when only latest one will be visible.

    ----------------------------------------------------------
    -- Recreating Table three times with an additional Column
    ----------------------------------------------------------
    CREATE OR REPLACE TABLE DW_STAGE.PUBLIC.DUMMY_DATA_TABLE (ID INT, NAME STRING, AGE INT);
    
    CREATE OR REPLACE TABLE DW_STAGE.PUBLIC.DUMMY_DATA_TABLE (ID INT, NAME STRING, AGE STRING);
    
    CREATE OR REPLACE TABLE DW_STAGE.PUBLIC.DUMMY_DATA_TABLE (ID INT, NAME STRING, DOB DATE);
    

    To look at these different versions of tables, we can query TABLE_STORAGE_METRICS. However, It requires sufficient privileges. I am using ACCOUNTADMIN to query this table.

    --------------------------------------------------------
    -- Table Storage Metrics
    --------------------------------------------------------
    -- This step is not required if you don't have account admin rights
    -- But if you have, you can see the N versions of the table where only one doesn't have a dropped date
    
    USE ROLE ACCOUNTADMIN;
    
    SELECT ACTIVE_BYTES, TABLE_CREATED AS CREATED_DATE, TABLE_DROPPED AS DROPPED_DATE, *
    FROM SNOWFLAKE.INFORMATION_SCHEMA.TABLE_STORAGE_METRICS
    WHERE TABLE_NAME IN ('DUMMY_DATA_TABLE') 
            AND TABLE_CATALOG = 'DW_STAGE'
            AND TABLE_SCHEMA = 'PUBLIC'
            AND TABLE_CREATED::DATE = CURRENT_DATE::DATE
    ORDER BY TABLE_CREATED DESC;
    

    Now simply, I need to rename the current version of the table to something else and restore the previous version of the table using UNDROP. Something I need to repeat N number of times until I can see the data in the table from the 1st version.

    --------------------------------------------------------
    -- Rename and Undrop Table Three Times
    --------------------------------------------------------
    -- Switching back to my previous role
    USE ROLE SYSADMIN;
    
    --------------------------------
    -- Restore To 3rd Version
    --------------------------------
    ALTER TABLE DW_STAGE.PUBLIC.DUMMY_DATA_TABLE 
    RENAME TO DW_STAGE.PUBLIC.DUMMY_DATA_TABLE_v03;
    
    UNDROP TABLE DW_STAGE.PUBLIC.DUMMY_DATA_TABLE;
    
    -- No Data
    SELECT * FROM DW_STAGE.PUBLIC.DUMMY_DATA_TABLE; 
    
    
    --------------------------------
    -- Restore To 2nd Version
    --------------------------------
    ALTER TABLE DW_STAGE.PUBLIC.DUMMY_DATA_TABLE 
    RENAME TO DW_STAGE.PUBLIC.DUMMY_DATA_TABLE_v02;
    
    UNDROP TABLE DW_STAGE.PUBLIC.DUMMY_DATA_TABLE;
    
    -- No Data
    SELECT * FROM DW_STAGE.PUBLIC.DUMMY_DATA_TABLE; 
    
    --------------------------------
    -- Restore To 1st Version
    --------------------------------
    ALTER TABLE DW_STAGE.PUBLIC.DUMMY_DATA_TABLE 
    RENAME TO DW_STAGE.PUBLIC.DUMMY_DATA_TABLE_v01;
    
    UNDROP TABLE DW_STAGE.PUBLIC.DUMMY_DATA_TABLE;
    
    -- Will have DATA
    SELECT * FROM DW_STAGE.PUBLIC.DUMMY_DATA_TABLE; 
    

    and I was able to see the data repeating this thrice as three times my table was recreated.

    --------------------------------------------------------
    -- Dropping off Dummy Tables
    --------------------------------------------------------
    
    DROP TABLE IF EXISTS DW_STAGE.PUBLIC.DUMMY_DATA_TABLE;
    
    DROP TABLE IF EXISTS DW_STAGE.PUBLIC.DUMMY_DATA_TABLE_v01;
    
    DROP TABLE IF EXISTS DW_STAGE.PUBLIC.DUMMY_DATA_TABLE_v02;
    
    DROP TABLE IF EXISTS DW_STAGE.PUBLIC.DUMMY_DATA_TABLE_v03;
    
    --------------------------------------------------------
    -- Checking If Tables are Dropped
    --------------------------------------------------------
    SELECT * 
    FROM DW_STAGE.INFORMATION_SCHEMA.TABLES
    WHERE TABLE_NAME ILIKE 'DUMMY_DATA_TABLE'
            AND TABLE_SCHEMA = 'PUBLIC';