sqlpivotsnowflake-cloud-data-platformunpivot

Dynamically UNPIVOTing table


I've wrote some experimental code that dynamically unpivots that table in Snowflake by re-casting all columns to VARCHAR and creating a list of columns using the LISTAGG function. The procedure works; however, I would like to ask if it's possible to improve the syntax by:

  1. Reducing the dependency on string manipulation inside the table
  2. Is it possible to replace the ugly REGEXP_REPLACE call with a more elegant loop-based solution that would recast all required columns?
/* Dynamically Unpivot Table
 *
 *  The following procedure can be used to dynamically unpivot table without specifying column names.
 *  The code, will re-cast the columns into the varchar type and return the table with column name and
 *  column value.
 */
 
-- Procedure definition
CREATE OR REPLACE PROCEDURE dynamic_unpivot(table_name TEXT)
    RETURNS TABLE(col_name STRING, col_val STRING)
    LANGUAGE SQL
    COMMENT = 'Dynamically unpivot table'
AS
DECLARE
    res RESULTSET;
    pivot_cols TEXT;
    pivot_cols_recast TEXT;
    recast_select_statement TEXT;
    select_statement TEXT;
BEGIN
    -- Capture columns to pivot through
    SHOW COLUMNS IN VIEW IDENTIFIER(:table_name);
    pivot_cols := (SELECT LISTAGG(DISTINCT "column_name", ', ') FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())));
    pivot_cols_recast := (REGEXP_REPLACE(pivot_cols, '(\\b\\w+\\b)', '\\1::VARCHAR AS \\1'));
    recast_select_statement := 'SELECT ' || :pivot_cols_recast || ' FROM ' || :table_name ;
    select_statement := 'SELECT * FROM (' || :recast_select_statement || ' ) t_recast' ||
                            ' UNPIVOT (col_val ' || ' FOR col_name ' || ' IN (' || :pivot_cols || '))';
    res := (EXECUTE IMMEDIATE :select_statement);
    RETURN TABLE(res);
END;


-- Tests
-- Create temporary table
CREATE OR REPLACE TEMPORARY TABLE tmp_t1 (col_a VARCHAR, col_b NUMBER);

-- Insert some sample data
INSERT INTO tmp_t1 (col_a, col_b) VALUES
   ('one', 1),
   ('two', 2),
   ('three', 3),
   ('four', 4);

-- Test query running procedure
CALL dynamic_unpivot('tmp_t1');
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())) ORDER by col_name;

-- Cleaning
DROP PROCEDURE IF EXISTS dynamic_unpivot(TEXT);
DROP TABLE IF EXISTS tmp_t1;

Solution

  • Instead of using unpivot for this, you can use snowflake semi-structured data handling.

    We can use object_constuct_keep_null(*) to gather all the columns in each row of the table into object representing the row. Then use lateral flatten to perform the 'UNPIVOT' operation. In my experience this is often performs faster than UNPIVOT as well but YMMV.

    The SQL looks like this:

    with wide as (select object_construct_keep_null(*) obj_row from tmp_t1)
    select key col_name, value::varchar col_val
    from wide, lateral flatten(input => obj_row)
    order by 1,2;
    

    If you still want this enclosed within a Stored-Procedure call it would look like this:

        CREATE OR REPLACE PROCEDURE dynamic_unpivot(table_name TEXT)
        RETURNS TABLE(col_name STRING, col_val STRING)
        LANGUAGE SQL
        COMMENT = 'Dynamically unpivot table'
    AS
    DECLARE
        res RESULTSET;
        select_statement TEXT;
    BEGIN
        select_statement := 'with wide as (select object_construct_keep_null(*) obj_row from '|| :table_name ||')'||
                            'select key col_name, value::varchar col_val '||
                            'from wide, lateral flatten(input => obj_row)'||
                            'order by 1,2';
        res := (EXECUTE IMMEDIATE :select_statement);
        RETURN TABLE(res);
    END;
    

    An alternative approach to keep the tablename dynamic, whilst returning a tabular output that you can use (i.e. instead of using TABLE(RESULT_SCAN(LAST_QUERY_ID())), is to use a session variable and identifier() to enclose it in the query body.

    SET table_name = 'tmp_t1';
    
    with wide as (select object_construct_keep_null(*) obj_row from identifier($table_name))
    select key col_name, value::varchar col_val
    from wide, lateral flatten(input => obj_row)
    order by 1,2;
    

    You can even turn this into a 'dynamic' view that you can query from, provided you have set the session variable to something before the view creation.

    create view dynamic_unpivot as
    with wide as (select object_construct_keep_null(*) obj_row from identifier($table_name))
    select key col_name, value::varchar col_val
    from wide, lateral flatten(input => obj_row);
    

    With this you can set table_name to any view/table that you want to unpivot and get the desired result. e.g.

    SET table_name = 'different_table';
    
    Select * from dynamic_unpivot order by 1,2;
    

    NOTE: If you try and query the view without the table_name variable set you will get an error. Session variable '$TABLE_NAME' does not exist