pythonsnowflake-cloud-data-platform

Snowflake Python Worksheet - main handler with additional arguments


The goal is to develop and Deploy Snowpark code inside Python Worksheet code that could take user input.

If we try to provide additional user-defined arguments we get:

import snowflake.snowpark as snowpark

def main(session: snowpark.Session, param):
    df = session.table('snowflake_sample_data.tpch_sf10.lineitem').limit(param)
    return df

Handler has more arguments than expected. Function signature must have exactly one argument:

   def main(session: snowpark.Session):

If we try to Deploy the code to a stored procedure:

enter image description here

enter image description here

Stored procedure my_proc could not be created. failed running query: Python function is defined with 2 arguments (including session), but stored procedure definition contains 0 arguments. Python function arguments are expected to be session and stored procedure defined arguments in function MY_PROC with handler main


Solution

  • One way to workaround this limitation is to develop the code using Python Worksheet using local variables and leaving main(session: snowpark.Session) signature unchanged:

    import snowflake.snowpark as snowpark
    
    def main(session: snowpark.Session):
        param = 1
        df = session.table('snowflake_sample_data.tpch_sf10.lineitem').limit(param)
        return df
    

    When the code is ready and Deploy window is visible, "Open with Worksheets" should be chosen:

    enter image description here

    It will open SQL Worksheet:

    enter image description here

    create procedure my_proc()
        returns Table()
        language python
        runtime_version = 3.8
        packages =('snowflake-snowpark-python')
        handler = 'main'
        as 'import snowflake.snowpark as snowpark
    
    def main(session: snowpark.Session):
        param = 1
        df = session.table(''snowflake_sample_data.tpch_sf10.lineitem'').limit(param)
        return df';
    

    Then necessary code adjustments could be made including:

    1. removing local Python variables
    2. changing main function signature
    3. changing SQL stored procedure signature

    After:

    create procedure my_proc(param INT)
        returns Table()
        language python
        runtime_version = 3.8
        packages =('snowflake-snowpark-python')
        handler = 'main'
        as 'import snowflake.snowpark as snowpark
    
    def main(session: snowpark.Session, param:int):
        df = session.table(''snowflake_sample_data.tpch_sf10.lineitem'').limit(param)
        return df';
    

    enter image description here

    Now it is possible to use parametrized stored procedure:

    enter image description here