azure-databricksazure-notebooks

What is the max limit on databricks text widgets


dbutils.widgets.text('input_query',"")
inquery= dbutils.widgets.get('input_query')

I tried to give a big string inside the text widget, it is not allowing to pass anything above 2048 characters. Is this a restriction in azure Dataricks text widgets

I have a scenario where a big select query string is passed as a base parameter from ADF via notebook activity. at the moment if my select query string is above 2048 characters , then it simply truncates anything above 2048 characters

Any thoughts on this ?


Solution

  • I've tried calling notebook with string of length 50000 characters it worked successfully both in adf and adb notebook run.

    enter image description here

    When give 5000000 length then i got an error.

    com.databricks.WorkflowException: com.databricks.common.client.DatabricksServiceHttpClientException: INVALID_PARAMETER_VALUE: The provided parameters (in json representation) are too long (5000033 bytes); it cannot exceed 1048576 bytes.

    That means you can pass the parameter of size maximum of 1048576 bytes.

    So below are the possible solution for passing sql query.

    As @Karthikeyan Rasipalay Durairaj said if your query is not dynamic then create a dictionary with key and value in the notebook you call, and pass the key in the parameters whichever query should be executed.

    Code in notebook need to be called.

    querys = {"1":"sql 1","2":"sql 2"}
    
    import sys
    dbutils.widgets.text('input_query',"")
    inquery= dbutils.widgets.get('input_query')
    f = f"{querys.get(inquery)} executed"
    dbutils.notebook.exit(f)
    

    call the notebook passing key as the parameter.

    r = dbutils.notebook.run("/Users/v-jgs@microsoft.com/sql",timeout_seconds=120,arguments={"input_query":"1"})
    print(r)
    

    enter image description here

    Or if you have a dynamic sql query you can try below approaches.

    1. you can pass only the dynamic part of the query.

    code:

    dbutils.widgets.text('input_query',"")
    inquery= dbutils.widgets.get('input_query')
    f = f"select * from {inquery}"
    dbutils.notebook.exit(f)
    

    output:

    enter image description here

    1. Splitting the query to more number and pass them as parameters.

    Let's assume below query.

    SELECT
        u.user_id,
        u.first_name,
        u.last_name,
        u.email,
        u.phone_number,
        o.order_id,
        o.order_date,
        o.total_amount,
        o.shipping_address,
        o.payment_method
    FROM
        users u
    JOIN
        orders o ON u.user_id = o.user_id
    WHERE
        o.order_date BETWEEN '2023-01-01' AND '2023-09-30'
    ORDER BY
        u.last_name ASC, u.first_name ASC, o.order_date DESC;
    
    

    I am splitting this at JOIN and combined inside notebook called. part1,part2 = query.split("JOIN") Code in notebook called.

    dbutils.widgets.text('input_query',"")
    dbutils.widgets.text('input_query_sub',"")
    inquery= dbutils.widgets.get('input_query')
    inquery_sub= dbutils.widgets.get('input_query_sub')
    
    f = "JOIN".join([inquery,inquery_sub])
    print(f)
    dbutils.notebook.exit("query executed")
    

    Output:

    enter image description here