dbutils.widgets.text('input_query',"")
inquery= dbutils.widgets.get('input_query')
I tried to give a big string inside the text widget, it is not allowing to pass anything above 2048 characters. Is this a restriction in azure Dataricks text widgets
I have a scenario where a big select query string is passed as a base parameter from ADF via notebook activity. at the moment if my select query string is above 2048 characters , then it simply truncates anything above 2048 characters
Any thoughts on this ?
I've tried calling notebook with string of length 50000 characters it worked successfully both in adf and adb notebook run.
When give 5000000 length then i got an error.
com.databricks.WorkflowException: com.databricks.common.client.DatabricksServiceHttpClientException: INVALID_PARAMETER_VALUE: The provided parameters (in json representation) are too long (5000033 bytes); it cannot exceed 1048576 bytes.
That means you can pass the parameter of size maximum of 1048576 bytes.
So below are the possible solution for passing sql query.
As @Karthikeyan Rasipalay Durairaj said if your query is not dynamic then create a dictionary with key and value in the notebook you call, and pass the key in the parameters whichever query should be executed.
Code in notebook need to be called.
querys = {"1":"sql 1","2":"sql 2"}
import sys
dbutils.widgets.text('input_query',"")
inquery= dbutils.widgets.get('input_query')
f = f"{querys.get(inquery)} executed"
dbutils.notebook.exit(f)
call the notebook passing key as the parameter.
r = dbutils.notebook.run("/Users/v-jgs@microsoft.com/sql",timeout_seconds=120,arguments={"input_query":"1"})
print(r)
Or if you have a dynamic sql query you can try below approaches.
code:
dbutils.widgets.text('input_query',"")
inquery= dbutils.widgets.get('input_query')
f = f"select * from {inquery}"
dbutils.notebook.exit(f)
output:
Let's assume below query.
SELECT
u.user_id,
u.first_name,
u.last_name,
u.email,
u.phone_number,
o.order_id,
o.order_date,
o.total_amount,
o.shipping_address,
o.payment_method
FROM
users u
JOIN
orders o ON u.user_id = o.user_id
WHERE
o.order_date BETWEEN '2023-01-01' AND '2023-09-30'
ORDER BY
u.last_name ASC, u.first_name ASC, o.order_date DESC;
I am splitting this at JOIN
and combined inside notebook called.
part1,part2 = query.split("JOIN")
Code in notebook called.
dbutils.widgets.text('input_query',"")
dbutils.widgets.text('input_query_sub',"")
inquery= dbutils.widgets.get('input_query')
inquery_sub= dbutils.widgets.get('input_query_sub')
f = "JOIN".join([inquery,inquery_sub])
print(f)
dbutils.notebook.exit("query executed")
Output: