I have many UI-created clusters in an Azure Databricks workspace, and each cluster has Spark environment variables. In an automated flow (CI/CD pipeline) I want to add the cluster_id value to the existing Spark environment variables of each cluster.
I'm using az databricks workspaces list | ConvertFrom-Json to get the list of workspaces and databricks clusters list --output JSON | ConvertFrom-Json to get every cluster's JSON configuration. Then I add a cluster_id property to the spark_env_vars section.
My approach is to save the retrieved JSON configuration to a temp file, add the cluster_id value to the spark_env_vars section using PowerShell, and then apply it to the cluster with databricks clusters edit --json @$tempFile.
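In condensed form, the PowerShell step looks roughly like this (the cluster ID is a placeholder and the new Databricks CLI is assumed to be on PATH):

# Retrieve the current configuration of one cluster (placeholder ID)
$cluster = databricks clusters get "1111-11111-1kj1k2" | ConvertFrom-Json
# Add DB_CLUSTER_ID next to the existing Spark environment variables
$cluster.spark_env_vars | Add-Member -NotePropertyName "DB_CLUSTER_ID" -NotePropertyValue $cluster.cluster_id -Force
# Write the full configuration to a temp file and apply it
# (-Depth 10 so nested objects are not truncated by ConvertTo-Json's default depth of 2)
$tempFile = [System.IO.Path]::GetTempFileName()
$cluster | ConvertTo-Json -Depth 10 | Set-Content -Path $tempFile
databricks clusters edit --json "@$tempFile"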
The problem I face is that the edit command throws warnings and does not apply the updated JSON. The warnings mention a lot of fields that are part of my JSON.
It seems to me that those fields are read-only and that's why it fails, but I wonder: is there another approach? I would like to avoid filtering out the read-only fields in code; I hope there is an easier way to add a Spark environment variable to an existing cluster.
Importantly, there are already some other Spark environment variables, so I have to preserve them and add just cluster_id as an additional one.
In addition, at the end of the warnings list I get an error:
Error: failed to reach RUNNING, got TERMINATED: Inactive cluster terminated (inactive for 20 minutes).
which makes it look like the edit CLI command is trying to start the cluster. Why, when the Databricks documentation states that
A cluster can be updated if it is in a RUNNING or TERMINATED state.
and my clusters are in a TERMINATED state when I try to edit their configuration?
Or maybe there is a way to retrieve only the list of writable fields? I also thought about applying a JSON with only the Spark environment variables, but the documentation says I have to provide the full JSON file; if I provide only the Spark env vars, it will fail or the cluster will lose all its other configuration.
Update: content of the temp JSON file I'm trying to apply:
{
  "autoscale": {
    "max_workers": 8,
    "min_workers": 1
  },
  "autotermination_minutes": 20,
  "azure_attributes": {
    "availability": "ON_DEMAND_AZURE",
    "first_on_demand": 1,
    "spot_bid_max_price": -1
  },
  "cluster_id": "1111-11111-1kj1k2",
  "cluster_name": "my-cluster-name",
  "cluster_source": "UI",
  "creator_user_name": "***",
  "custom_tags": {
    "cluster-purpose": "compute",
    "driver-node-sku": "Standard_DS3_v2",
    "service": "databricks-cluster",
    "team": "dev",
    "vm-purpose": "db_cluster",
    "worker-node-sku": "Standard_DS3_v2"
  },
  "data_security_mode": "USER_ISOLATION",
  "default_tags": {
    "ClusterId": "1111-11111-1kj1k2",
    "ClusterName": "my-cluster-name",
    "Creator": "***",
    "Vendor": "Databricks",
    "application": "databricks",
    "cost-category": "processing",
    "disaster-recovery": "none",
    "service-classification": "private"
  },
  "driver_node_type_id": "Standard_DS3_v2",
  "enable_elastic_disk": true,
  "enable_local_disk_encryption": false,
  "last_restarted_time": 1844324754432,
  "last_state_loss_time": 1844348533345,
  "node_type_id": "Standard_DS3_v2",
  "policy_id": "KDDSFFS8433D",
  "spark_conf": {
    "fs.azure.account.auth.type": "",
    "fs.azure.account.oauth.provider.type": "",
    "fs.azure.account.oauth2.client.endpoint": "",
    "fs.azure.account.oauth2.client.id": "",
    "fs.azure.account.oauth2.client.secret": ""
  },
  "spark_context_id": 34950873489578234,
  "spark_env_vars": {
    "AZ_RSRC_GRP_NAME": "rg",
    "AZ_RSRC_NAME": "db-1",
    "AZ_RSRC_PROV_NAMESPACE": "Microsoft.Databricks",
    "AZ_RSRC_TYPE": "workspaces",
    "AZ_SUBSCRIPTION_ID": "",
    "AZ_TENANT_ID": "",
    "CLIENT_ID": "",
    "CLIENT_SECRET": "",
    "DB_CLUSTER_NAME": "my-cluster-name",
    "DCP_URL": "",
    "DCR_ID": "",
    "DCR_STREAMNAME": "",
    "PROJ_ENV": "dev",
    "LOG_ANALYTICS_WORKSPACE_ID": "54jh6m45",
    "LOG_ANALYTICS_WORKSPACE_KEY": "",
    "DB_CLUSTER_ID": "1111-11111-1kj1k2"
  },
  "spark_version": "14.3.x-scala2.12",
  "start_time": 1844324754432,
  "state": "TERMINATED",
  "state_message": "Inactive cluster terminated (inactive for 20 minutes).",
  "terminated_time": 1844348533345,
  "termination_reason": {
    "code": "INACTIVITY",
    "parameters": {
      "inactivity_duration_min": "20"
    },
    "type": "SUCCESS"
  }
}
Interestingly, the JSON configuration that I retrieve through the CLI (to which I add the one DB_CLUSTER_ID line in the Spark environment variables) is 80 lines, while if I go to the cluster in the Databricks workspace and click "View JSON" it is just 57 lines, so the CLI returns more properties than you can see on the cluster's page.
Output of get is not a valid input to edit.
Most databricks CLI commands that take a --json argument accept the JSON that the corresponding REST API accepts; in this case /api/2.1/clusters/edit. E.g. to add/update Spark env vars (dbk below is shorthand for the databricks CLI):
PS: See the gotcha at the bottom.
Create:
~ $ dbk clusters create --no-wait --json '{
"cluster_name": "my-cluster",
"spark_version": "14.3.x-scala2.12",
"node_type_id": "i3.xlarge",
"spark_env_vars": {
"my.name": "kash"
},
"aws_attributes": {
"first_on_demand": 1,
"availability": "SPOT",
"zone_id": "us-east-1"
},
"num_workers": 5
}'
{
"cluster_id":"1112-231831-3fzzepx1"
}
~ $ dbk clusters get 1112-231831-3fzzepx1 > old.json
~ $ grep -nA 2 spark_env_vars old.json
26: "spark_env_vars": {
27- "my.name":"kash"
28- },
--
39: "spark_env_vars": {
40- "my.name":"kash"
41- },
~ $
Update:
~ $ dbk clusters edit --json '{
"cluster_name": "my-cluster",
"cluster_id": "1112-231831-3fzzepx1",
"num_workers": 5,
"spark_version": "14.3.x-scala2.12",
"node_type_id": "i3.xlarge",
"spark_env_vars": {
"my.name": "1gentlemann",
"cluster_id": "1112-231831-3fzzepx1"
}
}'
Error: failed to reach RUNNING, got TERMINATED: Finding instances for new nodes, acquiring more instances if necessary
~ $ dbk clusters get 1112-231831-3fzzepx1 > new.json
~ $ grep -nA 3 spark_env_vars new.json
26: "spark_env_vars": {
27- "cluster_id":"1112-231831-3fzzepx1",
28- "my.name":"1gentlemann"
29- },
--
36: "spark_env_vars": {
37- "cluster_id":"1112-231831-3fzzepx1",
38- "my.name":"1gentlemann"
39- },
~ $
A little gotcha is that Databricks updates some things that are not mentioned in the update JSON, so change your update JSON to explicitly include all attributes you don't want Databricks to change.
~ $ diff -y old.json new.json
"availability":"SPOT", | "availability":"SPOT_WITH_FALLBACK",
"first_on_demand":1, | "first_on_demand":0,
> "cluster_id":"1112-231831-3fzzepx1",
"aws_attributes": { | "apply_policy_default_values":false,
"availability":"SPOT", <
"first_on_demand":1, <
"zone_id":"us-east-1" <
}, <
> "cluster_id":"1112-231831-3fzzepx1",
~ $
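Applied to the question's PowerShell flow, a minimal sketch of that advice is to copy only the writable attributes from the get output into the edit payload and merge the new variable in. The field list below is illustrative, not exhaustive; carry over whichever writable attributes your cluster actually uses.

# Sketch: build an edit payload from writable fields only, preserving existing env vars
$cluster = databricks clusters get "1111-11111-1kj1k2" | ConvertFrom-Json

# Illustrative field list -- include every writable attribute your cluster actually uses
$payload = [ordered]@{
    cluster_id              = $cluster.cluster_id
    cluster_name            = $cluster.cluster_name
    spark_version           = $cluster.spark_version
    node_type_id            = $cluster.node_type_id
    driver_node_type_id     = $cluster.driver_node_type_id
    autoscale               = $cluster.autoscale
    autotermination_minutes = $cluster.autotermination_minutes
    azure_attributes        = $cluster.azure_attributes
    custom_tags             = $cluster.custom_tags
    data_security_mode      = $cluster.data_security_mode
    policy_id               = $cluster.policy_id
    spark_conf              = $cluster.spark_conf
    spark_env_vars          = $cluster.spark_env_vars   # existing variables are preserved
}

# Add DB_CLUSTER_ID alongside the existing Spark environment variables
$payload.spark_env_vars | Add-Member -NotePropertyName "DB_CLUSTER_ID" -NotePropertyValue $cluster.cluster_id -Force

# Write the payload to a temp file and apply it (-Depth 10 keeps nested objects intact)
$tempFile = [System.IO.Path]::GetTempFileName()
$payload | ConvertTo-Json -Depth 10 | Set-Content -Path $tempFile
databricks clusters edit --json "@$tempFile"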