databricks azure-databricks databricks-sql

How to determine if functions are already installed on Databricks Apache Spark


We're experiencing extremely slow Databricks SQL queries. I have come across a site that provides a number of Spark SQL optimization tuning techniques:

https://www.linkedin.com/pulse/spark-sql-performance-tuning-configurations-vignesan-saravanan-8hamc/

A number of the recommendations in the link suggest that the features/functions are already enabled by default. For example, the Spark Cost-Based Optimizer is enabled by default. However, it also mentions that if it's not enabled, you can enable it by running the following:

spark.conf.set("spark.sql.cbo.enabled", true)

My questions are:

  1. How do you determine if the feature/function is enabled?
  2. Will the feature/function work with a Databricks SQL notebook as opposed to a Databricks Python notebook?

Solution

  • 1. You can test whether a feature is enabled by calling the get method on spark.conf, e.g. spark.conf.get("spark.sql.cbo.enabled"), or by listing the current settings with spark.conf.getAll (spark.sql.cbo.enabled is not present in my runtime).

    2. Yes, this feature can be activated in a SQL notebook: you can create a Python cell, activate the option there, and use SQL for the rest of the notebook. You can also activate this feature in the Spark configuration when you are creating your cluster (in the advanced options).
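
To illustrate point 1, here is a minimal sketch, assuming the `spark` session that Databricks notebooks predefine. The helper name `read_confs` and the `<not set>` placeholder are made up for this example. It calls `spark.conf.get(key)` without a default, so a key with a value comes back as a string, and a key the runtime cannot resolve is reported as missing rather than raising. The exception type differs between Spark versions, so the sketch catches it broadly.

```python
def read_confs(spark, keys, missing="<not set>"):
    """Return {key: value} for each requested Spark config key.

    `spark` is expected to be a SparkSession (predefined in Databricks
    notebooks). `read_confs` and `missing` are illustrative names only.
    """
    result = {}
    for key in keys:
        try:
            # No default is passed, so Spark returns the value it resolves
            # for this key, or raises if it cannot resolve one.
            result[key] = spark.conf.get(key)
        except Exception:
            # The concrete exception class varies by Spark version,
            # so it is caught broadly and reported as missing.
            result[key] = missing
    return result


# Example call from a Python cell (needs a live `spark` session):
# read_confs(spark, ["spark.sql.cbo.enabled", "spark.sql.adaptive.enabled"])
```

In a SQL notebook the same call can go in a `%python` cell, and the remaining cells can stay in SQL.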