apache-spark visual-studio-code databricks databricks-connect databricks-vscode-extension

VSCode Extension Databricks-Connect - Use SparkSession


I am using the Databricks VSCode extension for development in an IDE. The basic functionality works well: I connected to an Azure Databricks workspace with Unity Catalog enabled, selected an active cluster (DBR 13.2), and configured the sync destination. I am able to execute code. Now I want to use Databricks Connect "V2" to run my code locally.

I have the following code:

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

However, when I run this, I get the following error:

RuntimeError: Only remote Spark sessions using Databricks Connect are supported. Could not find connection parameters to start a Spark remote session.

Am I missing something? I tried authenticating once with the Azure CLI and once with a PAT. I also tried both DBR 13.2 and 13.3, but every combination failed.

Thanks!


Solution

  • That issue was fixed in version 1.1.1 of the extension, which exports the SPARK_REMOTE environment variable that spark = SparkSession.builder.getOrCreate() needs in order to work.

    Note, however, that this works only if you configure profile-based authentication, not azure-cli or OAuth authentication. For those, you need to instantiate a DatabricksSession instead, which can be imported with from databricks.connect import DatabricksSession.
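As a rough sketch of the two paths described above, the snippet below picks a session builder depending on whether SPARK_REMOTE is present in the environment. The helper names choose_session_kind and get_spark are hypothetical and not part of any Databricks API. The sketch assumes databricks-connect is installed, and that an unset SPARK_REMOTE means the extension did not export it (for example with azure-cli or OAuth authentication).

```python
import os


def choose_session_kind(env=None):
    """Hypothetical helper: return 'spark' if SPARK_REMOTE is set, else 'databricks'."""
    env = os.environ if env is None else env
    return "spark" if env.get("SPARK_REMOTE") else "databricks"


def get_spark():
    """Hypothetical helper: build a session using whichever builder should work."""
    if choose_session_kind() == "spark":
        # SPARK_REMOTE was exported (profile-based authentication),
        # so the plain SparkSession builder can find the connection parameters.
        from pyspark.sql import SparkSession
        return SparkSession.builder.getOrCreate()
    # Otherwise (e.g. azure-cli or OAuth) fall back to DatabricksSession,
    # as the answer recommends.
    from databricks.connect import DatabricksSession
    return DatabricksSession.builder.getOrCreate()
```

Calling spark = get_spark() at the top of a script would then replace the SparkSession.builder.getOrCreate() line from the question. If you never rely on SPARK_REMOTE, using DatabricksSession.builder.getOrCreate() directly is simpler.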