databricks-connect==14.3 does not recognize cluster


I have installed databricks-connect on Windows in a Conda environment. The tool exposes only one command:

>databricks-connect -h  
usage: databricks-connect.exe [-h] {test}

positional arguments:
  {test}

options:
  -h, --help  show this help message and exit

When I run the test command I get:

>databricks-connect test
* Checking Python version
* Creating and validating a session with the default configuration
<Config: host=https://adb-<NUMBERS>859.19.azuredatabricks.net, token=***, auth_type=pat>
Traceback (most recent call last):
  File "...\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "...\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "...\databricks-connect.exe\__main__.py", line 7, in <module>
    sys.exit(main())
  File "...\databricks\connect\cli.py", line 55, in main
    test()
  File "...\databricks\connect\cli.py", line 38, in test
    spark = DatabricksSession.builder.validateSession(True).getOrCreate()
  File "...\databricks\connect\session.py", line 390, in getOrCreate
    return self._from_sdkconfig(Config(), self._gen_user_agent(),
  File "...\databricks\connect\cache.py", line 53, in wrapper
    cache[cache_id] = func(*args, **kwargs)
  File "...\databricks\connect\session.py", line 436, in _from_sdkconfig
    raise Exception("Cluster id is required but was not specified.")
Exception: Cluster id is required but was not specified.

Is there an issue with the tool or with my configuration? This is what my .databrickscfg looks like:

[DEFAULT]
host = https://adb-<NUMBERS>859.19.azuredatabricks.net/
token = <HASH>232-2
jobs-api-version = 2.0

[test]
host = https://adb-<NUMBERS>859.19.azuredatabricks.net/
token = <HASH>232-2
jobs-api-version = 2.0

[acc]
host = https://adb-<NUMBERS>667.7.azuredatabricks.net/
token = <HASH>268-2
jobs-api-version = 2.0

[prod]
host = https://adb-<NUMBERS>558.18.azuredatabricks.net/
token = <HASH>36d-2
jobs-api-version = 2.0

And this is what my .databricks-connect looks like:

[DEFAULT]
host = https://adb-<NUMBERS>859.19.azuredatabricks.net/
token = <NUMBERS>417-2
cluster_id = 0208-<NUMBERS>-th3jhcdp
org_id = <NUMBERS>66859
port = 15001

Solution

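  • Your .databrickscfg likely needs a cluster_id in each profile you connect with. The traceback shows the session being built from the SDK config (here the DEFAULT profile of .databrickscfg), so the cluster_id in .databricks-connect appears not to be picked up. For example: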

    [DEFAULT]
    host       = https://adb-<NUMBERS>.<NUMBER>.azuredatabricks.net
    cluster_id = <NUMBERS>-<NUMBERS>-<NUMBERSANDLETTERS>
    token      = <TOKENHASH>
    

    You can add it manually or by running the command:

    databricks configure --configure-cluster --profile DEFAULT
    

    as described in the Azure Databricks documentation.
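
To see quickly which profiles would still trigger "Cluster id is required but was not specified", the file can be scanned with Python's standard configparser. This is only a minimal sketch: `profiles_missing_cluster_id` is a hypothetical helper, not part of databricks-connect, and it assumes each profile is read on its own, with no values inherited from [DEFAULT].

```python
import configparser
from pathlib import Path


def profiles_missing_cluster_id(cfg_text):
    """Return the names of profiles that have no (non-empty) cluster_id.

    Hypothetical helper for illustration; not part of databricks-connect.
    """
    # Rename configparser's special default section so that [DEFAULT] is
    # treated as an ordinary profile and its keys are not inherited
    # (assumption: profiles are independent of each other).
    # interpolation=None keeps '%' characters in tokens from raising errors.
    parser = configparser.ConfigParser(
        default_section="__unused__", interpolation=None
    )
    parser.read_string(cfg_text)
    return [
        name
        for name in parser.sections()
        if not parser.get(name, "cluster_id", fallback="").strip()
    ]


if __name__ == "__main__":
    # Assumes the default location of the file in the user's home directory.
    cfg_file = Path.home() / ".databrickscfg"
    if cfg_file.exists():
        for name in profiles_missing_cluster_id(cfg_file.read_text()):
            print(f"profile [{name}] has no cluster_id")
```

Run against the .databrickscfg from the question, this would list all four profiles (DEFAULT, test, acc, prod), since none of them sets cluster_id.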

    Notes: