Tags: curl, https, azure-databricks, databricks-unity-catalog, collibra

Azure Databricks and connectivity to Serverless SQL Warehouse


All,

We are trying to connect to an Azure Databricks Serverless SQL Warehouse from Collibra in order to get lineage from Unity Catalog.

We created an App Registration for this and granted it the appropriate rights on Unity Catalog. We also had the IP address whitelisted and verified it with traceroute, and connectivity appears to be going through.

However, when we tested the endpoint with curl, it failed with {"error_code":403,"message":"Invalid access to Org: 5377825XX8869582"}

At this point, I don't know whether the request is reaching the endpoint. I have two questions:

[1] How do we check the logs of the Serverless SQL Warehouse to see whether the connection came through?
[2] What can I do to address the issue?

Note that connecting with the Generic JDBC connector to the Hive metastore works.

Thanks, grajee

[root@OurEdgeServer1 ~]# curl -v -u token https://adb-9999888877776666.2.azuredatabricks.net/api/2.1/unity-catalog/catalogs
Enter host password for user 'token':
*   Trying 22.42.XX.XX...
* TCP_NODELAY set
* Connected to adb-9999888877776666.2.azuredatabricks.net (22.42.XX.XX) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, [no content] (0):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Databricks Inc.; CN=*.azuredatabricks.net
*  start date: Jun  4 00:00:00 2024 GMT
*  expire date: Oct 11 23:59:59 2024 GMT
*  subjectAltName: host "adb-9999888877776666.2.azuredatabricks.net" matched cert's "*.2.azuredatabricks.net"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* Server auth using Basic with user 'token'
* Using Stream ID: 1 (easy handle 0x55f0fa6eb6f0)
* TLSv1.3 (OUT), TLS app data, [no content] (0):
> GET /api/2.1/unity-catalog/catalogs HTTP/2
> Host: adb-9999888877776666.2.azuredatabricks.net
> Authorization: Basic 11113333ZGFwaWQ1MTk0ZTZkNTUxMDhiZGQwYjZkYTRhY222288889999TI=
> User-Agent: curl/7.61.1
> Accept: */*
> 
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
< HTTP/2 403
< content-type: application/json; charset=utf-8
< content-length: 70
< x-databricks-reason-phrase: Invalid access to Org: 5377825XX8809580
< vary: Accept-Encoding
< date: Fri, 07 Jun 2024 23:23:40 GMT
< server: databricks
< 
* Connection #0 to host adb-9999888877776666.2.azuredatabricks.net left intact
{"error_code":403,"message":"Invalid access to Org: 5377825XX8809580"}
[root@OurEdgeServer1 

Solution

  • You need to generate a token for the service principal, then send that token in the Authorization header as a Bearer token.

    Before that, make sure your service principal (or app) has the permissions mentioned in this document.

    If the caller is the metastore admin, all catalogs will be retrieved. Otherwise, only catalogs owned by the caller (or for which the caller has the USE_CATALOG privilege) will be retrieved.

    Generate the token for the service principal with the command below.

    oauth_token=$(curl -X POST https://adb-3014685692229370.10.azuredatabricks.net/oidc/v1/token -d grant_type=client_credentials -d scope=all-apis -u <client_id>:<client_secret> | jq -r '.access_token')
    

    You can get the secret from the service principal tab in Databricks.

    Next, get the catalog details using Bearer token authorization with the command below.

    curl -v -H "Authorization: Bearer $oauth_token" -H 'Content-Type: application/json' https://adb-3014685692229370.10.azuredatabricks.net/api/2.1/unity-catalog/catalogs
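
The two commands above can be wrapped in a small script that also reports the HTTP status, which helps tell "the request never arrived" apart from "the request arrived but was refused". This is only a rough sketch: WORKSPACE_URL, CLIENT_ID and CLIENT_SECRET are assumed environment variables, the function names are made up for illustration, and the status hints reflect generic HTTP semantics rather than anything specific to Databricks.

```shell
#!/bin/sh
# Sketch only. WORKSPACE_URL, CLIENT_ID and CLIENT_SECRET are assumed
# environment variables; the function names are illustrative.

# Map an HTTP status code to a short hint (generic HTTP semantics).
describe_status() {
  case "$1" in
    200) echo "ok" ;;
    401) echo "credentials missing or rejected" ;;
    403) echo "request reached the server but access was refused" ;;
    *)   echo "unexpected status" ;;
  esac
}

# Request an OAuth token for the service principal and print it.
get_oauth_token() {
  token=$(curl -sS -X POST "${WORKSPACE_URL}/oidc/v1/token" \
    -d grant_type=client_credentials -d scope=all-apis \
    -u "${CLIENT_ID}:${CLIENT_SECRET}" | jq -r '.access_token // empty')
  if [ -z "$token" ]; then
    echo "no access_token in the response" >&2
    return 1
  fi
  printf '%s\n' "$token"
}

# List catalogs with the bearer token in $1; succeed only on HTTP 200.
list_catalogs() {
  body_file=$(mktemp)
  status=$(curl -sS -o "$body_file" -w '%{http_code}' \
    -H "Authorization: Bearer $1" \
    "${WORKSPACE_URL}/api/2.1/unity-catalog/catalogs")
  cat "$body_file"
  rm -f "$body_file"
  echo "HTTP ${status}: $(describe_status "$status")" >&2
  [ "$status" = "200" ]
}
```

After exporting the three variables and sourcing the script, something like `oauth_token=$(get_oauth_token) && list_catalogs "$oauth_token"` should print the catalog JSON on standard output and the status line on standard error.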