azure-blob-storageazure-data-lakeazure-synapsepolybase

Polybase to Azure Data Lake Gen 2 Failing


I'm trying to connect from Azure SQLDW via Polybase to a Parquet file in Data Lake Gen 2. This is my code:

CREATE DATABASE SCOPED CREDENTIAL DSC_ServicePrincipal
    WITH IDENTITY = '1234567890@https://login.microsoftonline.com/1234567890/oauth2/token',
    SECRET = '1234567890'
GO

CREATE EXTERNAL DATA SOURCE [DS_ADLS] WITH (TYPE = HADOOP, 
        LOCATION = N'abfss://filesystem@storageacc.dfs.core.windows.net', 
        CREDENTIAL = DSC_ServicePrincipal)
GO

CREATE EXTERNAL FILE FORMAT [ParquetFileFormatSnappy] 
WITH (FORMAT_TYPE = PARQUET, DATA_COMPRESSION = N'org.apache.hadoop.io.compress.SnappyCodec')
GO

CREATE EXTERNAL TABLE [dbo].[DimDate]
  (
        [DateSKey]        int   not null,
      [Date]              date  not null,
      [Year]              int   not null,
      [Month]             int   not null,
      [Day]             int not null,
      [WeekOfYear]      int not null,
      [MonthNameShort]  varchar(50) not null,
      [MonthName]         varchar(50)   not null,
      [DayNameShort]      varchar(50)   not null,
      [DayName]         varchar(50) not null
  )
  WITH (DATA_SOURCE = [DS_ADLS],LOCATION = N'/PRESENTED/dimDate',FILE_FORMAT = [ParquetFileFormatSnappy],REJECT_TYPE = VALUE,REJECT_VALUE = 0)

The create external table fails to execute and the following error is returned:

Error occurred while accessing HDFS: Java exception raised on call to HdfsBridge_IsDirExist. Java exception message: HdfsBridge::isDirExist - Unexpected error encountered checking whether directoy exists or not: AbfsRestOperationException: HEAD https://xxxx.dfs.core.windows.net/xxxx?resource=filesystem&timeout=90 StatusCode=403 StatusDescription=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. ErrorCode= ErrorMessage=

The directory does exist and my service principal has access. I have confirmed this by using the same service principal from Databricks and reading the file without error.

I'm lost as to what I've done wrong.


Solution

  • The issue is that user-supplied oauth credentials are not currently supported for abfs/abfss.

    The supported authentication methods are: