Hey guys, how can I solve this error? I am using Synapse to read data from a storage account with hierarchical namespace enabled, but when I try to write back I get:
Operation failed: "An HTTP header that's mandatory for this request is not specified.", 400, PUT, https://businessblobtut.blob.core.windows.net/config/config_delta/ADF_CONFIG_TABLES/_delta_log?resource=directory&timeout=90, , ""
file_name = 'config_transport_excel.xlsx'
file_path =f'abfss://config@businessblobtut.blob.core.windows.net/config_excel/{file_name}'
save_path =f'abfss://config@businessblobtut.blob.core.windows.net/config_delta/ADF_CONFIG_TABLES'
pandas_df = pd.read_excel(file_path)
delta_table = spark.createDataFrame(pandas_df)
delta_table.write.format('delta').mode('append').save(save_path)
I want to read the Excel file, convert it to Delta format, and save it within the same blob storage.
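Try the following. With the abfss:// scheme, the host should be the Data Lake endpoint (<account>.dfs.core.windows.net) rather than the Blob endpoint (<account>.blob.core.windows.net). The failing PUT with ?resource=directory in your error is a Data Lake directory-creation call, and the Blob endpoint appears to reject it with the 400 you are seeing. The paths below use the dfs endpoint: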
from pyspark.sql import SparkSession
import pandas as pd
# Initialize Spark session with ADLS Gen2 configurations
spark = SparkSession.builder \
    .appName("ADLS Gen2 Write") \
    .config("spark.hadoop.fs.azure.account.key.<your-storage-account-name>.dfs.core.windows.net", "<your-storage-account-access-key>") \
    .getOrCreate()
file_name = 'config_transport_excel.xlsx'
file_path = f'abfss://config@businessblobtut.dfs.core.windows.net/config_excel/{file_name}'
save_path = f'abfss://config@businessblobtut.dfs.core.windows.net/config_delta/ADF_CONFIG_TABLES'
# Read the Excel file into a pandas DataFrame
pandas_df = pd.read_excel(file_path)
# Convert the pandas DataFrame to a Spark DataFrame
delta_table = spark.createDataFrame(pandas_df)
# Write the DataFrame to ADLS Gen2 in Delta format
delta_table.write.format('delta').mode('append').save(save_path)
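The only change to the paths compared with your original code is the host: blob.core.windows.net became dfs.core.windows.net. If you have other notebooks that still carry blob-endpoint paths, a small helper along these lines could normalize them before they reach Spark. This is only a sketch: to_dfs_endpoint is a hypothetical name, not part of Synapse or PySpark, and it assumes the standard public-cloud host suffixes.

```python
def to_dfs_endpoint(uri: str) -> str:
    """Rewrite an abfs/abfss URI that points at the Blob endpoint
    so that it points at the Data Lake (dfs) endpoint instead.

    Hypothetical helper for illustration; URIs with any other
    scheme are returned unchanged.
    """
    if not uri.startswith(("abfs://", "abfss://")):
        return uri
    # Replace only the first occurrence, i.e. the host part of the URI.
    return uri.replace(".blob.core.windows.net", ".dfs.core.windows.net", 1)


# The save path from the question, before and after normalization
old_path = "abfss://config@businessblobtut.blob.core.windows.net/config_delta/ADF_CONFIG_TABLES"
save_path = to_dfs_endpoint(old_path)
print(save_path)
```

Calling it on a path that already uses the dfs endpoint leaves the path as it is, so it should be safe to apply to every path.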