I have data on Azure Storage Account in a Container called Staging.
Using the SourceSystem/Dataset hierarchy I also have date based folder per month. Inside the folder the data is stored in a delta format.
In data bricks I already have the the container mounted. I want to create a Delta Table whereby I can see ALL the data in the underlying folder printtransaction folder.
DROP TABLE IF EXISTS staging.default.ysoft_printtransaction;
CREATE TABLE staging.default.ysoft_printtransaction
USING delta
OPTIONS ('delta.checkpoint.writeStatsAsJson' 'false', 'delta.checkpoint.writeStatsAsStruct' 'true')
LOCATION 'abfss://staging@saccdssanpoc01.dfs.core.windows.net/ysoft/printtransaction/*/';
SELECT * FROM staging.default.ysoft_printtransaction;
This however does not work as I am getting an issue when reading from the table stating:
Obviously I don't want to create a table per partitioned folder.
What are my options?
As per information you provided these two folders are separate delta table. I agree with @Alex Ott as he said as these two folders are separate delta tables you need to read them separately then you can merge them and combine it using merge query in the different table.
It will not be able read the data from multiple delta tables in a single table.
Sample code to read each folder separately:
CREATE TABLE ysoft_printtransaction1
USING delta
Location 'abfss://containername@storageaccname.dfs.core.windows.net/sampledb/202113';
CREATE TABLE ysoft_printtransaction2
USING delta
Location 'abfss://containername@storageaccname.dfs.core.windows.net/sampledb/202110';
To merge table, you can use following command.
MERGE INTO target USING source
ON target.key = source.key
WHEN NOT MATCHED THEN INSERT *