I know we can create an External table in Azure SQL Data warehouse pointing to a LOCATION that is either a file path or a folder path. Can this file or folder path be based on a wild card pattern instead of an explicit path.
Here my file path is a location in Azure Data Lake Store.
-- Syntax for SQL Server
-- Create a new external table
CREATE EXTERNAL TABLE [ database_name . [ schema_name ] . | schema_name. ] table_name
( <column_definition> [ ,...n ] )
WITH (
**LOCATION = 'folder_or_filepath'**,
DATA_SOURCE = external_data_source_name,
FILE_FORMAT = external_file_format_name
[ , <reject_options> [ ,...n ] ]
)
[;]
Polybase / External Tables do not support wildcards at this time. Simply have one folder per external table you require. If you feel this is an important missing feature you can create a request and vote for it here:
https://feedback.azure.com/forums/307516-sql-data-warehouse
Bear in mind Polybase (in Azure SQL Data Warehouse) can now read files either in blob storage or in Azure Data Lake Storage (ADLS). Therefore as another workaround, Azure Data Lake Analytics (ADLA) and U-SQL support Polybase, so you could use U-SQL to move the files you want from blob store into your lake, eg
// Move data from blob store to data lake
// add filename and structure as one file
DECLARE @inputFilepath string = "wasb://someContainer@someStorageAccount.blob.core.windows.net/someFilter/{filepath}.csv";
DECLARE @outputFilepath string = "output/special folder/output.csv";
@input =
EXTRACT
... // your column list
filepath string
FROM @inputFilepath
USING Extractors.Csv()
@input =
SELECT * FROM @input
WHERE filename.Contains("yourFilter");
// Export as csv
OUTPUT @input
TO @outputFilepath
USING Outputters.Csv(quoting:false);
// Now the data is in Data Lake which Polybase can also use as a source