I have .csv file that looks like this:
"ID", "Name", "Extra Info"
"1", "John", "{\"Event\": \"Click\", \"Button Name\": \"Accept\"}
"2", "Adam", "{\"Event\": \"Click\", \"Button Name\": \"Accept\"}
I'm trying to load this file using this code in Synapse:
SELECT
TOP 2 *
FROM
OPENROWSET(
BULK 'https://[MY STORAGE ACCOUNT].dfs.core.windows.net/[FILE PATH]/[...]/*.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0'
)
AS [result]
Expecting this result:
ID | Name | Extra Info |
---|---|---|
1 | John | {"Event": "Click", "Button Name": "Accept"} |
2 | Adam | {"Event": "Click", "Button Name": "Accept"} |
But I keep getting this error:
Error handling external file: 'Unexpected token 'Event\' at [byte: XXX].
Expecting tokens ',', ' ', or '"'. '.
File/External table name: 'https://[MY STORAGE ACCOUNT].dfs.core.windows.net/[FILE PATH]/[...]/[SPECIFIC FILE NAME].csv'.
It looks like it's ignoring the first quote (") and Escape character in the Extra Info column? Leading to it think that \Event\ is some special token?
I just don't understand why or what I can do to fix this?
I think I found the answer based on this post and some of the Azure documentation:
It seems that the only valid way to escape Quotes is by using double quotes.
This means my .csv should be formatted like this:
"ID", "Name", "Extra Info"
"1", "John", "{""Event"": ""Click"", ""Button Name"": ""Accept""}
"2", "Adam", "{""Event"": ""Click"", ""Button Name"": ""Accept""}
Instead of the original (which uses ):
"ID", "Name", "Extra Info"
"1", "John", "{\"Event\": \"Click\", \"Button Name\": \"Accept\"}
"2", "Adam", "{\"Event\": \"Click\", \"Button Name\": \"Accept\"}
Unfortunately I don't see a way around this other than BULK editing all my .csv files...