I need to create a CSV file on Azure Data Lake Storage Gen2 with Alteryx, reading from a table that has a string field which might contain commas. I'm using the "Azure Data Lake File Output" connector in the Alteryx flow (which, as far as I know, does not let me choose the CSV delimiter).
Alteryx puts double quotes only around the string fields that contain commas (it seems Alteryx uses the comma as the separator automatically), but I need it to put double quotes around all string fields. I tried a Multi-Field Formula tool with
'"' + [_currentfield_] + '"'
but then the CSV has three double quotes at the start and end of each string field. This happens because the added quotes are special characters: the connector wraps the field in quotes again and escapes the embedded quotes by doubling them, so value comes out as """value""".
Is there any way to solve this? Alternatively, is there a way to set a specific delimiter in the "Azure Data Lake File Output" connector?
The problem arises because we are going to create an external table in Azure Synapse Analytics from the CSV file using PolyBase. If a string value contains a comma, Synapse splits that column into two, giving the "too many columns" error.
Thank you very much for helping!
Solved. There are two possible solutions: one on the Alteryx side and one on the Synapse side.
Alteryx side: the latest version (2.2.0) of the Azure Data Lake Tools supports setting the CSV delimiter; it can be downloaded from https://community.alteryx.com/t5/Public-Community-Gallery/Azure-Data-Lake-Tools/ta-p/852262
Synapse side: in the CSV, the string fields containing commas must be wrapped in double quotes (the Alteryx "Azure Data Lake File Output" connector already does this by default); then use the external table script shown at the following link, which makes PolyBase treat the data inside quotes as a single value: https://learn.microsoft.com/en-us/answers/questions/118102/polybase-load-csv-file-that-contains-text-column-w.html
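For reference, here is a minimal sketch of that Synapse-side script, assuming an external data source (called MyDataLakeSource here) already points to the ADLS Gen2 container, and that the table name, column names, and file path are just placeholders, not the actual ones from my flow. The key setting is STRING_DELIMITER = '"', which makes PolyBase treat everything between double quotes, commas included, as a single value:

-- External file format: comma-separated, string values quoted with "
CREATE EXTERNAL FILE FORMAT QuotedCsvFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        STRING_DELIMITER = '"',
        FIRST_ROW = 2,            -- skip the header row, if the file has one
        USE_TYPE_DEFAULT = FALSE  -- keep missing values as NULL
    )
);

-- External table over the CSV produced by Alteryx (names and path are illustrative)
CREATE EXTERNAL TABLE dbo.MyExternalTable (
    Id INT,
    Description NVARCHAR(200)
)
WITH (
    LOCATION = '/curated/output.csv',
    DATA_SOURCE = MyDataLakeSource,  -- existing external data source for the ADLS Gen2 account
    FILE_FORMAT = QuotedCsvFormat
);

With STRING_DELIMITER set, a row like 1,"some, text" loads as two columns instead of three.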