azure azure-data-lake-gen2 vaex

MainThread: Vaex: Error while Opening Azure Data Lake Parquet file


I tried to open a Parquet file stored on Azure Data Lake Storage Gen2 with vaex, using a generated SAS URL (with the expiry time and token embedded in the URL):

vaex.open(sas_url)

and I got this error:

ERROR:MainThread:vaex:error opening 'the path which was also the sas_url(can't post it for security reasons)' ValueError: Do not know how to open (can't publicize the sas url) , no handler for https is known

How do I get vaex to read the file, or is there another Azure storage option that works better with vaex?


Solution

  • Vaex cannot open data from an https source, which is why you are getting the error "no handler for https is known".

    Also, according to the documentation, vaex supports reading data from Amazon S3 buckets and Google Cloud Storage.

    Cloud support:

    Amazon Web Services S3

    Google Cloud Storage

    Other cloud storage options

    The documentation says that other cloud storage options are also supported, but I could not find any supporting documentation or example that fetches data from an Azure storage account, let alone one that uses a SAS URL.

    Please also see the API documentation for the vaex library for more information.
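    One possible workaround, which is not taken from the vaex documentation: a SAS URL is an ordinary HTTPS link, so the file can be downloaded with the Python standard library first and the local copy then passed to vaex.open. The following is only a minimal sketch. The helper names local_name_from_url, download_to_local and open_with_vaex are hypothetical, and it assumes the SAS URL points at a single Parquet file that fits on local disk.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def local_name_from_url(url: str) -> str:
    """Return the file name from the URL path, ignoring the SAS query string."""
    name = os.path.basename(urlparse(url).path)
    # Fall back to a generic name if the path has no file component.
    return name or "download.parquet"


def download_to_local(url: str, dest_dir: str) -> str:
    """Stream the remote file into dest_dir and return the local path."""
    local_path = os.path.join(dest_dir, local_name_from_url(url))
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)  # copies in chunks, not all in memory
    return local_path


def open_with_vaex(url: str, dest_dir: str):
    """Download first, then give vaex a plain local path (hypothetical helper)."""
    import vaex  # imported lazily so the helpers above work without vaex installed

    return vaex.open(download_to_local(url, dest_dir))
```

    With this in place, open_with_vaex(sas_url, "/some/local/dir") would replace the original vaex.open(sas_url) call. The trade-off is that the whole file is copied locally before vaex opens it, so nothing is read lazily from the storage account.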