I am using this code to read multi-part parquet files with the prefix '/data/key' from a private S3 bucket (S3-compatible storage, not hosted on AWS):
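import dask.dataframe as dd  # read_parquet lives in dask.dataframe, not in the top-level dask package

dd.read_parquet(
    's3://ns1/data/key',
    storage_options={
        'key': 'key',
        'secret': 'secret',
        # point the S3 client at the private, non-AWS endpoint
        'client_kwargs': {'endpoint_url': 'https://s3.sample-private-cloud.com'}
    }
)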
Why am I getting an error:
TypeError: 'coroutine' object is not iterable
I was able to download the file using the boto3 client, but I am unable to read it using dask. The Dask documentation doesn't mention asynchronous processing anywhere (await, async), so I am not sure why I am getting this error.
Regarding "Using this code to read a multi-part parquet files with the prefix '/data/key'":
If you are trying to load all files with the prefix 'data/key', you should add a * at the end of the pattern, like this: 'data/key*'
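import dask.dataframe as dd  # read_parquet lives in dask.dataframe, not in the top-level dask package

dd.read_parquet(
    's3://ns1/data/key*',  # trailing * matches every object whose key starts with 'data/key'
    storage_options={
        'key': 'key',
        'secret': 'secret',
        # point the S3 client at the private, non-AWS endpoint
        'client_kwargs': {'endpoint_url': 'https://s3.sample-private-cloud.com'}
    }
)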
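To see why the trailing * matters, here is a rough sketch of how a wildcard pattern selects keys by prefix. The object keys below are hypothetical (they are not from the question), and Python's standard-library fnmatch is used only as a stand-in: the glob implementation that dask uses through s3fs may differ, in particular in how it treats '/'.

```python
from fnmatch import fnmatchcase

# Hypothetical object keys inside the bucket 'ns1' (assumed for illustration only).
keys = [
    "data/key.part0.parquet",
    "data/key.part1.parquet",
    "data/other.parquet",
]

# Without a wildcard the pattern must equal a key exactly, so nothing matches here.
exact = [k for k in keys if fnmatchcase(k, "data/key")]

# With the trailing *, every key that starts with 'data/key' is selected.
matched = [k for k in keys if fnmatchcase(k, "data/key*")]

print(exact)
print(matched)
```

If the parts actually live under a directory-like prefix such as 'data/key/', the pattern may need adjusting (for example 'data/key/*'); checking what the bucket really contains is the safest way to decide.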