file-upload dropbox parquet dropbox-api uploading

Sending a DataFrame as a Parquet file directly to Dropbox


In my script I generate a DataFrame that I want to upload as a Parquet file directly to Dropbox. I found this solution:

https://gist.github.com/MaxHalford/f17994c77bb775fdd04c9cd925e0b279

which helps me save a dataframe. However, I really want to send a parquet file directly.

The option which seemed intuitive to me:

fileToSave=tempDf.to_parquet('newFile.parquet')
upload_file(dbx, file_location, fileToSave)

but it throws

TypeError: expected str, bytes or os.PathLike object, not NoneType

Any idea how this could be done instead?


Solution

  • When you call fileToSave = tempDf.to_parquet('newFile.parquet'), pandas writes the table to a local file called newFile.parquet and returns None. fileToSave is therefore None, which is why upload_file then fails with the TypeError.

    What you want to do instead is:

    data = tempDf.to_parquet()  # no path given: returns the Parquet file content as bytes
    dbx.files_upload(
        f=data,
        path=path,  # destination path in Dropbox
        mode=dropbox.files.WriteMode.overwrite,
    )
    

    This converts the DataFrame to bytes in memory, which you can then upload to Dropbox without writing a local file first.
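
If your pandas version does not return bytes when to_parquet is called without a path, writing into an in-memory buffer should give the same result. The following is only a sketch of that variant: the helper names dataframe_to_parquet_bytes and upload_dataframe are made up for illustration, dbx is assumed to be an already authenticated dropbox.Dropbox client, and the token in the example call is a placeholder.

```python
import io


def dataframe_to_parquet_bytes(df):
    """Serialize a DataFrame to Parquet in memory and return the raw bytes."""
    buffer = io.BytesIO()
    # to_parquet accepts a file-like object, so no local file is created.
    df.to_parquet(buffer)
    return buffer.getvalue()


def upload_dataframe(dbx, df, dropbox_path, **upload_kwargs):
    """Upload df as a Parquet file to dropbox_path using an existing client.

    Extra keyword arguments (for example mode=...) are passed on to files_upload.
    """
    data = dataframe_to_parquet_bytes(df)
    # files_upload takes the file content as bytes, then the destination path.
    return dbx.files_upload(data, dropbox_path, **upload_kwargs)


# Example call (needs pandas with a Parquet engine, the dropbox package,
# and a real access token in place of the placeholder):
#   import dropbox
#   dbx = dropbox.Dropbox("YOUR_ACCESS_TOKEN")
#   upload_dataframe(dbx, tempDf, "/newFile.parquet",
#                    mode=dropbox.files.WriteMode.overwrite)
```

Keeping the serialization in its own function makes it easy to check the bytes before anything is sent. Note that the Dropbox SDK documentation says files_upload should not be used for files larger than 150 MB; for bigger DataFrames you would need an upload session instead.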