Tags: databricks, databricks-community-edition

Entering a proper path to files on DBFS


I uploaded files to DBFS:

/FileStore/shared_uploads/name_surname@xxx.xxx/file_name.csv

I tried to access them with pandas, but I always get an error saying that such files don't exist (a minimal example of the failing read follows the list below). I tried the following paths:

/dbfs/FileStore/shared_uploads/name_surname@xxx.xxx/file_name.csv
dbfs/FileStore/shared_uploads/name_surname@xxx.xxx/file_name.csv
dbfs:/FileStore/shared_uploads/name_surname@xxx.xxx/file_name.csv
./FileStore/shared_uploads/name_surname@xxx.xxx/file_name.csv
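
For example, a read attempt like this (using the same placeholder path as above) fails:

import pandas as pd

# Raises FileNotFoundError, even though the file was uploaded to DBFS
df = pd.read_csv('/dbfs/FileStore/shared_uploads/name_surname@xxx.xxx/file_name.csv')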

Funnily enough, when I list them with dbutils.fs.ls, I can see all the files.
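
For example, this call lists the directory and shows the file just fine:

# Lists the uploaded file without any problem
display(dbutils.fs.ls('dbfs:/FileStore/shared_uploads/name_surname@xxx.xxx/'))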

I found this answer and already tried it: Databricks dbfs file read issue

I moved the files to a new folder:

dbfs:/new_folder/

I tried to access them from this folder, but it still didn't work. The only difference was that the files now sat in a different place.

I also checked the documentation: https://docs.databricks.com/data/databricks-file-system.html

I use Databricks Community Edition.

I don't understand what I'm doing wrong or why this is happening, and I'm out of ideas.


Solution

  • The /dbfs/ mount point isn't available on Community Edition (a known limitation), so you need to do what the linked answer recommends: copy the file from DBFS to the driver's local filesystem with dbutils.fs.cp:

    # Copy the file from DBFS to the driver-local filesystem
    dbutils.fs.cp(
      'dbfs:/FileStore/shared_uploads/name_surname@xxx.xxx/file_name.csv',
      'file:/tmp/file_name.csv')

    and then use /tmp/file_name.csv as the input path for pandas functions. If you need to write something to DBFS, do it the other way around: write to a local file under /tmp/... first, then copy that file to DBFS.
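
    A minimal sketch of both directions, assuming the same placeholder path as above ('out.csv' is just an illustrative output name):

    import pandas as pd

    # Read: copy from DBFS to the driver-local disk, then open with pandas
    dbutils.fs.cp(
      'dbfs:/FileStore/shared_uploads/name_surname@xxx.xxx/file_name.csv',
      'file:/tmp/file_name.csv')
    df = pd.read_csv('/tmp/file_name.csv')

    # Write: save locally first, then copy the result back to DBFS
    # ('out.csv' is a hypothetical example name)
    df.to_csv('/tmp/out.csv', index=False)
    dbutils.fs.cp(
      'file:/tmp/out.csv',
      'dbfs:/FileStore/shared_uploads/name_surname@xxx.xxx/out.csv')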