I'm quite new to Databricks and Python, and one thing in particular has been really stumping me - I'd be really grateful if someone could point me in the right direction.
I am trying to read a really simple DBF file within a Databricks notebook, using the dbfread library.
The file I’m trying to read is “people.dbf” (from here) which is used in many of the examples within the dbfread docs.
I have put this DBF file in the root of my DBFS: file_in_dbfs
But after importing the dbfread module, I get the error below when I try to read a .dbf file: dbfread_error
The file definitely exists - I can see it with dbutils.fs.ls, and if I pretend it's a CSV I can read its contents with spark.read.csv: works_ok_with_dbutils
I've tried a few other DBF-reading modules too (dbf, geopandas, simpledbf) and get the exact same error message. I've also tried copying the file to the local filesystem, and to an external location - same error.
Does anyone know what I'm doing wrong please?
When working with files on Databricks, how you access them on DBFS depends on the context: Spark APIs (like spark.read.csv) understand DBFS paths directly, but plain Python libraries do not - they need the local /dbfs mount point. A more detailed explanation can be found here. I guess that, under the hood, dbfread uses ordinary Python file I/O to open files, so I suggest you try:
DBF("/dbfs/people.dbf")
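To illustrate the path mapping, here's a small sketch - note that `to_local_path` is a hypothetical helper written for this answer, not part of dbfread or Databricks; the real fix is simply prefixing the path with /dbfs:

```python
def to_local_path(dbfs_path: str) -> str:
    """Hypothetical helper: map a DBFS path to the driver-local /dbfs mount.

    Spark understands "dbfs:/people.dbf", but plain Python I/O (which
    dbfread uses) only sees the local filesystem, where DBFS is mounted
    at /dbfs. So:
        dbfs:/people.dbf  ->  /dbfs/people.dbf
        /people.dbf       ->  /dbfs/people.dbf
    """
    if dbfs_path.startswith("dbfs:/"):
        dbfs_path = dbfs_path[len("dbfs:/"):]
    return "/dbfs/" + dbfs_path.lstrip("/")

# With dbfread (a third-party library, pip install dbfread) you would then do:
#   from dbfread import DBF
#   table = DBF(to_local_path("dbfs:/people.dbf"))
#   for record in table:
#       print(record)
```

The same /dbfs-prefix trick applies to the other libraries you tried (dbf, geopandas, simpledbf), since they all open files with ordinary Python I/O rather than through Spark.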