I currently have a model fit using the statsmodels OLS formula API, and I am trying to save this model to ADLS blob storage. '/mnt/outputs/' is a mount point I have created, and I am able to read and write other files from this directory.
import statsmodels.formula.api as smf
fit = smf.ols(formula=f"Pressure ~ {cat_vars_int} + Speed + dose_time:Speed + Speed:log_curr_speed_time", data=df_train).fit()
path = f'/mnt/outputs/Models/20240406_M2.pickle'
fit.save(path)
However, I am getting the error below when saving. I am trying to write a new file, not read an existing one, so I am not sure why I am getting it. Any help would be great, thanks!
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/outputs/Models/20240406_M2.pickle'
By default, the mount point lives under the DBFS context. Whenever you reference files without Spark (for example, through plain Python file APIs such as the one fit.save uses), you need to prefix the path with /dbfs.
So save the model using a path like the one below.
path = f'/dbfs/mnt/outputs/Models/20240406_M2.pickle'
fit.save(path)
When accessing files through the Spark context, use the dbfs:/ scheme instead, as below.
spark.read.csv("dbfs:/path_to_file")
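The two path forms above differ only in their prefix, so they can be derived from the same mount path. The following is a minimal sketch; the helper names `to_local_path` and `to_spark_uri` are hypothetical and not part of Databricks or statsmodels.

```python
import posixpath


def to_local_path(mount_path: str) -> str:
    """Path form for plain-Python file APIs (open, pickle, fit.save)."""
    # Hypothetical helper: prefixes the mount path with /dbfs.
    return posixpath.join("/dbfs", mount_path.lstrip("/"))


def to_spark_uri(mount_path: str) -> str:
    """Path form for Spark readers and writers."""
    # Hypothetical helper: prefixes the mount path with the dbfs:/ scheme.
    return "dbfs:/" + mount_path.lstrip("/")


mount_path = "/mnt/outputs/Models/20240406_M2.pickle"
print(to_local_path(mount_path))  # /dbfs/mnt/outputs/Models/20240406_M2.pickle
print(to_spark_uri(mount_path))   # dbfs:/mnt/outputs/Models/20240406_M2.pickle
```

With these, the save call would become `fit.save(to_local_path(mount_path))`.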
Listing files:
Using dbutils:
display(dbutils.fs.ls(mount_point))
Using the Python os module:
os.listdir("/dbfs/"+mount_point)
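A FileNotFoundError on write can also occur when the parent folder (here, Models) does not exist yet. Whether that applies to the question is not known, so treat the following as an optional guard rather than the fix. `ensure_parent_dir` is a hypothetical helper, and the demo uses a temporary directory so it runs outside Databricks too.

```python
import os
import tempfile


def ensure_parent_dir(path: str) -> str:
    """Create the parent directory of `path` if it is missing, then return `path`."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return path


# Demo on a temporary directory; on Databricks the argument would instead be
# something like '/dbfs/mnt/outputs/Models/20240406_M2.pickle'.
with tempfile.TemporaryDirectory() as root:
    target = ensure_parent_dir(os.path.join(root, "Models", "20240406_M2.pickle"))
    with open(target, "wb") as fh:
        fh.write(b"placeholder")  # stands in for the pickled model
    print(os.listdir(os.path.dirname(target)))  # ['20240406_M2.pickle']
```

On the cluster this would be used as `fit.save(ensure_parent_dir(path))`, assuming the /dbfs mount allows directory creation.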
Learn more about handling files in Databricks here.