I have files on a local Windows network file share path. I can access the files via Azure ADF using a self-hosted IR, but we need to load those files via Databricks.
I have tried the code below:
spark.read.csv('file:///networkpath/folder/', header="true", inferSchema="true")
I also tried uploading the file manually via the UI, and that works fine.
But I need to know how to automate this file upload to the DBFS file system.
Unfortunately, Azure Databricks doesn't support connecting to a Windows network share.
Note: It is highly recommended that you do not store any production data in the default DBFS folders.
There are multiple ways to upload files from a local machine to the Azure Databricks DBFS folder.
Method 1: Using the Azure Databricks portal
Method 2: Using the Databricks CLI
The DBFS command-line interface (CLI) uses the DBFS API to expose an easy-to-use command-line interface to DBFS. With this client, you can interact with DBFS using commands similar to those you would use on a Unix command line. For example:
# List files in DBFS
dbfs ls
# Put local file ./apple.txt to dbfs:/apple.txt
dbfs cp ./apple.txt dbfs:/apple.txt
# Get dbfs:/apple.txt and save to local file ./apple.txt
dbfs cp dbfs:/apple.txt ./apple.txt
# Recursively put local dir ./banana to dbfs:/banana
dbfs cp -r ./banana dbfs:/banana
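Because the CLI is scriptable, the upload can be automated end to end, e.g. from a job scheduled with Windows Task Scheduler on a machine that can see the share. Below is a minimal Python sketch, assuming the legacy Databricks CLI is installed and has been configured once with databricks configure --token, and that the network share is mapped to a local drive; the paths are hypothetical placeholders:

import subprocess

# Hypothetical paths: the mapped network share and the DBFS target folder
SHARE_DIR = r"Z:\folder"
DBFS_DIR = "dbfs:/ingest/folder"

# Recursively copy the share to DBFS, overwriting files that already exist
subprocess.run(
    ["dbfs", "cp", "-r", "--overwrite", SHARE_DIR, DBFS_DIR],
    check=True,
)

Once the files land in DBFS, the original spark.read.csv call works with a dbfs:/ path instead of file:///.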
Method 3: Using a third-party tool named DBFS Explorer
DBFS Explorer was created as a quick way to upload and download files to and from the Databricks File System (DBFS). It works with both AWS and Azure instances of Databricks. You will need to create a bearer token in the web interface in order to connect (the same token can also be used for direct REST calls; see the sketch after the steps).
Step 1: Download and install DBFS Explorer.
Step 2: Open DBFS Explorer and enter your Databricks URL and personal access token.
Step 3: Select the target folder in DBFS, drag and drop the files from your local machine, and click Upload.
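If you'd rather avoid a GUI tool altogether, the DBFS API that both the CLI and DBFS Explorer sit on can be called directly. Here is a minimal Python sketch for a small file, assuming the requests library and using placeholder values for the workspace URL and token (the put endpoint accepts file contents as base64 and is limited to about 1 MB per call; larger files need the create/add-block/close streaming endpoints):

import base64
import requests

# Placeholders: substitute your workspace URL and personal access token
DATABRICKS_URL = "https://<your-workspace>.azuredatabricks.net"
TOKEN = "<personal-access-token>"

# Read and base64-encode the local file, as the put endpoint expects
with open("apple.txt", "rb") as f:
    contents = base64.b64encode(f.read()).decode("ascii")

# Upload to dbfs:/apple.txt, overwriting if it already exists
resp = requests.post(
    f"{DATABRICKS_URL}/api/2.0/dbfs/put",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"path": "/apple.txt", "contents": contents, "overwrite": True},
)
resp.raise_for_status()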