I'm working on the deployment of the Purview ADB Lineage Solution Accelerator. In step 3 of the Install OpenLineage on Your Databricks Cluster section, the author asks you to run the following in PowerShell to upload the init script and jar to dbfs using the Databricks CLI:
dbfs mkdirs dbfs:/databricks/openlineage
dbfs cp --overwrite ./openlineage-spark-*.jar dbfs:/databricks/openlineage/
dbfs cp --overwrite ./open-lineage-init-script.sh dbfs:/databricks/openlineage/open-lineage-init-script.sh
Question: Do I correctly understand the above code as follows? If that is not the case, before running the code, I would like to know what exactly the code is doing.
1. Create a folder openlineage in the root directory of dbfs
2. Run the powershell command from the location where the .jar and open-lineage-init-script.sh are located
3. Copy the .jar and .sh files from your local directory to dbfs:/databricks/openlineage/ in dbfs of Databricks

1. dbfs mkdirs is an equivalent of UNIX mkdir -p, i.e. under the DBFS root it will create a folder named databricks, and inside it another folder named openlineage, and it will not complain if these directories already exist.
2 and 3. Yes. Paths not prefixed with dbfs:/ refer to your local filesystem. Note that you can copy from DBFS to local or vice versa, or between two DBFS locations, just not from one local path to another.
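If you want to see the same behaviour without touching your workspace, here is a minimal local sketch using plain mkdir -p and cp on a scratch directory. It is only an analogue, not the Databricks CLI: the dbfs-root folder stands in for dbfs:/, and the jar file name with its 0.0.0 version is a made-up placeholder.

```shell
#!/bin/sh
# Local analogue only: plain mkdir/cp on a scratch directory, not the Databricks CLI.
set -eu

work=$(mktemp -d)            # scratch area for the experiment
root="$work/dbfs-root"       # hypothetical stand-in for the DBFS root (dbfs:/)
src="$work/local"            # stand-in for the folder you run the commands from
mkdir -p "$root" "$src"

# Placeholder files; the version number in the jar name is made up.
echo "jar-bytes" > "$src/openlineage-spark-0.0.0.jar"
echo "echo init" > "$src/open-lineage-init-script.sh"

# Like dbfs mkdirs: creates databricks/ and openlineage/ inside it in one go.
mkdir -p "$root/databricks/openlineage"
# Running it a second time is not an error; the existing directories are left alone.
mkdir -p "$root/databricks/openlineage"

# Relative paths (./...) resolve against the current directory, which is why
# the copy commands are run from the folder that holds the files.
cd "$src"
cp ./openlineage-spark-*.jar "$root/databricks/openlineage/"
cp ./open-lineage-init-script.sh "$root/databricks/openlineage/open-lineage-init-script.sh"

# List what ended up in the target folder.
ls "$root/databricks/openlineage"
```

The second mkdir -p call is there to show the "will not complain" part: it succeeds even though both directories already exist.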