I am trying to write a Dockerfile that builds a container that uses Databricks Connect, so I need to install and set up Databricks Connect through Docker RUN commands. I have the following:
FROM python:3.8
COPY requirements.txt /tmp/
RUN apt-get update \
    && apt-get install software-properties-common -y \
    && apt-get update \
    && apt-add-repository "deb http://security.debian.org/debian-security stretch/updates main" \
    && apt-get update \
    && apt-get install openjdk-8-jdk -y
RUN pip install --requirement /tmp/requirements.txt \
    && databricks-connect configure \
    && databricks-connect test
This is a simplified example that reproduces my problem. The databricks-connect configure step prompts for license acceptance (default N). The build provides no interactive input, so the prompt hits end of input and the step fails with the following error:
...
#14 1.345 Do you accept the above agreement? [y/N] Traceback (most recent call last):
#14 1.346 File "/usr/local/bin/databricks-connect", line 8, in <module>
#14 1.346 sys.exit(main())
#14 1.346 File "/usr/local/lib/python3.8/site-packages/pyspark/databricks_connect.py", line 281, in main
#14 1.346 configure()
#14 1.346 File "/usr/local/lib/python3.8/site-packages/pyspark/databricks_connect.py", line 119, in configure
#14 1.346 accept = input().strip()
#14 1.346 EOFError: EOF when reading a line
------
executor failed running [/bin/sh -c databricks-connect configure]: exit code: 1
How can I accept this automatically as part of the Docker build?
You need to pipe the answers into the command, something like this (taken from this demo), because besides accepting the license terms you also need to provide the other parameters:
echo "y
$(databricks_host)
$(databricks_token)
$(cluster_id)
$(org_id)
15001" | databricks-connect configure
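Note that in a plain shell or a Dockerfile RUN step, $(name) is command substitution, so placeholders written that way would try to run a command called name; presumably the demo's tooling fills them in before the shell sees them. A minimal sketch of the same idea with ordinary shell variables follows. The DBX_* variable names and the function name are hypothetical, and the order of the answers is simply copied from the snippet above:

```shell
# Print one answer per line, in the order used by the snippet above:
# license acceptance, host, token, cluster ID, org ID, port.
# DBX_* are hypothetical variable names; supply them however suits your build.
databricks_connect_answers() {
    printf '%s\n' \
        "y" \
        "${DBX_HOST:?set DBX_HOST}" \
        "${DBX_TOKEN:?set DBX_TOKEN}" \
        "${DBX_CLUSTER_ID:?set DBX_CLUSTER_ID}" \
        "${DBX_ORG_ID:?set DBX_ORG_ID}" \
        "${DBX_PORT:-15001}"
}

# Usage (not executed here):
#   databricks_connect_answers | databricks-connect configure
```

Using printf with one argument per line avoids a multi-line quoted string, and the ${VAR:?message} form makes the step fail with a clear message if a value is missing instead of feeding an empty answer to the prompt.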
Or you can just generate the ~/.databricks-connect file, which is plain JSON:
{
  "host": "https://host",
  "cluster_id": "cluster",
  "org_id": "org_id",
  "port": "15001"
}
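As a rough sketch, the file could be written during the build with a heredoc instead of running the interactive command. The DBX_* variable names and the function name are hypothetical. The "token" key is also an assumption: it does not appear in the JSON above, and is included here only because the configure prompt asks for a token:

```shell
# Write a Databricks Connect config file without going through the prompt.
# Assumptions: DBX_* variable names are made up for this sketch, and the
# "token" key is assumed rather than taken from the JSON shown above.
# Values are inserted verbatim, so they must not contain double quotes.
write_databricks_connect_config() {
    target="${1:-$HOME/.databricks-connect}"
    cat > "$target" <<EOF
{
  "host": "${DBX_HOST:?set DBX_HOST}",
  "token": "${DBX_TOKEN:?set DBX_TOKEN}",
  "cluster_id": "${DBX_CLUSTER_ID:?set DBX_CLUSTER_ID}",
  "org_id": "${DBX_ORG_ID:?set DBX_ORG_ID}",
  "port": "${DBX_PORT:-15001}"
}
EOF
    # The file holds a credential, so restrict it to the owner.
    chmod 600 "$target"
}

# Usage (not executed here):
#   write_databricks_connect_config            # writes ~/.databricks-connect
#   write_databricks_connect_config /tmp/cfg   # or an explicit path
```

Writing the file directly skips the license prompt altogether, so no input needs to be piped into the command.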