I'm new to Docker and Spark.
My docker-compose.yml file is:
volumes:
  shared-workspace:
services:
  notebook:
    image: docker.io/jupyter/all-spark-notebook:latest
    build:
      context: .
      dockerfile: Dockerfile-jupyter-jars
    ports:
      - 8888:8888
    volumes:
      - shared-workspace:/opt/workspace
And the Dockerfile-jupyter-jars is:
FROM docker.io/jupyter/all-spark-notebook:latest
USER root
RUN wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar
RUN mv mysql-connector-java-8.0.28.jar /usr/local/spark/jars/
USER jovyan
To start it up I run:
docker-compose up --build
The server is up and running and I want to use spark-sql, but it throws an error when trying to connect to the MySQL server: com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
I can see mysql-connector-java-8.0.28.jar in the "jars" folder, and the same SQL statement works in a non-Docker Apache Spark installation.
The MySQL server is also reachable from the host on which I'm running Docker.
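Reachability from the host does not by itself prove reachability from inside the container. A rough way to check that from a notebook cell is a plain TCP connection test. This is only a sketch: `can_connect` is a hypothetical helper name, and the address in the comment is the placeholder used elsewhere in this post.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened, else False."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call from inside the notebook container (placeholder address):
# can_connect("xx.xx.xx.xx", 3306)
```

If this returns True, the network path is fine and the failure is more likely in the JDBC handshake itself (for example the SSL settings) than in Docker networking.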
Do I need to enable something so the container can reach external hosts? Any ideas?
Reference: https://hub.docker.com/r/jupyter/all-spark-notebook
The docker-compose.yml and Dockerfile-jupyter-jars files were correct. Because I was using mysql-connector-java-8.0.28.jar, the connection either has to use SSL or SSL has to be disabled explicitly in the JDBC URL:
jdbc:mysql://user:password@xx.xx.xx.xx:3306/inventory?useSSL=FALSE&nullCatalogMeansCurrent=true
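For anyone using PySpark in the notebook, here is a minimal sketch of how such a URL could be assembled and passed to Spark's JDBC reader. The helper names `build_mysql_jdbc_url` and `read_table` are hypothetical, and passing the credentials as reader options instead of embedding them in the URL is my own choice here, not something the setup requires.

```python
from urllib.parse import urlencode


def build_mysql_jdbc_url(host, port, database, **params):
    """Build a MySQL JDBC URL, appending connection properties as a query string."""
    base = f"jdbc:mysql://{host}:{port}/{database}"
    query = urlencode(params)
    return f"{base}?{query}" if query else base


def read_table(spark, url, table, user, password):
    """Load one MySQL table as a Spark DataFrame through the JDBC data source."""
    return (
        spark.read.format("jdbc")
        .option("url", url)
        .option("dbtable", table)
        .option("user", user)
        .option("password", password)
        # Driver class shipped in the Connector/J 8.x jar.
        .option("driver", "com.mysql.cj.jdbc.Driver")
        .load()
    )


# Placeholder host and database from this post; SSL is disabled explicitly.
url = build_mysql_jdbc_url(
    "xx.xx.xx.xx", 3306, "inventory",
    useSSL="false", nullCatalogMeansCurrent="true",
)
print(url)
```

In a notebook cell this would be used roughly as `spark = SparkSession.builder.getOrCreate()` (imported from `pyspark.sql`) followed by `df = read_table(spark, url, "customers", "user", "password")`, where `customers` is a made-up table name.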
I'm going to leave this example here for anyone searching for: Docker - all-spark-notebook with MySQL dataset