java azure-databricks databricks-connect

databricks-connect using the wrong Java version


I set up and configured databricks-connect in a conda env on Windows 10. One of the prerequisites is Java 8 or lower (versions above 8 are not supported). I tried installing Java 8, and even Java 7, from here: https://www.oracle.com/java/technologies/javase/javase8-archive-downloads.html and afterwards changed the system environment variables to point to the new Java 8 bin folder. However, inside the conda environment, both java --version and databricks-connect test show that a newer version of Java is still being used. If I run databricks-connect test:

PS C:\Users> databricks-connect test
* PySpark is installed at C:\Users\Name\Anaconda3\envs\Conda_SD\lib\site-packages\pyspark
* Checking SPARK_HOME
* Checking java version
openjdk version "11.0.12" 2021-07-20
OpenJDK Runtime Environment Microsoft-25199 (build 11.0.12+7)
OpenJDK 64-Bit Server VM Microsoft-25199 (build 11.0.12+7, mixed mode)
WARNING: Java versions >8 are not supported by this SDK
* Skipping scala command test on Windows

If I check the version of Java in Windows PowerShell:

Java --version
openjdk 11.0.12 2021-07-20
OpenJDK Runtime Environment Microsoft-25199 (build 11.0.12+7)
OpenJDK 64-Bit Server VM Microsoft-25199 (build 11.0.12+7, mixed mode)

If I type where java in cmd, I get the following:

C:\Users\User1>where java
C:\Program Files\Microsoft\jdk-11.0.12.7-hotspot\bin\java.exe
C:\Java_jre1.8.0_202\bin\java.exe

There seem to be two paths for Java. The same command in Windows PowerShell does not seem to show anything.

Am I misunderstanding something here? Why is a higher version still being used after installing Java 8?


Solution

  • The Java version that actually runs is the one whose folder comes first in the PATH environment variable. In your where java output, the Java 11 folder is listed before the Java 8 folder, so Java 11 wins. Go to your system settings and edit PATH so that the Java 8 entry comes before the Java 11 entry.

    However, the problem is that Databricks Connect requires a very old version of Java, so you would typically want Java 8 only for Databricks Connect and a reasonably modern version (such as 11 or 17) for everything else. Therefore, create a wrapper batch file that puts Java 8 first on PATH for the current session and then runs databricks-connect:

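    rem dc.bat: run databricks-connect with Java 8 first on PATH
    rem Prepend the Java 8 bin folder so its java.exe is found before Java 11
    set PATH=C:\Java_jre1.8.0_202\bin;%PATH%
    rem %* forwards all arguments (e.g. "dc test"), not only the first one
    databricks-connect %*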
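
Since the setup already lives in a conda environment, the PATH lookup order can also be checked from Python. The following is only a minimal sketch, not part of databricks-connect: find_on_path and env_with_java_first are hypothetical helper names, and the Java 8 folder is assumed to be the one reported by where java in the question.

```python
import os


def find_on_path(name, path_value, extensions=("",)):
    """Return every match for `name` in PATH order; the first one is what runs."""
    matches = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate):
                matches.append(candidate)
    return matches


def env_with_java_first(java_bin, env=None):
    """Copy an environment mapping and put `java_bin` at the front of PATH."""
    new_env = dict(os.environ if env is None else env)
    current = new_env.get("PATH", "")
    new_env["PATH"] = java_bin + os.pathsep + current if current else java_bin
    return new_env


if __name__ == "__main__":
    # Assumption: the Java 8 bin folder shown by `where java` in the question.
    java8_bin = r"C:\Java_jre1.8.0_202\bin"
    env = env_with_java_first(java8_bin)
    # List every java.exe in lookup order; the first line is the one used.
    for hit in find_on_path("java", env["PATH"], extensions=(".exe",)):
        print(hit)
```

The first path printed is the java.exe a child process would pick up. The modified mapping is not applied to the current session; it could be handed to a child process through the env argument of subprocess.run, which mirrors what the batch wrapper does with set PATH.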