scala, apache-spark, jupyter-notebook, apache-toree

Running Jupyter + Apache Toree 0.2.0 with the Spark 2.2 kernel generates the error "Missing dependency 'object scala in compiler mirror'"


Trying to run Apache Toree 0.2.0 on Jupyter Notebook with Spark 2.2 and Scala 2.11 generates the following error on Windows 10:

(C:\Users\ale3s\Anaconda3) C:\Users\ale3s>jupyter notebook
[I 23:20:13.777 NotebookApp] sparkmagic extension enabled!
[I 23:20:13.874 NotebookApp] Serving notebooks from local directory: C:\Users\ale3s
[I 23:20:13.874 NotebookApp] 0 active kernels
[I 23:20:13.876 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/?token=2d7f006ac5f5a7d47f814f2bc13d3e84b3377847dfe575d6
[I 23:20:13.877 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 23:20:13.896 NotebookApp]

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=2d7f006ac5f5a7d47f814f2bc13d3e84b3377847dfe575d6
[I 23:20:14.132 NotebookApp] Accepting one-time-token-authenticated connection from ::1
[I 23:20:31.557 NotebookApp] Creating new notebook in
"Starting Spark Kernel with SPARK_HOME=C:\Users\ale3s\spark\spark-2.2.0-bin-hadoop2.7"
C:\Users\ale3s\spark\spark-2.2.0-bin-hadoop2.7\bin\spark-submit  --class org.apache.toree.Main "C:\ProgramData\jupyter\kernels\apache_toree_scala\lib\toree-assembly-0.2.0.dev1-incubating-SNAPSHOT.jar"  --profile C:\Users\ale3s\AppData\Roaming\jupyter\runtime\kernel-7d0fc349-bf0d-4dec-ab97-8b5236518654.json
[I 23:20:33.354 NotebookApp] Kernel started: 7d0fc349-bf0d-4dec-ab97-8b5236518654
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
(Scala,org.apache.toree.kernel.interpreter.scala.ScalaInterpreter@62679465)
(PySpark,org.apache.toree.kernel.interpreter.pyspark.PySparkInterpreter@6a988392)
(SparkR,org.apache.toree.kernel.interpreter.sparkr.SparkRInterpreter@1d71006f)
(SQL,org.apache.toree.kernel.interpreter.sql.SqlInterpreter@5b6813df)
17/09/30 23:20:37 WARN Main$$anon$1: No external magics provided to PluginManager!
17/09/30 23:20:39 WARN StandardComponentInitialization$$anon$1: Locked to Scala interpreter with SparkIMain until decoupled!
17/09/30 23:20:39 WARN StandardComponentInitialization$$anon$1: Unable to control initialization of REPL class server!
17/09/30 23:20:40 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[init] error: error while loading Object, Missing dependency 'object scala in compiler mirror', required by C:\Program Files\Java\jdk1.8.0_131\jre\lib\rt.jar(java/lang/Object.class)

Failed to initialize compiler: object scala in compiler mirror not found.
** Note that as of 2.8 scala does not assume use of the java classpath.
** For the old behavior pass -usejavacp to scala, or if using a Settings
** object programmatically, settings.usejavacp.value = true.

Failed to initialize compiler: object scala in compiler mirror not found.
** Note that as of 2.8 scala does not assume use of the java classpath.
** For the old behavior pass -usejavacp to scala, or if using a Settings
** object programmatically, settings.usejavacp.value = true.
Exception in thread "main" java.lang.NullPointerException
        at scala.reflect.internal.SymbolTable.exitingPhase(SymbolTable.scala:256)
        at scala.tools.nsc.interpreter.IMain$Request.x$20$lzycompute(IMain.scala:896)
        at scala.tools.nsc.interpreter.IMain$Request.x$20(IMain.scala:895)
        at scala.tools.nsc.interpreter.IMain$Request.headerPreamble$lzycompute(IMain.scala:895)
        at scala.tools.nsc.interpreter.IMain$Request.headerPreamble(IMain.scala:895)
        at scala.tools.nsc.interpreter.IMain$Request$Wrapper.preamble(IMain.scala:918)
        at scala.tools.nsc.interpreter.IMain$CodeAssembler$$anonfun$apply$23.apply(IMain.scala:1337)
        at scala.tools.nsc.interpreter.IMain$CodeAssembler$$anonfun$apply$23.apply(IMain.scala:1336)
        at scala.tools.nsc.util.package$.stringFromWriter(package.scala:64)
        at scala.tools.nsc.interpreter.IMain$CodeAssembler$class.apply(IMain.scala:1336)
        at scala.tools.nsc.interpreter.IMain$Request$Wrapper.apply(IMain.scala:908)
        at scala.tools.nsc.interpreter.IMain$Request.compile$lzycompute(IMain.scala:1002)
        at scala.tools.nsc.interpreter.IMain$Request.compile(IMain.scala:997)
        at scala.tools.nsc.interpreter.IMain.compile(IMain.scala:579)
        at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:567)
        at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
        at org.apache.toree.kernel.interpreter.scala.ScalaInterpreterSpecific$$anonfun$start$1.apply(ScalaInterpreterSpecific.scala:295)
        at org.apache.toree.kernel.interpreter.scala.ScalaInterpreterSpecific$$anonfun$start$1.apply(ScalaInterpreterSpecific.scala:289)
        at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
        at org.apache.toree.kernel.interpreter.scala.ScalaInterpreterSpecific$class.start(ScalaInterpreterSpecific.scala:289)
[W 23:20:43.383 NotebookApp] Timeout waiting for kernel_info reply from 7d0fc349-bf0d-4dec-ab97-8b5236518654
        at org.apache.toree.kernel.interpreter.scala.ScalaInterpreter.start(ScalaInterpreter.scala:44)
        at org.apache.toree.kernel.interpreter.scala.ScalaInterpreter.init(ScalaInterpreter.scala:87)
        at org.apache.toree.boot.layer.InterpreterManager$$anonfun$initializeInterpreters$1.apply(InterpreterManager.scala:35)
        at org.apache.toree.boot.layer.InterpreterManager$$anonfun$initializeInterpreters$1.apply(InterpreterManager.scala:34)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.MapLike$DefaultValuesIterable.foreach(MapLike.scala:206)
        at org.apache.toree.boot.layer.InterpreterManager.initializeInterpreters(InterpreterManager.scala:34)
        at org.apache.toree.boot.layer.StandardComponentInitialization$class.initializeComponents(ComponentInitialization.scala:90)
        at org.apache.toree.Main$$anon$1.initializeComponents(Main.scala:35)
        at org.apache.toree.boot.KernelBootstrap.initialize(KernelBootstrap.scala:101)
        at org.apache.toree.Main$.delayedEndpoint$org$apache$toree$Main$1(Main.scala:40)
        at org.apache.toree.Main$delayedInit$body.apply(Main.scala:24)
        at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App$$anonfun$main$1.apply(App.scala:76)
        at scala.App$$anonfun$main$1.apply(App.scala:76)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
        at scala.App$class.main(App.scala:76)
        at org.apache.toree.Main$.main(Main.scala:24)
        at org.apache.toree.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/09/30 23:20:43 WARN Shell: Parent header is null for message C7B89F2771FB4175A5E6FAE9FF80648E of type comm_info_request
17/09/30 23:20:43 WARN Shell: Parent header is null for message BABC29AFE89A42838812F07446E26CDE of type comm_open
17/09/30 23:20:43 WARN Shell: Parent header is null for message 4A909AC3F2324F4F82A0E91F0E437C02 of type comm_open

I'm not sure what exactly the problem is. I tried adding `settings.usejavacp.value = true` to ScalaInterpreter.scala, but that didn't work. The following is my run.bat file:

@REM
@REM     http://www.apache.org/licenses/LICENSE-2.0
@REM
@REM Unless required by applicable law or agreed to in writing, software
@REM distributed under the License is distributed on an "AS IS" BASIS,
@REM WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@REM See the License for the specific language governing permissions and
@REM limitations under the License
@REM
@echo off
setlocal

SET parent=%~dp0
FOR %%a IN ("%parent:~0,-1%") DO SET PROG_HOME=%%~dpa

IF "%SPARK_HOME%" == "" GOTO endprog
ECHO "Starting Spark Kernel with SPARK_HOME=%SPARK_HOME%"

FOR %%F IN (%PROG_HOME%lib\toree-assembly-*.jar) DO (
 SET TOREE_ASSEMBLY=%%F
 GOTO tests
)

:tests
    @REM disable randomized hash for string in Python 3.3+
    @REM SET TOREE_ASSEMBLY=%TOREE_ASSEMBLY:\=\\%
    SET PYTHONHASHSEED=0

IF "%SPARK_OPTS%" == "" GOTO toreeopts
SET SPARK_OPTS=%__TOREE_SPARK_OPTS__%

:toreeopts
    IF "%TOREE_OPTS%" == "" GOTO runspark
    SET TOREE_OPTS=%__TOREE_OPTS__%

:runspark
    ECHO %SPARK_HOME%\bin\spark-submit %SPARK_OPTS% --class org.apache.toree.Main "%TOREE_ASSEMBLY%" %TOREE_OPTS% %*
    %SPARK_HOME%\bin\spark-submit %SPARK_OPTS% --class org.apache.toree.Main %TOREE_ASSEMBLY% %TOREE_OPTS% %*
    GOTO :eof

:endprog
    echo "SPARK_HOME must be set to the location of a Spark distribution!"
GOTO :eof
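For what it's worth, run.bat above reads `__TOREE_SPARK_OPTS__` from the environment, and Jupyter populates that environment from the kernel spec's `env` block. So one place to experiment with the `-usejavacp` hint from the error message, without patching Toree's sources, is the kernel.json. The sketch below is untested: the `argv` path is guessed from the spark-submit line in the log above, and whether Spark's `--driver-java-options` plus the `scala.usejavacp` system property actually reach Toree's embedded REPL is an assumption on my part:

```json
{
  "display_name": "Apache Toree - Scala",
  "language": "scala",
  "argv": [
    "C:\\ProgramData\\jupyter\\kernels\\apache_toree_scala\\bin\\run.bat",
    "--profile",
    "{connection_file}"
  ],
  "env": {
    "SPARK_HOME": "C:\\Users\\ale3s\\spark\\spark-2.2.0-bin-hadoop2.7",
    "__TOREE_SPARK_OPTS__": "--driver-java-options -Dscala.usejavacp=true"
  }
}
```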

Any help is appreciated. I'm new to all of this :)


Solution

  • Well, in the end I installed Docker and downloaded a stack that has everything I need and more from here: https://hub.docker.com/r/jupyter/all-spark-notebook/
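    For anyone landing here: the image runs with the standard Jupyter Docker Stacks invocation (image name taken from the linked page; the port mapping is the usual notebook default, adjust as needed):

    ```shell
    # Run the all-in-one Jupyter + Spark + Toree image and expose
    # the notebook server on http://localhost:8888
    docker run -it --rm -p 8888:8888 jupyter/all-spark-notebook
    ```

    The Toree Scala kernel should then appear in the notebook's kernel list out of the box.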