I have a Scala Spark application and want to invoke PySpark/Python (pyspark_script.py) for further processing.
There are plenty of resources on calling Java/Scala code from Python, but I am looking for the other direction: Scala -> PySpark.
I explored Jython for running Python code from Scala/Java, as follows:
import org.python.util.PythonInterpreter

PythonInterpreter.initialize(System.getProperties, properties, sysArgs)
val pi = new PythonInterpreter()
pi.execfile("path/to/pyscript/mypysparkscript.py")
This fails with an error that says: "ImportError: No module named pyspark"
Is there any way for Scala Spark to talk to PySpark using the same SparkContext/SparkSession?
You can run shell commands in Scala using the sys.process package.
// Spark code goes here .....
// Call the pyspark script as an external process
import sys.process._
"python3 /path/to/python/file.py".!!
To use the same session, add the lines below to the Python file.
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
You can also use the SparkSession.getActiveSession() method.
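For example, the top of file.py could look like this minimal sketch (getActiveSession() requires PySpark 3.0+, and the range(10) call is just a placeholder for your own processing):

from pyspark.sql import SparkSession

# Reuse the active session if one exists in this process, otherwise get or create one.
spark = SparkSession.getActiveSession() or SparkSession.builder.getOrCreate()

# Placeholder for the actual PySpark processing.
spark.range(10).show()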
NOTE: Make sure the pyspark module is installed, for example with the pip3 install pyspark command.