I am trying to launch a Spark application through Oozie (Oozie version 5.2.0). The Spark version is 3.0.0, Scala is 2.12.10, and Java is 1.8. I get this error:
2023-06-01 14:26:56,393 WARN ActionStartXCommand:523 - SERVER[tkles.dev.df.ru] USER[s_custom] GROUP[-] TOKEN[] APP[KafkaUnload] JOB[0000164-230530053026316-oozie-oozi-W] ACTION[0000164-230530053026316-oozie-oozi-W@runDataMartStreaming] Error starting action [runDataMartStreaming]. ErrorType [FAILED], ErrorCode [Unsupported action], Message [Action of type spark is not supported]
org.apache.oozie.action.ActionExecutorException: Action of type spark is not supported
My workflow.xml:
<workflow-app name="KafkaUnload_Maspers" xmlns="uri:oozie:workflow:0.5">
<global>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${yarn_queue}</value>
</property>
<property>
<name>oozie.action.max.output.data</name>
<value>102400000</value>
</property>
</configuration>
</global>
<start to="runDataMartStreaming"/>
<action name="runDataMartStreaming">
<spark xmlns="uri:oozie:spark-action:1.0">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>spark.sql.shuffle.partitions</name>
<value>100</value>
</property>
</configuration>
<master>yarn</master>
<mode>client</mode>
<name>Spark Example</name>
<class>ru.kafkaconnector.Main</class>
<jar>kafka-connector-init-0.0.0-jar-with-dependencies.jar</jar>
<spark-opts>-Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false</spark-opts>
<arg>--queue=${yarn_queue}</arg>
<arg>--principal=u_sklsdprozn_s_custom_rozn_productshelf@DEV.DF.SBRF.RU</arg>
<arg>--jar=spark-sql-kafka-0-10_2.12-3.0.0.jar</arg>
<arg>--jar=kafka-clients-3.0.0.jar</arg>
<arg>--keytab=s_custom.keytab</arg>
<arg>--files=/etc/krb5.conf</arg>
<file>${wf:appPath()}/lib/kafka-connector-init-0.0.0-jar-with-dependencies.jar</file>
<file>${wf:appPath()}/lib/spark-sql-kafka-0-10_2.12-3.0.0.jar</file>
<file>${wf:appPath()}/lib/kafka-clients-3.0.0.jar</file>
</action>
</workflow-app>
Is it possible to launch the Spark application as a java action instead? Or is there something else I need to do to get the application to start?
This is related to compatibility between the Oozie and Spark versions.
You can run spark-submit from a shell action instead of the spark action. That should work for you.
Create a spark-submit.sh script, copy it to HDFS, and make sure the script path in your workflow.xml is correct.
<workflow-app name="[WF-DEF-NAME]"
xmlns="uri:oozie:workflow:0.3">
...
<action name="shellAction">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>${script_having_the_spark_submit}</exec>
<argument>${inputDir}</argument>
<!-- Copy the script into the working directory of the node that runs the action -->
<file>${script_having_the_spark_submit}#${script_having_the_spark_submit}</file>
<capture-output/>
</shell>
<ok to="OK_action"/>
<error to="KO_action"/>
</action>
...
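The script referenced by <exec> could look roughly like the sketch below. It is only an illustration built from the values in the question (class name, jar names, shuffle partitions, JVM options), not a tested setup. The YARN_QUEUE and SPARK_SUBMIT environment variables are hypothetical names introduced here; YARN_QUEUE could be set with the shell action's <env-var> element. It assumes spark-submit is on the PATH of the node that runs the action and that the jars have been shipped into the working directory with <file> elements. The Kerberos options from the question (--principal, --keytab) are left out and would need to be added for a secured cluster.

```shell
#!/bin/sh
# spark-submit.sh: hypothetical wrapper executed by the Oozie shell action.
set -eu

# Program to invoke. Setting SPARK_SUBMIT=echo prints the arguments instead of submitting.
SPARK_SUBMIT="${SPARK_SUBMIT:-spark-submit}"

# Submits the job from the question; every argument is forwarded to the application.
# YARN_QUEUE is an assumed environment variable, falling back to the "default" queue.
run_spark_submit() {
  "$SPARK_SUBMIT" \
    --master yarn \
    --deploy-mode client \
    --queue "${YARN_QUEUE:-default}" \
    --name "Spark Example" \
    --class ru.kafkaconnector.Main \
    --conf spark.sql.shuffle.partitions=100 \
    --jars spark-sql-kafka-0-10_2.12-3.0.0.jar,kafka-clients-3.0.0.jar \
    --driver-java-options "-Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false" \
    kafka-connector-init-0.0.0-jar-with-dependencies.jar \
    "$@"
}

# The shell action passes each <argument> as a positional parameter.
if [ "$#" -gt 0 ]; then
  run_spark_submit "$@"
else
  echo "usage: spark-submit.sh <application-argument>..." >&2
fi
```

With the workflow above, ${inputDir} arrives as the first positional parameter and is handed to ru.kafkaconnector.Main unchanged. Note that several jars go into a single comma-separated --jars option.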
Examples of Oozie workflows using the shell action
Good luck.