apache-spark, oozie, oozie-workflow

Apache Oozie error: org.apache.oozie.action.ActionExecutorException: Action of type spark is not supported


I am trying to launch a Spark application from Oozie (Oozie version 5.2.0, Spark version 3.0.0, Scala version 2.12.10, Java version 1.8) and I get the following error:

    2023-06-01 14:26:56,393 WARN ActionStartXCommand:523 - SERVER[tkles.dev.df.ru] USER[s_custom] GROUP[-] TOKEN[] APP[KafkaUnload] JOB[0000164-230530053026316-oozie-oozi-W] ACTION[0000164-230530053026316-oozie-oozi-W@runDataMartStreaming] Error starting action [runDataMartStreaming]. ErrorType [FAILED], ErrorCode [Unsupported action], Message [Action of type spark is not supported]
    org.apache.oozie.action.ActionExecutorException: Action of type spark is not supported

My workflow.xml

<workflow-app name="KafkaUnload_Maspers" xmlns="uri:oozie:workflow:0.5">
    <global>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${yarn_queue}</value>
            </property>
            <property>
                <name>oozie.action.max.output.data</name>
                <value>102400000</value>
            </property>
        </configuration>
    </global>

    <start to="runDataMartStreaming"/>

<action name="runDataMartStreaming">
    <spark xmlns="uri:oozie:spark-action:1.0">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>spark.sql.shuffle.partitions</name>
                <value>100</value>
            </property>
        </configuration>
        <master>yarn</master>
        <mode>client</mode>
        <name>Spark Example</name>
        <class>ru.kafkaconnector.Main</class>
        <jar>kafka-connector-init-0.0.0-jar-with-dependencies.jar</jar>
        <spark-opts>-Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false</spark-opts>
        <arg>--queue=${yarn_queue}</arg>
        <arg>--principal=u_sklsdprozn_s_custom_rozn_productshelf@DEV.DF.SBRF.RU</arg>
        <arg>--jar=spark-sql-kafka-0-10_2.12-3.0.0.jar</arg>
        <arg>--jar=kafka-clients-3.0.0.jar</arg>
        <arg>--keytab=s_custom.keytab</arg>
        <arg>--files=/etc/krb5.conf</arg>
        <file>${wf:appPath()}/lib/kafka-connector-init-0.0.0-jar-with-dependencies.jar</file>
        <file>${wf:appPath()}/lib/spark-sql-kafka-0-10_2.12-3.0.0.jar</file>
        <file>${wf:appPath()}/lib/kafka-clients-3.0.0.jar</file>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>

<kill name="fail">
    <message>Action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
</kill>

<end name="end"/>
</workflow-app>

Is it possible to launch the Spark application as a Java action instead? Or should I do something additional so that the application starts?


Solution

  • This is related to Oozie and Spark version compatibility.

    You can do the spark-submit using a shell action instead of the spark action. It will work for you.

    You can create a spark-submit.sh script file, copy it to HDFS, and make sure the script path in your workflow.xml is correct.
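
    For example, here is a minimal sketch of what spark-submit.sh could look like, reusing the class, jars, Kerberos principal and keytab from the question. The exact options and file paths are assumptions and may need to be adjusted for your cluster.

    #!/bin/bash
    # Sketch of a spark-submit wrapper for the job from the question.
    # Jar names, principal and keytab are copied from the original workflow and
    # are assumptions here; paths are relative to the container working directory.
    # The YARN queue can be passed as the first <argument> of the shell action.
    QUEUE="${1:-default}"

    spark-submit \
        --master yarn \
        --deploy-mode client \
        --name "Spark Example" \
        --queue "$QUEUE" \
        --class ru.kafkaconnector.Main \
        --principal u_sklsdprozn_s_custom_rozn_productshelf@DEV.DF.SBRF.RU \
        --keytab s_custom.keytab \
        --jars spark-sql-kafka-0-10_2.12-3.0.0.jar,kafka-clients-3.0.0.jar \
        --files /etc/krb5.conf \
        --conf spark.sql.shuffle.partitions=100 \
        --driver-java-options "-Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false" \
        kafka-connector-init-0.0.0-jar-with-dependencies.jar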

    <workflow-app name="[WF-DEF-NAME]"
    xmlns="uri:oozie:workflow:0.3">
    ...
    
    <action name="shellAction">
        <shell
            xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${script_having_the_spark_submit}</exec>
            <argument>${inputDir}</argument>
            <file>${script_having_the_spark_submit}#${script_having_the_spark_submit}</file>
            <capture-output/>
        </shell>
        <ok to="OK_action"/>
        <error to="KO_action"/>
    </action>
    ...
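
    To make the script available in the action's working directory, upload it to HDFS next to workflow.xml and point ${script_having_the_spark_submit} at it. The HDFS path below is only an example:

    # Example only: adjust the HDFS path to your workflow application directory
    hdfs dfs -put -f spark-submit.sh /user/s_custom/apps/KafkaUnload/spark-submit.sh

    # e.g. in job.properties
    script_having_the_spark_submit=spark-submit.sh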
    

    Oozie documentation

    Examples of Oozie workflows using the shell action

    Good luck.