hadoop, oozie, hadoop2, cloudera-cdh, oozie-coordinator

How to check whether a file exists in an HDFS location using Oozie?


How can I check whether a file exists in an HDFS location using Oozie?

A file with a name like test_08_01_2016.csv arrives in my HDFS location at 11 PM every day.

I want to check whether this file exists after 11:15 PM. I can schedule the check with an Oozie coordinator job.

But how can I validate that the file exists in HDFS?


Solution

  • You can use the fs:exists EL function in an Oozie decision node, for example:

    <decision name="CheckFile">
        <switch>
            <case to="nextOozieTask">
                <!-- note: the path must be enclosed in single quotes -->
                ${fs:exists('/path/test_08_01_2016.csv')}
            </case>
            <default to="MailActionFileMissing" />
        </switch>
    </decision>
    

    You can also build the file name dynamically with a simple shell script, run as a shell action with capture-output enabled, and then read the captured value in the decision node.
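
    A minimal sketch of the shell-script approach, under some assumptions: the action name BuildFileName, the script name build_name.sh, the property key fileName, the "fail" kill node, and the ${jobTracker}/${nameNode} parameters are all hypothetical, and the date pattern assumes the file name is day_month_year (the question does not say which order it is). With capture-output, the script has to print its result as key=value lines so that Oozie can read them back with wf:actionData:

    <!-- build_name.sh (hypothetical), shipped with the workflow, would contain:
             #!/bin/bash
             echo "fileName=test_$(date +%d_%m_%Y).csv"
    -->
    <action name="BuildFileName">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>build_name.sh</exec>
            <file>build_name.sh</file>
            <!-- makes the key=value lines printed to stdout available to later nodes -->
            <capture-output/>
        </shell>
        <ok to="CheckFile"/>
        <error to="fail"/>
    </action>

    <decision name="CheckFile">
        <switch>
            <case to="nextOozieTask">
                <!-- '/path/' is the placeholder directory from the example above -->
                ${fs:exists(concat('/path/', wf:actionData('BuildFileName')['fileName']))}
            </case>
            <default to="MailActionFileMissing"/>
        </switch>
    </decision>

    The date is taken from the clock of whichever node runs the shell action, so if the workflow can start after midnight the script would need to compute the previous day's date instead.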