How to check whether a file exists in an HDFS location using Oozie?
A file named like test_08_01_2016.csv arrives in my HDFS location
at 11 PM every day.
I want to check whether this file exists after 11:15 PM. I can schedule the batch using an Oozie coordinator job.
But how can I validate that the file exists in HDFS?
You can use an EL expression in an Oozie decision node, like this:
<decision name="CheckFile">
    <switch>
        <case to="nextOozieTask">
            ${fs:exists('/path/test_08_01_2016.csv')} <!-- note: the path must be enclosed in single quotes -->
        </case>
        <default to="MailActionFileMissing" />
    </switch>
</decision>
You can also build the file name dynamically with a simple shell script run as a shell action with <capture-output/>, and then pass the captured value to fs:exists().
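A minimal sketch of that shell-action approach might look like the following. The names BuildFileName, build_file_name.sh, filePath, and the fail kill node are assumptions for illustration only, as are the ${jobTracker} and ${nameNode} properties. The script is assumed to print a single key=value line to stdout (for example filePath=/path/test_08_01_2016.csv), since captured output has to be in Java properties format.

```xml
<!-- Hypothetical shell action: runs a script that prints "filePath=<full HDFS path>" -->
<action name="BuildFileName">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- build_file_name.sh is an assumed script shipped with the workflow -->
        <exec>build_file_name.sh</exec>
        <file>build_file_name.sh</file>
        <!-- makes the script's key=value output available through wf:actionData() -->
        <capture-output/>
    </shell>
    <ok to="CheckFile"/>
    <!-- "fail" is an assumed kill node defined elsewhere in the workflow -->
    <error to="fail"/>
</action>

<decision name="CheckFile">
    <switch>
        <case to="nextOozieTask">
            <!-- read the captured filePath value and test it in HDFS -->
            ${fs:exists(wf:actionData('BuildFileName')['filePath'])}
        </case>
        <default to="MailActionFileMissing"/>
    </switch>
</decision>
```

With this layout the decision node no longer hard-codes the date, so the same workflow can be triggered by the coordinator every day after 11:15 PM.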