Tags: hadoop, hadoop-yarn, cloudera, cloudera-cdh

Clear or delete YARN (MR2) application queue


I am using Cloudera Hadoop (CDH 5.16.2) for testing purposes. I ran the following MapReduce application two days ago:

yarn jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
wordcount \
-Dmapreduce.job.reduces=8 \
/user/bigdata/randomtext \
/user/bigdata/wordcount

Whenever I start the cluster and check the scheduler, it shows that there are submitted applications. I already tried the following command to kill them; its output shows that all the applications were killed, but later all of them start showing up again.

for x in $(yarn application -list -appStates ACCEPTED | awk 'NR > 2 { print $1 }'); do yarn application -kill $x; done
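Note that this loop only targets applications in the ACCEPTED state. Here is a sketch of the same idea that also covers SUBMITTED and RUNNING applications (standard YARN CLI options only, nothing cluster-specific assumed):

# Kill every application that is not yet in a terminal state.
# SUBMITTED, ACCEPTED and RUNNING are the non-terminal YARN application states;
# awk 'NR > 2' skips the header lines printed by `yarn application -list`.
for app_id in $(yarn application -list -appStates SUBMITTED,ACCEPTED,RUNNING \
    | awk 'NR > 2 { print $1 }'); do
  yarn application -kill "$app_id"
done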

Here's the content of fair-scheduler.xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<allocations>
    <queue name="root">
        <schedulingPolicy>drf</schedulingPolicy>
        <queue name="default">
            <schedulingPolicy>drf</schedulingPolicy>
        </queue>
    </queue>
    <queuePlacementPolicy>
        <rule name="specified" create="false"/>
        <rule name="default" create="true"/>
    </queuePlacementPolicy>
</allocations>
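To see which queue the reappearing jobs actually land in and which user submits them, the standard YARN CLI listing is usually enough; a minimal sketch (the application ID in the second command is a placeholder):

# List every application the ResourceManager knows about, in any state.
# The output includes Application-Name, User and Queue columns, which makes
# unexpected submissions (unknown users, odd application names) easy to spot.
yarn application -list -appStates ALL

# Inspect one application in detail (user, queue, start time, tracking URL).
yarn application -status <application_id>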

[Screenshot of the YARN scheduler page showing the submitted applications]

Just wanted to understand what's going on and how I can kill them, as it's just a test cluster.


Solution

  • In my case, I finally figured out that my cluster had actually been attacked. It happened because the Azure Network Security Group (NSG) was not configured properly. This also resulted in high bandwidth charges (data transfer out), though I got those waived after contacting the Azure support team. After I restricted both inbound and outbound traffic, everything was sorted out: I killed the applications that were in the queue and they never appeared again.

    I was checking online, and it seems Hadoop YARN-based remote code execution (RCE) attacks are actually quite common, typically abusing an Internet-exposed, unauthenticated ResourceManager REST API to submit applications, which would explain why the jobs kept reappearing after being killed. So kindly make sure your NSG is configured properly (see the sketch below).

    Ref:
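
    As a rough illustration of the lockdown described above, here is a hedged Azure CLI sketch that exposes the YARN ResourceManager ports only to a trusted admin network and denies the rest of the inbound Internet traffic. The resource group, NSG name, source CIDR and port list are placeholders, not values from this cluster; adjust them to your own environment.

    # Allow the ResourceManager web UI (8088) and client RPC (8032) only from a
    # trusted admin CIDR (203.0.113.0/24 is a documentation range, used as a placeholder).
    az network nsg rule create \
      --resource-group my-rg \
      --nsg-name my-cluster-nsg \
      --name allow-yarn-from-admin \
      --priority 100 \
      --direction Inbound \
      --access Allow \
      --protocol Tcp \
      --source-address-prefixes 203.0.113.0/24 \
      --destination-port-ranges 8088 8032

    # Deny all other inbound traffic from the Internet at a lower priority,
    # so no Hadoop/YARN port is reachable publicly.
    az network nsg rule create \
      --resource-group my-rg \
      --nsg-name my-cluster-nsg \
      --name deny-inbound-internet \
      --priority 200 \
      --direction Inbound \
      --access Deny \
      --protocol '*' \
      --source-address-prefixes Internet \
      --destination-port-ranges '*'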