Tags: apache-spark, shuffle, large-data, filenotfoundexception

Spark fails when performing a very large shuffle


I'm working with 1 TB of data, and at one point I need to join two smaller dataframes. I don't know their exact size, but it's more than 200 GB, and I get the error below.

The failure occurs in the middle of the operation, after about 2 hours.

It looks like a memory problem to me, but that doesn't make sense, because looking at Ganglia and the Spark UI, RAM usage never reaches its limit, as shown in the screenshot below.

Does anyone have any idea how I can solve this without decreasing the amount of data analyzed?

My cluster has:

  • 1 x master node: n1-highmem-32
  • 4 x slave nodes: n1-highmem-32

    org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 482.1 failed 4 times, most recent failure: Lost task 3.3 in stage 482.1 (TID 119785, 10.0.101.141, executor 1): java.io.FileNotFoundException: /tmp/spark-83927f3e-4511-1b/3d/shuffle_248_72_0.data.f3838fbc-3d38-4889-b1e9-298f743800d0 (No such file or directory)
        at java.io.FileOutputStream.open0(Native Method)

    Caused by: java.io.FileNotFoundException: /tmp/spark-83927f3e-4511-1b/3d/shuffle_248_72_0.data.f3838fbc-3d38-4889-b1e9-298f743800d0 (No such file or directory)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)

Solution

  • These types of errors typically occur when there are deeper problems with some tasks, like significant data skew. Since you don't provide enough details or job statistics, the only approach I can think of is to significantly increase the number of shuffle partitions:

    sqlContext.setConf("spark.sql.shuffle.partitions", 2048)