apache-spark, pyspark, rdd, databricks, partitioning

Spark partition size greater than the executor memory


I have four questions. Suppose in Spark I have 3 worker nodes. Each worker node has 3 executors, and each executor has 3 cores. Each executor has 5 GB of memory. (Total: 9 executors, 27 cores, and 45 GB of memory.) What will happen if:


Solution

  • I'll answer each part as best I know, possibly disregarding a few of your assertions:

    I have four questions. Suppose in Spark I have 3 worker nodes. Each worker node has 3 executors, and each executor has 3 cores. Each executor has 5 GB of memory. (Total: 9 executors, 27 cores, and 45 GB of memory.) What will happen if: >>> I would use 1 Executor, 1 Core. That is the generally accepted paradigm, afaik (see the configuration sketch after this list).

    This is an insightful blog post on Spark OOM errors: https://medium.com/swlh/spark-oom-error-closeup-462c7a01709d
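
As a rough, non-authoritative sketch of that suggestion (reading it as one core per executor, while keeping the 5 GB per executor from the question), the snippet below shows how those sizes would be set in PySpark. The app name, executor count, and repartition value are illustrative assumptions, not anything stated in the original answer.

```python
from pyspark.sql import SparkSession

# Illustrative sizing only; the values are assumptions, not a prescription.
# With one core per executor, each task gets the executor's full share of
# execution memory, which helps when a single partition is large.
spark = (
    SparkSession.builder
    .appName("one-core-per-executor-example")   # hypothetical app name
    .config("spark.executor.instances", "9")    # 3 executors per worker x 3 workers
    .config("spark.executor.cores", "1")        # one core per executor, per the suggestion above
    .config("spark.executor.memory", "5g")      # 5 GB per executor, from the question
    .getOrCreate()
)

# Splitting data into more, smaller partitions is the usual way to keep any
# single partition comfortably below executor memory.
df = spark.range(0, 10_000_000)
df = df.repartition(200)            # 200 is an arbitrary example value
print(df.rdd.getNumPartitions())    # -> 200
```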