hive apache-tez

Is there any scenario where we wouldn't want to reuse Tez containers?


I started using Hive and Tez a few days ago for one of my projects. While doing so, I came across the property tez.am.container.reuse.enabled, which many sites recommend keeping set to true. I understand it's due to:

But I can't think of any scenario where we would want this property to be disabled. I have searched online for such cases but have not been able to find any.

Can anyone help me with this?


Solution

  • In terms of performance, there is no reason not to reuse the containers. The Execution Efficiency section of this paper explains this very well, and it is why the default value for this parameter is true.

    However, I think there are some cases that might explain why this feature is still configurable:

    Do not use the tez.queue.name configuration parameter because it sets all Tez jobs to run on one particular queue.

    Enabling this parameter improves performance by avoiding the memory overhead of reallocating container resources for every task. However, disable this parameter if the tasks contain memory leaks or use static variables.
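
To make the static-variable caveat concrete, here is a minimal sketch in plain Java. It does not use any Tez or Hive API: `ContainerReuseSketch`, `RecordBuffer`, and `runTask` are hypothetical names invented for illustration, and running two "tasks" back to back in one JVM is only an assumed stand-in for what happens when a reused container runs a second task.

```java
import java.util.ArrayList;
import java.util.List;

public class ContainerReuseSketch {

    // Hypothetical user code (for example a helper called from a UDF)
    // that keeps per-task state in a static field.
    static class RecordBuffer {
        private static final List<String> SEEN = new ArrayList<>();

        // Stores the record and returns how many records are held so far.
        static int process(String record) {
            SEEN.add(record);
            return SEEN.size();
        }

        // The cleanup step that leaky code forgets to call between tasks.
        static void reset() {
            SEEN.clear();
        }
    }

    // Stand-in for a single task: processes its records and reports
    // how many records the buffer holds afterwards.
    static int runTask(String... records) {
        int count = 0;
        for (String r : records) {
            count = RecordBuffer.process(r);
        }
        return count;
    }

    public static void main(String[] args) {
        // Both calls run in the same JVM, as two tasks would in a reused container.
        int first = runTask("a", "b");
        int second = runTask("c");

        System.out.println("first task buffer size:  " + first);
        // The second task processed one record, but the static list still
        // holds the two records left behind by the first task.
        System.out.println("second task buffer size: " + second);
    }
}
```

If each task ran in a fresh JVM, the second task would start with an empty buffer. With a shared JVM the static list keeps growing, which shows up both as incorrect per-task state and as memory that is never released. That is the kind of situation in which turning reuse off can serve as a workaround until the code is fixed.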