I need to configure the value of hadoop.service.shutdown.timeout because the shutdown hooks hit a timeout when our MR jobs stop:
2023-08-25 08:44:39,566 [WARN] [Thread-0] [org.apache.hadoop.util.ShutdownHookManager] - ShutdownHook '' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask.get(FutureTask.java:205)
    at org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95)
The problem is that the value takes effect only when it is set in core-site.xml. For example, if we pass it as a property in the YARN_OPTS environment variable (-D hadoop.service.shutdown.timeout) or set it in code (on the Configuration instance that is passed to ToolRunner), the value does change (we checked this through logging), but the timeout still fires based on the value from core-site.xml, or on the default value if the file does not specify one.
Is this property configurable only from the core-site.xml config file?
I believe the short answer is "yes", via core-site.xml only, because Hadoop's ShutdownHookManager creates a new Configuration object by reading core-site.xml when shutting down an executor thread, rather than using the Configuration instance you pass to ToolRunner.
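For context, the stack trace in the question shows the timeout being raised from FutureTask.get, i.e. the hook is run on a worker thread and awaited for a bounded time. Below is a minimal, stdlib-only sketch of that general pattern. It is not Hadoop's actual code; the class name HookTimeoutSketch and the method timedOut are hypothetical names made up for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration class, not part of Hadoop.
public class HookTimeoutSketch {

    /**
     * Runs the hook on a worker thread and waits at most the given time.
     * Returns true if the hook did not finish within the timeout.
     */
    public static boolean timedOut(Runnable hook, long timeout, TimeUnit unit) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> future = executor.submit(hook);
            try {
                future.get(timeout, unit);
                return false;
            } catch (TimeoutException e) {
                // The hook overran its budget: interrupt it and report the timeout.
                future.cancel(true);
                return true;
            } catch (InterruptedException e) {
                // The waiting thread was interrupted; restore the flag.
                Thread.currentThread().interrupt();
                return false;
            } catch (ExecutionException e) {
                // The hook threw an exception, which is a failure but not a timeout.
                return false;
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Runnable slowHook = () -> {
            try {
                Thread.sleep(2_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Runnable fastHook = () -> { };

        System.out.println("slow hook timed out: "
                + timedOut(slowHook, 100, TimeUnit.MILLISECONDS));
        System.out.println("fast hook timed out: "
                + timedOut(fastHook, 5, TimeUnit.SECONDS));
    }
}
```

The point of the sketch is that the waiting code decides the timeout, not the hook. If that code takes the value from a Configuration it builds itself, a value set on a different Configuration instance never reaches it.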
The long answer is longer but is still "probably yes", unless you find a way to follow the rather obscure advice in the comments of that class and register a hook explicitly using addShutdownHook():
> * Unless a hook was registered with a shutdown explicitly set through
> * {@link #addShutdownHook(Runnable, int, long, TimeUnit)},
> * the shutdown time allocated to it is set by the configuration option
> * {@link CommonConfigurationKeysPublic#SERVICE_SHUTDOWN_TIMEOUT} in
> * {@code core-site.xml}, with a default value of
> * {@link CommonConfigurationKeysPublic#SERVICE_SHUTDOWN_TIMEOUT_DEFAULT}
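
A minimal sketch of what that explicit registration might look like, assuming hadoop-common is on the classpath and your Hadoop version provides the four-argument addShutdownHook(Runnable, int, long, TimeUnit) overload named in the comment. The class name ExplicitTimeoutHook, the priority and the 120-second timeout are hypothetical values picked for illustration. Note that this would only set the timeout for the hook you register yourself; as far as I can tell, hooks registered by Hadoop's own code still take their timeout from core-site.xml.

```java
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.util.ShutdownHookManager;

// Hypothetical illustration class; requires hadoop-common on the classpath.
public class ExplicitTimeoutHook {

    // Illustrative values only; pick ones that suit your job.
    static final int PRIORITY = 10;
    static final long TIMEOUT_SECONDS = 120;

    /** Registers a cleanup hook with its own timeout and returns it. */
    public static Runnable register() {
        Runnable hook = () -> System.out.println("cleaning up before JVM exit");
        // The timeout is passed explicitly, so for this hook the manager
        // does not fall back to hadoop.service.shutdown.timeout.
        ShutdownHookManager.get()
                .addShutdownHook(hook, PRIORITY, TIMEOUT_SECONDS, TimeUnit.SECONDS);
        return hook;
    }

    public static void main(String[] args) {
        register();
    }
}
```

Calling ExplicitTimeoutHook.register() early in the driver (for example at the start of the Tool's run method) would put the hook in place before the job finishes.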