I'm writing a shell script that executes a hive command, writing the log and output information to two separate files:
hive -S -f pdr_extrator.sql 2> pdr_extrator_log.txt | sed 's/[\t]/|/g' 1> pdr_extrator_out.txt
The log file, at the end of the execution, is as follows:
log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.DailyRollingFileAppender.
log4j:WARN No such property [maxFileSize] in org.apache.log4j.DailyRollingFileAppender.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.6.0-2800/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.6.0-2800/hive/lib/hive-jdbc-0.14.0.2.2.6.0-2800-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
When I run the query interactively from the command line, I can see the applicationID of my specific query, as shown below:
[Image: ApplicationID - Hive command line]
Is there any way to get the applicationID from the log?
Today I run yarn application -list -appTypes TEZ and look for the application that appears around the time my query starts, so that I can later run yarn application -status application_XXXXX to monitor only my execution.
The problem is that this method is unreliable, because another application may enter the queue at about the same time.
Your help is appreciated.
You are running the Hive query file with the -S (silent) option, which suppresses the log output that contains the YARN application ID.
Try running it without -S:
hive -f pdr_extrator.sql
You should then see a line like the one below on the console, or in the file if the output is redirected.
Status: Running (Executing on YARN cluster with App id application_1579987899994_341626)
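Once that status line is being written, the script can pull the ID out of the log instead of guessing from yarn application -list. The following is only a sketch: the function name extract_app_id is made up for illustration, and it assumes the status line ends up in pdr_extrator_log.txt through the existing 2> redirect and that the first application ID in the file is the one you want.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive or YARN): print the first YARN
# application ID found in the given log file.
# Assumes IDs have the form application_<digits>_<digits>, as in the
# status line shown above.
extract_app_id() {
    grep -oE 'application_[0-9]+_[0-9]+' "$1" | head -n 1
}

# Possible usage, with the file names from the question (note: no -S):
#   hive -f pdr_extrator.sql 2> pdr_extrator_log.txt | sed 's/[\t]/|/g' 1> pdr_extrator_out.txt
#   app_id=$(extract_app_id pdr_extrator_log.txt)
#   yarn application -status "$app_id"
```

If a query file launches more than one application, dropping the head -n 1 would list every ID that appears in the log.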