Tags: hadoop, hive, mapreduce, hadoop-yarn, apache-tez

How does Hive manage queries that use neither Tez nor MapReduce?


create table t1(id int);

I ran the query above on Hive 2.3.6 (MapR Hadoop Distribution 6.3.0).

The default Hive execution engine was Tez. After running the query, I could not see any Tez application launched in the YARN ResourceManager web UI.

So I changed the execution engine to MapReduce:

set hive.execution.engine=mr;

Then I ran the same query again. Again, I could not see any MR application launched in the YARN ResourceManager web UI.

So my questions are: how does Hive manage these types of queries? And where are the details of such queries stored, such as the application ID, start time, and so on?


Solution

  • create table is a metadata-only operation: no data is processed. It only creates records in the metastore database, so no distributed processing framework such as Tez or MR is needed, and YARN is not used.

    Where possible, the compiler translates DDL into metastore operations only.

    Some simple DQL queries can also be answered from the metastore alone, without Tez or MR, if statistics exist and this feature is enabled: https://stackoverflow.com/a/41021682/2700344

    Small tables can also be queried without a distributed framework, using a fetch-only task; see: Why is Fetch task in Hive works faster than Map-only task?
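
The three cases above can be checked with EXPLAIN, which prints the plan without running the query. This is a rough sketch, assuming the table t1 from the question; the exact plan text differs between Hive versions, and whether the statistics shortcut applies depends on the table's statistics being present and up to date.

```sql
-- 1. DDL: the plan should contain only a DDL stage, with no Tez or MR stage.
EXPLAIN CREATE TABLE t1(id int);

-- 2. Statistics-only answer: with this setting enabled and accurate
--    statistics in the metastore, a simple aggregate such as count(*)
--    may be planned as a fetch from metadata rather than a cluster job.
SET hive.compute.query.using.stats=true;
EXPLAIN SELECT count(*) FROM t1;

-- 3. Fetch-only task: with fetch task conversion enabled, a simple
--    SELECT on a small table may be planned as a single Fetch Operator,
--    read directly by the Hive client/HiveServer2 instead of on YARN.
SET hive.fetch.task.conversion=more;
EXPLAIN SELECT * FROM t1 LIMIT 10;
```

If a plan shows only a DDL or fetch stage, no YARN application is submitted for that query, which matches what the question observed in the ResourceManager UI.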