I want to understand why we don't use the NOLOCK keyword at the end SQL queries that run in Impala, Hive or MySQL. For example, I use both Hive and Impala through the Cloudera workbench at work. Cloudera workbench does not recognise the NOLOCK keyword. Similarly, I have used MySQL before as well. Even there, NOLOCK wouldn't work. Why don't we need to use NOLOCK with Hive, Impala, MySQL or Big Query?
Disclaimer: I want to clarify that I have a decent understanding of the NOLOCK feature and its use cases. This question is related to SQL flavours that do not require NOLOCK.
First of all, let's clarify what is NOLOCK. NOLOCK table hint allows to override default transaction isolation level(READ COMMITTED). NOLOCK = READ UNCOMMITTED
transaction isolation level, that allows the query to see the data changes before committing the transaction which is changing the data.
Hive does not support such isolation level (READ UNCOMMITTED
) even in transaction mode. Oracle also does not support READ UNCOMMITTED
, so no need in NOLOCK
hint in these databases.
In Hive you can turn off concurrency.