I would like to know -
I got a few suggestions, like changing the join statements in my SQL queries.
Impala uses an in-memory analytics engine, so being minimalistic in every aspect of a query does the trick.
distinct, regexp, IN, and concat/function calls in a join condition or filter can slow things down. Please make sure they are absolutely necessary and there is no way you can avoid them.
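
As a rough illustration of the point about concat in a join condition, the sketch below compares a join on a concatenated key with the same join on the raw columns. The orders and stores tables and their columns are hypothetical. Python's built-in sqlite3 module stands in for Impala only so the two queries can be run side by side, and SQLite's || operator plays the role of concat. It shows that the results match on this sample data, not how fast either query runs.

```python
import sqlite3

# Hypothetical tables and columns, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, store_id TEXT, amount INTEGER);
    CREATE TABLE stores (region TEXT, store_id TEXT, name TEXT);
    INSERT INTO orders VALUES ('us', '01', 10), ('us', '02', 20), ('eu', '01', 30);
    INSERT INTO stores VALUES ('us', '01', 'Austin'), ('eu', '01', 'Berlin');
""")

# Join key built by string concatenation on both sides (the pattern to avoid).
concat_sql = """
    SELECT o.amount, s.name
    FROM orders o
    JOIN stores s
      ON o.region || o.store_id = s.region || s.store_id
    ORDER BY o.amount
"""

# Same join on the raw columns, with no function call in the join condition.
plain_sql = """
    SELECT o.amount, s.name
    FROM orders o
    JOIN stores s
      ON o.region = s.region AND o.store_id = s.store_id
    ORDER BY o.amount
"""

concat_rows = conn.execute(concat_sql).fetchall()
plain_rows = conn.execute(plain_sql).fetchall()
print(plain_rows)
```

The two forms agree here because the key parts have fixed widths. With variable-length parts a concatenated key can match rows that the column-wise comparison would not ('ab' plus 'c' versus 'a' plus 'bc'), so check that the rewrite is what the query intends before swapping it in.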