I try to write Hive Sql like that
SELECT count(1), substr(date, 1, 4) as year
FROM ***
GROUP BY year
But Hive cannot recognize the alias name 'year', it complains that: FAILED: SemanticException [Error 10004]: Line 1:79 Invalid table alias or column reference 'year'
One solution(Hive: SELECT AS and GROUP BY) suggest to use 'GROUP BY substr(date, 1, 4)'.
It works! However in some cases the value I want to group by may be generated from multiple lines of hive function code, it's very ugly to write code like
SELECT count(1), func1(func2(..........................)) AS something
FROM ***
GROUP BY func1(func2(..........................))
Is there any clean way in Hive to do that? Any suggestions?
Specifying the position in Group By will solve your issue. This position number in Group By works even when SET hive.groupby.orderby.position.alias=true; (Hive 0.12)
SELECT count(1), substr(date, 1, 4) as year
FROM ***
GROUP BY 2;