I have a two tables, main_table & staging_table, main_table contains original data whereas staging_table contains the few of the updated records that I have to add into with main_table data, and for that I am using unique id -PersonID and arrival time - date Below is the query which I am able to execute into SQL
SELECT PersonID, LastName, FirstName, Address, City, max(date)
from
(
select PersonID, LastName, FirstName, Address, City, date from main_table
UNION
select PersonID, LastName, FirstName, Address, City, date from staging_table
) as t
GROUP by t.PersonID;
but while executing into AWS Athena, I am getting following error, SYNTAX_ERROR: '"LastName"' must be an aggregate expression or appear in GROUP BY clause
I suspect that the other columns might differ and you actually want the full record from the most recent date. If this is the case, use row_number()
:
select p.*
from (select p.*,
row_number() over (partition by personid order by date desc) as seqnum
from ((select PersonID, LastName, FirstName, Address, City, date
from main_table
) union all
(select PersonID, LastName, FirstName, Address, City, date
from staging_table
)
) p
) p
where seqnum = 1;
This select one row per PersonId
with the most recent date. The columns come from the most recent row.