I have a Glue job generated from the wizard in the AWS Glue console. I have not changed the default script that was generated for the job. It takes data from a Postgres database table (source) and writes to a table in another Postgres database (target). I have enabled the job bookmark option in the IDE. Whenever the job runs, it copies the full source table to the target table, even when there has been no insert, update or delete in the source. My understanding is that with the bookmark enabled, it should copy only the changes made in the source since the last run, but this is not happening. So if there are 4 rows in the source table, every time the task runs it adds all 4 rows to the target and the row count of target increases by 1. How do I make it process only the changes to the source data since the last run? Further, how does the bookmark work? If a row is modified (with an UPDATE SQL statement) between two runs, how will the job "update" only the correct row?
Bookmarks only work when copying data between two S3 endpoints. JDBC/ODBC is not supported.
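If bookmarks are not available for the JDBC source, one common workaround is to track a "high-water mark" yourself and upsert only rows changed since the last run. The sketch below is not Glue code and is only an illustration: it uses Python's built-in `sqlite3` as a stand-in for the two Postgres databases, and the `items` table, the `updated_at` column, and the `copy_changes` helper are all hypothetical names. It also assumes every insert or update in the source bumps `updated_at`; deletes would not be detected this way.

```python
import sqlite3


def make_db():
    # In-memory stand-in for a Postgres database (assumption for the demo).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
    )
    return conn


def copy_changes(source, target, last_seen):
    """Copy rows changed after `last_seen` and return the new watermark."""
    rows = source.execute(
        "SELECT id, name, updated_at FROM items "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Upsert keyed on the primary key, so an updated source row overwrites
    # the existing target row instead of being appended as a duplicate.
    target.executemany(
        "INSERT OR REPLACE INTO items (id, name, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    # Keep the old watermark when nothing changed.
    return rows[-1][2] if rows else last_seen


source, target = make_db(), make_db()
source.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "a", 1), (2, "b", 2), (3, "c", 3), (4, "d", 4)],
)
source.commit()

watermark = copy_changes(source, target, 0)          # first run: copies all 4 rows
watermark = copy_changes(source, target, watermark)  # second run: nothing to copy

# A row is updated between runs; only that row is picked up next time.
source.execute("UPDATE items SET name = 'b2', updated_at = 5 WHERE id = 2")
source.commit()
watermark = copy_changes(source, target, watermark)

target_rows = target.execute("SELECT id, name FROM items ORDER BY id").fetchall()
```

In a real job the watermark would need to be persisted between runs (for example in a small control table), and on Postgres the upsert would typically be written as `INSERT ... ON CONFLICT (id) DO UPDATE` rather than SQLite's `INSERT OR REPLACE`.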