etlpentahokettlepentaho-spoonpdi

How to delete rows which were deleted in source table (Pentaho data integration)


I need to do an transformation which searches for id's coming from 2 different tables, and deletes from table 2 where the id of table two is not in table 1 (I'm using Pentaho)

That is harder than I thought, at least, o could not find the right step to use my logic.

Anyone have an idea on this?

(there are similar questions in here, but they were not useful for me)


Solution

  • You can use a Merge join step, defining an outer join by ID, so you retrieve all rows in Table2 even if there's no equivalent ID in Table1.

    You follow the Merge join by a Filter step, where you retrieve all the rows out of the Merge join with the ID from Table1 NULL.

    With that you have all the rows to delete.