etl, talend, data-manipulation, ab-initio

Ab Initio component to stop the graph if duplicate rows/records are found


Hi, I have an Ab Initio graph that does some data manipulation and then loads the result into a table. I am looking for some sort of validation component that ends the process (before the data is loaded into the table) if it finds duplicate rows.

The duplicate rows will each have a unique ID, but maybe I could ignore that column (that part of the record) when comparing.


Solution

  • Pass the flow to a Dedup component.

    In the Dedup component, select the unique option for the output. The out port will then carry only the records that have no duplicates.

    If there are duplicate records, they will go through the dup port. You can collect those records in an intermediate file (for auditing purposes) and then continue the graph as your requirements dictate.

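    If you want to abort the process as soon as all the duplicates have been found, you can do that with phasing: put the duplicate check in an earlier phase than the table load, so the load does not start if the check fails.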

    Also, if you don't want any records inserted into the DB when the input contains duplicates, you can pass just the key fields to Dedup. That makes the check faster.
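The check described above can be sketched outside Ab Initio to make the logic concrete. This is a minimal Python illustration, not Ab Initio code: the function names, the `DuplicateRecordsError` exception, and the `id` field name are all hypothetical. It assumes records are dictionaries with hashable values, compares them on every field except the ignored ID, and assumes that the unique output should hold only records whose key occurs exactly once, with every other record going to the duplicates list.

```python
from collections import Counter


class DuplicateRecordsError(Exception):
    """Raised when the input contains duplicate records (hypothetical name)."""


def split_unique_and_duplicates(records, ignore_fields=("id",)):
    """Split records into (unique, duplicates), ignoring the given fields.

    Loosely mirrors the out and dup ports described above: a record is
    'unique' only if no other record shares its comparison key.
    """
    def key_of(record):
        # Comparison key: every field except the ignored ones, in a stable order.
        return tuple(sorted((k, v) for k, v in record.items()
                            if k not in ignore_fields))

    counts = Counter(key_of(r) for r in records)
    unique = [r for r in records if counts[key_of(r)] == 1]
    duplicates = [r for r in records if counts[key_of(r)] > 1]
    return unique, duplicates


def validate_before_load(records, ignore_fields=("id",)):
    """Return the records if none are duplicated; otherwise abort before the load."""
    unique, duplicates = split_unique_and_duplicates(records, ignore_fields)
    if duplicates:
        raise DuplicateRecordsError(
            f"{len(duplicates)} duplicate record(s) found; load aborted")
    return unique
```

Calling `validate_before_load(rows)` before the load step plays the role of the earlier phase: if it raises, the load never runs, and the duplicates list can be written to an audit file first if needed.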