jdbcapache-camelcamel-jdbc

Apache Camel JDBC Polling and Idempotency


I'm building a simple camel route that should continuously poll a table and push data into ActiveMQ. Each poll should only pull data not previously pulled. The best way I can think of to do this is to keep track of the last successfully processed sequence id and then select items whose sequence id is greater than the previous one.

Is there a standard way to do this?


Solution

  • It's possible, in a multi user database, for rows with a lower sequence id to be committed after rows with a higher id (certainly in Oracle & SQLServer, I suspect in any dbms with transaction support). In this case simply keeping track of the last processed id could result in rows which are never processed.

    The simplest solution to the problem, if you have control over the schema and are the only thing processing this table, is to add some sort of ‘processed’ column to the table and update that column (as @Arnaud suggests).

    If this isn’t an option then there are 3 other mechanism I have considered for solving this issue:

    1. Add an on-insert trigger to your table to insert a record in a companion table which references your table and includes a processed column. Change your query to join the two tables, and mark the row as processed by updating the companion table in the same way as above.
    2. Create a companion table and insert rows in here to mark a row in your main table as processed. Your poll query will then need to look for rows which don’t exist in the companion table.
    3. Keep track of any missing sequence ids. Look for those explicitly on each poll.

    I opted for 2, as it was easier to implement in a DBMS independent fashion, and required just an insert to mark a row as processed.