gemfire geode spring-data-gemfire spring-boot-data-geode

Multiple data insertions using async writing with Apache Geode


We have Apache Geode connected to Postgres through an AEQ (AsyncEventQueue) whose AsyncEventListener is configured to write data to Postgres. During an async write, we submit the list of events we want to persist, and those events are inserted asynchronously. Say I have two client applications that call processEvents for async writing, and both have some events in common that violate some key. After a client calls processEvents, control returns to the client immediately. In that case, how will the client know whether an issue occurred? What are the best practices for tackling this?


Solution

  • What do you mean by the events in common "violate some key"? Like a primary or foreign key constraint, or some other database constraint perhaps (e.g. uniqueness, non-null values, etc.)?

    How you handle a conflict depends on the nature and importance of the data being written from Geode to the backend (Postgres) database, and on its significance to the application from a requirements and business-logic point of view.

    If 2 (or more) client applications are writing to the same cache/database entries/records, then certainly some type of collision will eventually occur, and how it is handled will depend on the data and the type of operation performed on the data.

    In general, handling the violation closer to where and when it occurs (e.g. inside the AsyncEventListener itself) may be preferable, since there you should have most of the necessary information (e.g. the DataAccessException, the events, and the ability to query the DB) to deal with the situation.
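As a rough illustration of handling the violation inside the listener, here is a minimal plain-Java sketch. It deliberately avoids the Geode types so it runs on its own: `WriteEvent`, `InMemoryTable`, and `ConstraintAwareListener` are hypothetical stand-ins (for an AsyncEvent, the Postgres table, and an AsyncEventListener), and `IllegalStateException` stands in for whatever exception your data access layer throws on a constraint violation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a Geode AsyncEvent: only a key and a value.
final class WriteEvent {
    final String key;
    final String value;

    WriteEvent(String key, String value) {
        this.key = key;
        this.value = value;
    }
}

// Hypothetical stand-in for the Postgres table: rejects a repeated key,
// the way a primary-key or unique constraint would.
final class InMemoryTable {
    private final Set<String> keys = new HashSet<>();

    void insert(WriteEvent event) {
        if (!keys.add(event.key)) {
            throw new IllegalStateException("duplicate key: " + event.key);
        }
    }
}

// Mirrors the shape of a listener's processEvents method, minus the Geode types.
final class ConstraintAwareListener {
    private final InMemoryTable table;
    final List<WriteEvent> rejected = new ArrayList<>();

    ConstraintAwareListener(InMemoryTable table) {
        this.table = table;
    }

    boolean processEvents(List<WriteEvent> events) {
        for (WriteEvent event : events) {
            try {
                table.insert(event);
            } catch (IllegalStateException violation) {
                // Decide here, with the event and the exception in hand:
                // skip it, update instead, or record it for later handling.
                rejected.add(event);
            }
        }
        // true = batch handled; in Geode, returning false makes the queue redeliver it.
        return true;
    }
}
```

Catching per event rather than per batch means one violating event does not block the rest of the batch; whether that is the right trade-off depends on your data.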

    Inside the AEQ Listener, you could employ different strategies depending on the data and operation as determined by the application:

    You could employ Geode to conflate events stored in the AEQ for the same key, which should minimize collisions/conflicts.
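In Geode itself, conflation is a queue setting (for example `setBatchConflationEnabled(true)` on the `AsyncEventQueueFactory`). To show what it means for the events, here is a small plain-Java sketch of the idea only, not of Geode's implementation: `ConflationSketch` and its `conflate` method are hypothetical names, and events are modelled as simple key/value entries.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ConflationSketch {
    // Keeps only the most recent value per key, ordered by most recent update.
    static Map<String, String> conflate(List<Map.Entry<String, String>> queued) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Map.Entry<String, String> event : queued) {
            latest.remove(event.getKey()); // drop the older event for this key, if any
            latest.put(event.getKey(), event.getValue());
        }
        return latest;
    }
}
```

With queued events (a, 1), (b, 2), (a, 3), only (b, 2) and (a, 3) would reach the database, so the listener sees one write per key instead of two competing ones.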

    If the client (as in "client" in a client/server topology) needs to be informed, then you could write the failed events to another Region, on which a client registers a CQ (continuous query) to be notified when entries are written to this (failed events) Region. The client-side handler associated with the CQ could then take the appropriate action, such as notifying the end-user, refreshing and then retrying the operation, and so on.
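The failed-events flow can be sketched in plain Java as follows. This only models the pattern: `FailedEventStore` is a hypothetical stand-in for the failed-events Region, and `subscribe` stands in for a client registering a CQ with a listener. In a real deployment the notification would travel from server to client through Geode's continuous query mechanism rather than an in-process callback.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a "failed events" Region with CQ-style subscribers.
final class FailedEventStore {
    private final Map<String, String> entries = new ConcurrentHashMap<>();
    private final List<BiConsumer<String, String>> subscribers = new CopyOnWriteArrayList<>();

    // Plays the role of a client registering a CQ on the failed-events Region.
    void subscribe(BiConsumer<String, String> handler) {
        subscribers.add(handler);
    }

    // Called from the listener when a write to the database is rejected.
    void recordFailure(String key, String reason) {
        entries.put(key, reason);
        for (BiConsumer<String, String> handler : subscribers) {
            handler.accept(key, reason);
        }
    }

    String reasonFor(String key) {
        return entries.get(key);
    }
}
```

A client-side handler might then look like `store.subscribe((key, reason) -> retryOrNotify(key, reason))`, where `retryOrNotify` is whatever recovery action your application defines.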

    Given the async nature of the initial write, you can only respond asynchronously, once the violation occurs. This is not unlike the Reactive world (namely, onSuccess/onFailure event handlers).
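To make the onSuccess/onFailure analogy concrete, here is a minimal sketch using the JDK's `CompletableFuture`. It is not how Geode's AEQ reports results; `AsyncWriteOutcome`, `write`, and `writeAndReport` are hypothetical names, and the `violatesConstraint` flag simply simulates a database rejection.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class AsyncWriteOutcome {
    // Hypothetical async write: fails when 'violatesConstraint' is true.
    static CompletableFuture<String> write(String key, boolean violatesConstraint) {
        return CompletableFuture.supplyAsync(() -> {
            if (violatesConstraint) {
                throw new IllegalStateException("constraint violated for " + key);
            }
            return key;
        });
    }

    // The caller gets control back immediately; the outcome arrives later.
    static CompletableFuture<String> writeAndReport(String key, boolean violatesConstraint) {
        return write(key, violatesConstraint).handle((storedKey, error) -> {
            if (error == null) {
                return "stored " + storedKey; // onSuccess path
            }
            // Async failures usually arrive wrapped in a CompletionException.
            Throwable cause = (error instanceof CompletionException && error.getCause() != null)
                    ? error.getCause()
                    : error;
            return "failed: " + cause.getMessage(); // onFailure path
        });
    }
}
```

The point is the shape: the caller never sees the exception at submission time, only through a handler that runs once the write has actually been attempted.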

    So, in this situation, I don't think there really is a "best practice" per se, only "recommendations". For example, handle the situation as near to the actual occurrence of the violation as possible, since that is usually where the information needed to make the best-informed decision on the right course of action is readily available.

    Sometimes you can automate the recovery; other times you might need manual intervention. Most definitely, do not guess. Clearly document your application's/system's (configured) behavior, both when it can handle a situation and when it cannot.

    I don't think there is a general, one-size-fits-all solution in this case.

    I hope this gives you some ideas to think about.