domain-driven-designevent-sourcingdddd

How to ensure data consistency between two different aggregates in an event-driven architecture?


I will try to keep this as generic as possible using the “order” and “product” example, to try and help others that come across this question.

The Structure:

In the application we have 3 different services, 2 services that follow the event sourcing pattern and one that is designed for read only having the separation between our read and write views:
- Order service (write)
- Product service (write)
- Order details service (Read)

The Background:

We are currently storing the relationship between the order and product in only one of the write services, for example within order we have a property called ‘productItems’ which contains a list of the aggregate Ids from Product for the products that have been added to the order. Each product added to an order is emitted onto Kafka where the read service will update the view and form the relationships between the data.  

The Problem:

As we pull back by aggregate Id for the order and the product to update them, if a product was to be deleted, there is no way to disassociate the product from the order on the write side.   This in turn means we have inconsistency, that the order holds a reference to a product that no longer exists within the product service.

The Ideas:

  1. Master the relationship on both sides, which means when the product is deleted, we can look at the associated orders and trigger an update to remove from each order (this would cause duplication of reference).
  2. Create another view of the data that shows the relationships and use a saga to do a clean-up. When a delete is triggered, it will look up the view database, see the relationships within the data and then trigger an update for each of the orders that have the product associated.
  3. Does it really matter having the inconsistencies if the Product details service shows the correct information? Because the view database will consume the product deleted event, it will be able to safely remove the relationship that means clients will be able to get the correct view of the data even if the write models appear inconsistent. Based on the order of the events, the state will always appear correct in the read view.

    Another thought: as the aggregate Id is deleted, it should never be reused which means when we have checks on the aggregate such as: “is this product in the order already?” will never trigger as the aggregate Id will never be repurposed meaning the inconsistency should not cause an issue when running commands in the future.

Sorry for the long read, but these are all the ideas we have thought of so far, and I am keen to gain some insight from the community, to make sure we are on the right track or if there is another approach to consider.  

Thank you in advance for your help.


Solution

  • Event sourcing suites very well human and specifically human-paced processes. It helps a lot to imagine that every event in an event-sourced system is delivered by some clerk printed on a sheet of paper. Than it will be much easier to figure out the suitable solution.

    What's the purpose of an order? So that your back-office personnel would secure the necessary units at a warehouse, then customer would do a payment and you start shipping process.

    So, I guess, after an order is placed, some back-office system can process it and confirm that it can be taken into work and invoicing. Or it can return the order with remarks that this and that line are no longer available, so that a customer could agree to the reduced order or pick other options.

    Another option is, since the probability of a customer ordering a discontinued item is low, just not do this check. But if at the shipping it still occurs - then issue a refund and some coupon for inconvenience. Why is it low? Because the goods are added from an online catalogue, which reflects the current state. The availability check can be done on the 'Submit' button click. So, an inconsistency may occur if an item is discontinued the same minute (or second) the order has been submitted. And usually the actual decision to discontinue is made up well before the information was updated in the Product service due to some external reasons.

    Hence, I suggest to use eventual consistency. Since an event-sourced entity should only be responsible for its own consistency and not try to fulfil someone else's responsibility.