How to handle related entities in an event sourced aggregate

I'm venturing into the world of ES and CQRS and I've been reading a lot about it. Unfortunately, most materials don't go beyond the basics, and when you try to apply it to real-world problems you start stumbling into things the examples don't cover.

My question is regarding related entities. A very basic example would be the following:

Imagine we have products and categories. A product can have a single category, and a category can have several products. Some categories could add an x% to their product final prices.

Product is an aggregate that is event sourced and it should have a Category property. But thinking in DDD principles, I save only a CategoryId in it. Product also has a CalculateFinalPrice method that would take its base price and add the % from the category.

Category might not be an aggregate by itself, but since you have many products in one category, it doesn't make sense to save it together with Product, so they must be persisted separately from the Product aggregate.

My question arises when I need to rebuild a Product from its events.

When the product is reconstructed from the events, I'll have only a CategoryId because that's what I saved. But for the domain logic, I need the Category.

Now if I need to perform some business logic based on some state of the category, I cannot.

I could have another 'complete' Product model, then I would construct it based on the original persisted Product, and for the related entities I would have to go through the whole list of categories, find the one needed and set the property in the 'complete' Product. I don't know, but something doesn't seem right with this approach.

What's the usual approach in such a situation?

Solution

Aggregate boundaries are driven by consistency boundaries, because that's ultimately all they are. Thinking about the problem in terms of commands (intentions to change the state of the system), a rough definition of consistency in an event-sourced persistence model is this: given two commands A (resulting in a collection of events X) and B (resulting in a collection of events Y), does Y depend on X or vice versa? If two commands have such a consistency relationship, that's a very strong signal that your domain wants them to be commands against the same aggregate ("wants" meaning that trying to make them commands against different aggregates will likely result in a lot of pain).

So, thinking in commands, there's likely going to be a SetCategorySurchargePercentage command that sets the surcharge percentage for a category, and a SetProductBasePrice command that sets the (pre-surcharge) price for a product. Assuming that we want CalculateFinalPrice to be consistent with both SetCategorySurchargePercentage and SetProductBasePrice, all three operations want to be against the same aggregate (CalculateFinalPrice is probably actually a query, but the consistency demands effectively turn it into a read-only command). A category would then contain the price information for all the products in that category.
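As a rough illustration of that single-aggregate design, here is a minimal Python sketch. The event class names, the `apply`/`from_events` methods, and the use of a plain dict for base prices are all assumptions made for the example, not part of any particular framework.

```python
from dataclasses import dataclass, field
from decimal import Decimal


# Hypothetical events recorded by successful commands.
@dataclass(frozen=True)
class CategorySurchargePercentageSet:
    percentage: Decimal


@dataclass(frozen=True)
class ProductBasePriceSet:
    product_id: str
    base_price: Decimal


@dataclass
class Category:
    """Hypothetical aggregate owning the surcharge and its products' base prices."""
    surcharge_percentage: Decimal = Decimal("0")
    base_prices: dict = field(default_factory=dict)

    def apply(self, event):
        # Fold one event into the aggregate state.
        if isinstance(event, CategorySurchargePercentageSet):
            self.surcharge_percentage = event.percentage
        elif isinstance(event, ProductBasePriceSet):
            self.base_prices[event.product_id] = event.base_price

    @classmethod
    def from_events(cls, events):
        # Rebuild the aggregate by replaying its event stream in order.
        category = cls()
        for event in events:
            category.apply(event)
        return category

    def calculate_final_price(self, product_id):
        # Always consistent with the latest surcharge and base price,
        # because both live in the same aggregate.
        base = self.base_prices[product_id]
        return base * (1 + self.surcharge_percentage / 100)


category = Category.from_events([
    ProductBasePriceSet("p1", Decimal("100")),
    CategorySurchargePercentageSet(Decimal("10")),
])
final_price = category.calculate_final_price("p1")
```

Because every price change and every surcharge change goes through the same event stream, rebuilding the category is enough to answer the price question without loading anything else.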

There are reasons this might not be ideal: for categories with a lot of products and/or a lot of price calculations, the sequencing is going to create a lot of contention. However, the consistency demands (or at least our understanding of those demands) pretty much forced us here. It should be noted that things like foreign key constraints in relational databases hide this point of contention; they do not eliminate it.

If instead there were different consistency demands, alternative solutions are possible. For instance, if we decided that a delay (even a possibly observable one) between SetCategorySurchargePercentage and the new percentage being reflected in CalculateFinalPrice is acceptable, we can have the former command be against a category aggregate and project the resulting event from a successful command into commands against the products in that category. The product aggregate would contain, as part of its state, the latest percentage it knows about.
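A minimal sketch of this eventually consistent variant might look like the following. The `UpdateKnownSurchargePercentage` command, the `KnownSurchargePercentageUpdated` event, and the `commands_for` function are hypothetical names invented for the example; how the list of product ids for a category is obtained is left as an assumption (typically some read-side lookup).

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CategorySurchargePercentageSet:
    """Event emitted by the (hypothetical) category aggregate."""
    category_id: str
    percentage: Decimal


@dataclass(frozen=True)
class UpdateKnownSurchargePercentage:
    """Hypothetical command sent to each product in the category."""
    product_id: str
    percentage: Decimal


@dataclass(frozen=True)
class KnownSurchargePercentageUpdated:
    """Hypothetical event recorded in the product's own stream."""
    product_id: str
    percentage: Decimal


@dataclass
class Product:
    product_id: str
    category_id: str
    base_price: Decimal
    # The latest percentage this product has been told about.
    known_surcharge_percentage: Decimal = Decimal("0")

    def handle(self, command):
        # Decide: turn the command into an event, then apply it.
        event = KnownSurchargePercentageUpdated(self.product_id, command.percentage)
        self.apply(event)
        return [event]

    def apply(self, event):
        if isinstance(event, KnownSurchargePercentageUpdated):
            self.known_surcharge_percentage = event.percentage

    def calculate_final_price(self):
        # Uses only the product's own state; may lag behind the category.
        return self.base_price * (1 + self.known_surcharge_percentage / 100)


def commands_for(event, product_ids_in_category):
    """Project one category event into one command per product."""
    return [
        UpdateKnownSurchargePercentage(product_id, event.percentage)
        for product_id in product_ids_in_category
    ]


product = Product("p1", "c1", Decimal("100"))
price_before = product.calculate_final_price()

category_event = CategorySurchargePercentageSet("c1", Decimal("10"))
for command in commands_for(category_event, ["p1"]):
    product.handle(command)
price_after = product.calculate_final_price()
```

Between the category event being stored and the product handling its command, `calculate_final_price` still answers with the old percentage; that window is the delay accepted in exchange for smaller aggregates.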

Another approach, in keeping with CQRS, would be to have CalculateFinalPrice be answered by a read-model (e.g. a DB table of final prices). A process subscribes to the category percentage change events, the "this product is in this category" events, and the product base price events and makes appropriate updates to its table: note that if taking this approach there will now be a (possibly observable) delay between SetCategorySurchargePercentage or SetProductBasePrice and when CalculateFinalPrice reflects the result.
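The read-model approach could be sketched roughly as below. The class and handler names are made up for the example, in-memory dicts stand in for the DB table, and the subscription mechanism that delivers events to the handlers is assumed to exist elsewhere.

```python
from decimal import Decimal


class FinalPriceReadModel:
    """Hypothetical projection that keeps a table of final prices up to date."""

    def __init__(self):
        self.surcharge_by_category = {}
        self.category_by_product = {}
        self.base_price_by_product = {}
        self.final_prices = {}  # stands in for the DB table of final prices

    def on_category_surcharge_set(self, category_id, percentage):
        self.surcharge_by_category[category_id] = percentage
        # Every product currently in this category needs a new final price.
        for product_id, cid in self.category_by_product.items():
            if cid == category_id:
                self._recalculate(product_id)

    def on_product_assigned_to_category(self, product_id, category_id):
        self.category_by_product[product_id] = category_id
        self._recalculate(product_id)

    def on_product_base_price_set(self, product_id, base_price):
        self.base_price_by_product[product_id] = base_price
        self._recalculate(product_id)

    def _recalculate(self, product_id):
        base = self.base_price_by_product.get(product_id)
        if base is None:
            return  # no base price seen yet, nothing to publish
        category_id = self.category_by_product.get(product_id)
        percentage = self.surcharge_by_category.get(category_id, Decimal("0"))
        self.final_prices[product_id] = base * (1 + percentage / 100)

    def calculate_final_price(self, product_id):
        # The query side only reads the precomputed value.
        return self.final_prices[product_id]


read_model = FinalPriceReadModel()
read_model.on_product_base_price_set("p1", Decimal("200"))
read_model.on_product_assigned_to_category("p1", "c1")
read_model.on_category_surcharge_set("c1", Decimal("5"))
final_price = read_model.calculate_final_price("p1")
```

With this design the Product aggregate never needs the Category at all when it is rebuilt from events; the price calculation lives entirely on the query side.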