
When to use entity groups in GAE's Datastore


Following up on my earlier question regarding GAE Datastore entity hierarchies, I'm still confused about when to use entity groups.

Take this simple example:

This looks like a case where I could make Employee a child entity of Company, but what are the practical consequences? Does this improve scalability, hurt scalability, or have no impact? What are other advantages/disadvantages of using or not using an entity hierarchy?

(Entity groups enable transactions, but assume for this example that I do not need transactions).


Solution

  • Nick stated clearly that you should not make entity groups larger than necessary; the article "Best practices for writing scalable applications" has some discussion on why.

    Use entity groups when you need transactions. In the example you gave, a ReferenceProperty on Employee that points to its Company will achieve a similar result.

    Aside from transactions, entity groups can be helpful because key-fetches and queries can be keyed off of a parent entity. However, you might want to consider multitenancy for these types of use-cases.

    Ultimately, large entity groups might hurt scalability, because entities within an entity group are stored in the same tablet. The more you cram into one entity group, the less work can be done in parallel; it has to be done serially instead.
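
To make the serialization point concrete, here is a minimal sketch in plain Python. It is not the Datastore API: it assumes a toy model in which a key is a path of (kind, id) pairs and the root pair identifies the entity group. The helper names `entity_group` and `writes_per_group`, and the company id `acme`, are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Toy model only: a key is a path of (kind, id) pairs, and the first pair
# (the root) identifies the entity group. This is not the Datastore API.


def entity_group(key_path):
    """Return the root (kind, id) pair of a key path."""
    return key_path[0]


def writes_per_group(key_paths):
    """Count how many writes fall into each entity group."""
    return Counter(entity_group(path) for path in key_paths)


# Option A: every Employee is a child of its Company.
child_keys = [(("Company", "acme"), ("Employee", i)) for i in range(1000)]

# Option B: every Employee is a root entity that refers to its Company
# through a property (the ReferenceProperty approach), so each Employee
# forms its own entity group.
root_keys = [(("Employee", i),) for i in range(1000)]

busiest_with_parent = max(writes_per_group(child_keys).values())
busiest_without_parent = max(writes_per_group(root_keys).values())

print(busiest_with_parent)     # 1000 writes contend for a single group
print(busiest_without_parent)  # 1 write per group
```

In this toy model, writing all 1000 employees as children of one company puts every write in the same group, while root-level employees spread across 1000 groups. That is the trade-off described above: the parent relationship buys transactions and ancestor queries at the cost of write parallelism.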