
Google Datastore bulk retrieve data using urlsafe


Is there a way in Google DataStore to bulk fetch entities using their urlsafe key values?

I know about ndb.get_multi([list]), which takes a list of keys and retrieves the entities in bulk, which is more efficient. But in our case we have a webpage with a few hundred entities, each embedded with its urlsafe key value. At first we were only operating on single entities, so we could use the urlsafe value to retrieve the entity and perform the operation without much trouble. Now we need to change multiple entities at once, and looping over them one by one does not sound like an efficient approach. Any thoughts?

Is there any advantage to using the entity's key ID directly (versus the key's urlsafe value)? The documentation for get_by_id() does not suggest it can fetch entities in bulk (it takes only one ID).

If the only way to retrieve entities in bulk is by using their keys, yet exposing keys on the webpage is not a recommended approach, does that mean we're stuck when it comes to bulk operations on a page with a few hundred entities?


Solution

  • Keys and urlsafe strings are in an exact 1:1 relationship. When you have one you can obtain the other:

    urlsafe_string = entity_key.urlsafe()
    entity_key = ndb.Key(urlsafe=urlsafe_string)
    

    So if you have a bunch of urlsafe strings, you can obtain the corresponding keys, use ndb.get_multi() with those keys to get all the entities, modify them as needed, then use ndb.put_multi() to save them back into the datastore.
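
    The convert-then-batch flow described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: bulk_update and mutate are hypothetical names, and the ndb module is passed in as a parameter purely so the function can be exercised outside App Engine (in an application you would pass google.appengine.ext.ndb). It skips keys whose entities no longer exist, since ndb.get_multi() returns None in those positions.

```python
def bulk_update(ndb, urlsafe_strings, mutate):
    """Fetch, modify and save many entities using batch calls.

    ndb: the NDB module (google.appengine.ext.ndb in an App Engine app).
    urlsafe_strings: urlsafe key strings, e.g. posted back from the page.
    mutate: hypothetical callback that modifies one entity in place.
    """
    # urlsafe string -> key is a local conversion, no datastore call.
    keys = [ndb.Key(urlsafe=s) for s in urlsafe_strings]

    # Batch read; entries are None for keys with no stored entity.
    entities = [e for e in ndb.get_multi(keys) if e is not None]

    for entity in entities:
        mutate(entity)

    # Batch write for everything that was modified.
    ndb.put_multi(entities)
    return entities
```

    Since the urlsafe strings come back from the browser, it is probably worth checking each decoded key (for example its kind, via key.kind()) before acting on it.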

    As for using IDs: that only works (in a convenient manner) if you do not use entity ancestry. Otherwise, to obtain a key you need both the ID and the entity's parent key (or its entire ancestry). That is not convenient; it is better to use urlsafe strings in this case.
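
    To make the ancestry point concrete, here is a small sketch of rebuilding a child key from IDs. child_key and the kind names in the usage note are hypothetical, and ndb is again passed in only so the snippet can run outside App Engine. It shows that the child's ID alone is not enough: the parent's kind and ID have to be carried along too.

```python
def child_key(ndb, parent_kind, parent_id, child_kind, child_id):
    """Build the key of an entity that has exactly one ancestor."""
    # The parent must be known; the child's ID alone does not identify it.
    parent = ndb.Key(parent_kind, parent_id)
    return ndb.Key(child_kind, child_id, parent=parent)
```

    For example, child_key(ndb, 'Account', 42, 'Order', 7) builds the same key as the flat form ndb.Key('Account', 42, 'Order', 7).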

    But for entities with no parents (aka root entities in their respective entity groups), the entity keys of a given kind and their IDs are always in a 1:1 relationship, and again you can obtain one if you have the other:

    entity_key_id = entity_key.id()
    entity_key = ndb.Key(MyModel, entity_key_id)
    

    So again, from a bunch of IDs you can obtain keys to use with ndb.get_multi() and/or ndb.put_multi().
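
    A sketch of the ID-based variant, under the assumption that the entities are root entities with integer IDs. IDs read back from a page or form arrive as strings, and ndb.Key('MyModel', '123') (a string name) is a different key from ndb.Key('MyModel', 123) (an integer ID), so the sketch converts them first. entities_from_ids is a hypothetical helper, and ndb is passed in only so the snippet can run outside App Engine.

```python
def entities_from_ids(ndb, kind, raw_ids):
    """Batch-fetch root entities of one kind from their integer IDs.

    Returns a dict mapping each ID to its entity, or to None if missing.
    """
    # Assumption: integer IDs. Convert, because form/URL values are strings.
    ids = [int(raw) for raw in raw_ids]
    keys = [ndb.Key(kind, i) for i in ids]
    # get_multi returns results in the same order as the keys passed in.
    return dict(zip(ids, ndb.get_multi(keys)))
```

    In an application, kind could be the model class itself (MyModel) or its kind name as a string.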

    Using IDs can have a cosmetic advantage over the urlsafe strings: they are typically shorter and easier on the eyes when they appear in URLs or in the page HTML code :)

    Another advantage of using IDs is the ability to split large entities, or to deal more simply with entities in a 1:1 relationship. See "Re-using an entity's ID for other entities of different kinds - sane idea?"

    For more info on keys and IDs see Creating and Using Entity Keys.