Is there any clever way to avoid making a costly query with an IN clause in cases like the following one?
I'm using Google App Engine to build a Facebook application and at some point I (obviously) need to query the datastore to get all the entities that belong to any of the facebook friends of the given user.
Suppose I have a couple of entities modeled as such:
class Thing(db.Model):
owner = db.ReferenceProperty(reference_class=User, required=True)
owner_id = db.StringProperty(required=True)
...
and
class User(db.Model):
id = db.StringProperty(required=True)
...
At some point I query Facebook to get the list of friends of a given user and I need to perform the following query
# get all Thing instances that belong to friends
query = Thing.all()
query.filter('owner_id IN', friend_ids)
If I did that, AppEngine would perform a subquery for each id in friend_ids
, probably exceeding the maximum number of subqueries any query can spawn (30).
Is there any better way to do this (i.e. minimizing the number of queries)?
I understand that there are no relations and joins using the datastore but, in particular, I would consider adding new fields to the User
or Thing
class if it helps in making things easier.
I don't think there's an elegant solution, but you could try this:
On the User model, use Facebook ID as the key name, and store each user's list of things in a ListProperty.
class Thing(db.Model):
...
class User(db.Model):
things = db.ListProperty(db.Key)
...
Entity creation would go like this:
user = User.get_or_insert(my_facebook_id)
thing = Thing()
thing.put()
user.things.append(thing.key())
user.put()
Retrieval takes 2 queries:
friends = User.get_by_key_name(friend_ids)
thing_keys = []
for friend in friends:
thing_keys.extend(friend.things)
things = db.get(thing_keys)