Elasticsearch: choose indexing strategy for private search per user


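For example, I have 1000 users. Each user's data is small, 1 GB at most. I am considering two indexing strategies: first, a single index shared by all users, with a user_id field on every document; second, a separate index for each user.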

I think the second method is much faster because the query does not need to include user_id. The first method might be slower because a search hits many shards and the query must also filter on user_id.

However, some references (ref1, ref2) recommend keeping the total number of shards relatively small.

In a practical environment, what is a good solution for my situation?


Solution

  • It's a waste of resources to create one index per user, especially if you have 1000+ users. If your app is successful and your user base grows, so will the number of indices and, as a result, the number of shards. Even with one shard per index, 1000 shards already consume a considerable amount of resources.

    It's much more efficient to have a single index and throw all your users in it with a user_id field to discriminate each user's data.
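
As a rough sketch of what the single-index approach looks like at query time, the search body below restricts results to one user with a `term` filter on `user_id`. The helper name `build_user_search`, the `content` field, and the sample values are hypothetical. The sketch also assumes `user_id` is mapped as a `keyword` field so that the `term` filter matches exactly.

```python
import json


def build_user_search(user_id, text):
    """Build a search body limited to one user's documents.

    Hypothetical helper: the `content` field name is an assumption,
    and `user_id` is assumed to be mapped as a keyword field.
    """
    return {
        "query": {
            "bool": {
                # Full-text part of the search (scored).
                "must": [{"match": {"content": text}}],
                # Non-scoring restriction to a single user's data.
                "filter": [{"term": {"user_id": user_id}}],
            }
        }
    }


body = build_user_search("user-42", "quarterly report")
print(json.dumps(body, indent=2))
```

The resulting body would be sent to the shared index's `_search` endpoint. Placing the user restriction in the `filter` clause means it does not affect relevance scoring, and Elasticsearch can cache filter clauses.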