python elasticsearch parallel-processing pyes

How to improve parallelism in a Python client for Elasticsearch?


I am new to Elasticsearch and need to optimize a Python client that runs searches and indexing against an Elasticsearch cluster. The bottleneck seems to be the client itself: Elasticsearch could handle more queries than the client is sending. How can I make my program faster? Should I use multiprocessing or multithreading, or is there a more elegant way to do this? Thank you.


Solution

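  • If your ES server can comfortably handle multiple requests at once, you can use a ThreadPoolExecutor to run several queries concurrently.

    Because the work is mainly I/O-bound (the client spends most of its time waiting on the network, and Python releases the GIL during blocking I/O), threads should be enough.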
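As a minimal sketch of the thread-pool approach, the helper below runs one search call per query on a pool of worker threads. The names `run_concurrently`, `search_fn`, and `max_workers=8` are illustrative assumptions, not part of any Elasticsearch client; `search_fn` stands for whatever callable sends a single query with the client you use.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(search_fn, queries, max_workers=8):
    """Call search_fn(query) for every query on a thread pool.

    search_fn is a hypothetical placeholder for a function that sends
    one request to Elasticsearch and returns its response.
    Results come back in the same order as `queries`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises the first
        # exception raised by a worker when its result is consumed.
        return list(pool.map(search_fn, queries))


# Example wiring (assumption: the official `elasticsearch` package and an
# index called "my-index"; the exact search() arguments depend on the
# client version you have installed):
#
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch("http://localhost:9200")
#   responses = run_concurrently(
#       lambda q: es.search(index="my-index", body={"query": q}),
#       [{"match": {"title": "foo"}}, {"match": {"title": "bar"}}],
#   )
```

A reasonable way to pick `max_workers` is to start small and raise it until throughput stops improving or the cluster begins rejecting requests. Sharing one client object across threads is typically what the official client expects, but check the documentation for the client you actually use.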