elasticsearch magento2

Influence of _cl tables on product import & incremental indexing


I use SQL import functionality outside the Magento 2 bootstrap. It works well, but performance degrades over time: after 2M imported products, the daily import benchmark dropped by a factor of 3. After investigating, I came to the conclusion that the problem is in the _cl tables. Here is what I have in the _cl tables:

Table Record count
catalog_product_attribute_cl 135M
catalog_product_category_cl 66M
catalog_product_price_cl 75M
cataloginventory_stock_cl 62M
catalogrule_product_cl 154M
catalogsearch_fulltext_cl 171M
inventory_cl 4M

If I remove all data from these tables after every X imported products, would incremental indexing through bin/magento cron:run still work properly? Would the incremental indexer synchronize with Elasticsearch only the data that appears in the _cl tables, without touching data that is already synchronized?


Solution

  • Yes, it would. It would sync only the data that was recently added. In the end, due to the large number of products and the very heavy, inefficient SELECT queries in the Magento Index API, we decided to stop using the out-of-the-box index functionality and implemented our own custom methods for Elasticsearch outside the Magento API.
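
For anyone who wants to keep the built-in indexers and only trim the changelog tables, here is a minimal sketch of one way to generate the cleanup statements. It is not the code we ran. It assumes the standard mview_state table stores the last processed version_id for each view_id, and that each view_id is the changelog table name without the _cl suffix. The helper name build_prune_sql is made up for illustration. Check both assumptions against your own schema before running anything.

```python
# Changelog tables listed in the question.
CHANGELOG_TABLES = [
    "catalog_product_attribute_cl",
    "catalog_product_category_cl",
    "catalog_product_price_cl",
    "cataloginventory_stock_cl",
    "catalogrule_product_cl",
    "catalogsearch_fulltext_cl",
    "inventory_cl",
]


def build_prune_sql(changelog_table: str) -> str:
    """Return a DELETE that removes only rows below the stored version.

    Assumption: mview_state.view_id equals the table name minus "_cl",
    and mview_state.version_id is the last version the indexer processed.
    If no matching mview_state row exists, the subquery yields NULL and
    the DELETE removes nothing.
    """
    if not changelog_table.endswith("_cl"):
        raise ValueError(f"not a changelog table: {changelog_table}")
    view_id = changelog_table[: -len("_cl")]
    return (
        f"DELETE FROM `{changelog_table}` "
        f"WHERE version_id < ("
        f"SELECT version_id FROM mview_state WHERE view_id = '{view_id}')"
    )


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being executed.
    for table in CHANGELOG_TABLES:
        print(build_prune_sql(table) + ";")
```

Deleting by version_id instead of emptying the tables leaves rows the indexer has not picked up yet in place, so the next bin/magento cron:run can still process them. On tables of this size it is probably worth running the deletes in batches during a quiet period.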