sqlmysqlperformancequery-optimization

Create multi table index/composite key


I have a query which contains 3 tables. If the fetched data length is small (about 50,000) then it works very fast (nearly under a second), but when the data starts exceeding it becomes slower (9 seconds for <150,000).

I have Indexing enabled for all the columns (except text datatype) for all the tables along with composite keys (while researching for the cure), but could not overcome this issue. Even though, I have pagination enabled and fetching 50 records per page.

CREATE TABLE `users` {
  `id` BIGINT UNSIGNED AUTO_INCREMENT,
  `user_name` VARCHAR(20) DEFAULT NULL,
  ...
  PRIMARY KEY (`id`),
  KEY `user_name`(`user_name`),
  ...
};

CREATE TABLE `transactions` {
  `user_id` BIGINT UNSIGNED,
  `UniquePaymentRequestId` VARCHAR(100),
  `TransactionNumber` VARCHAR(100),
  `CustomerNumber` VARCHAR(100),
  `OrderId` VARCHAR(40), 
  `Amount` DECIMAL(12,6),
  `UserCharge` DECIMAL(12,6),
  `DepartmentCharge` DECIMAL(12,6),
  `ServiceCode` INT UNSINGED,
  `payment_date` date GENERATED ALWAYS AS (cast(`creation_time` as date)) VIRTUAL,
  `creation_time` DATETIME,
  ...

  KEY `UniquePaymentRequestId`(`UniquePaymentRequestId`),
  KEY `TransactionNumber`(`TransactionNumber`),
  KEY `CustomerNumber`(`CustomerNumber`),
  KEY `OrderId`(`OrderId`),
  KEY `Amount`(`Amount`),
  KEY `UserCharge`(`UserCharge`),
  KEY `ServiceCode`(`ServiceCode`),
  KEY `DepartmentCharge`(`DepartmentCharge`),
  KEY `creation_time`(`creation_time`),
  KEY `payment_date` (`payment_date`),
  KEY `transactions` (
    `UniquePaymentRequestId`, `TransactionNumber`, `CustomerNumber`, `OrderId`, 
    `Amount`, `UserCharge`, `DepartmentCharge`, `creation_time`
  )
};

CREATE TABLE `service_departments` {
  `id` BIGINT UNSIGNED AUTO_INCREMENT,
  `title` VARCHAR(255),
  `type` VARCHAR(3) DEFAULT 'non',
  ...
  PRIMARY KEY (`id`),
  UNIQUE `title`(`title`),
  KEY `type`(`type`),
  ...
};

CREATE TABLE `services` {
  `service_department_id` BIGINT UNSIGNED,
  `title` VARCHAR(255),
  ...
  KEY `service_department_id`(`service_department_id`),
  UNIQUE `title`(`title`),
  ...
};
SELECT 
  U.user_name, S.department, S.service_code, S.service_name, 
  T.UniquePaymentRequestId, T.TransactionNumber, T.CustomerNumber, T.OrderId, 
  T.Amount, T.UserCharge, T.DepartmentCharge, T.creation_time 
FROM transactions AS T 
INNER JOIN ( 
  SELECT user_id, user_name FROM users 
  WHERE user_type = 5 
) AS U ON U.user_id = T.user_id 
LEFT JOIN ( 
  SELECT SD.title AS department, S.service_code, S.title AS service_name FROM services AS S 
  INNER JOIN service_departments AS SD ON SD.type = "non" AND SD.id = S.service_department_id 
) AS S ON S.service_code = T.ServiceCode 
WHERE T.payment_date >= '2025-07-19' 
ORDER BY T.creation_time DESC 

EXPLAIN:

    id  select_type  table   type    possible_keys                       key                    key_len  ref                            rows    Extra                        
------  -----------  ------  ------  ----------------------------------  ---------------------  -------  -----------------------------  ------  -----------------------------
     1  SIMPLE       T       ALL     user_index,payment_date             (NULL)                 (NULL)   (NULL)                         206192  Using where; Using filesort  
     1  SIMPLE       users   eq_ref  PRIMARY,user_type                   PRIMARY                4        gramaone_production.T.user_id  1       Using where                  
     1  SIMPLE       SD      ref     PRIMARY,type                        type                   63       const                          8       Using where                  
     1  SIMPLE       S       ref     service_code,service_department_id  service_department_id  5        gramaone_production.SD.id      3       Using where                  

How do I resolve this issue?


Solution

  • A few observations.

    First, your transactions table is the big one. So we want to do something, anything, to use an index to hit that table.

    Second, your GENERATED column transactions.payment_date is unnecessary. Why? transaction_time >= '1942-12-07' gives precisely the same result as DATE(transaction_time) >= '1942-12-07'. This is important because you select on the index on your generated column and order by the index on your actual data column. That defeats the use of the index to skip a sort step.

    Third, "indexing enabled for all columns" is harmful to performance. Don't put indexes on columns, especially single columns, unless you know you need them to satisfy queries. They slow down inserts and updates.

    Fourth, you have a lot of nested SELECTs. Those often make things harder to read. (That's just a personal preference of mine, however, so take it or leave it.)

    So, what to do?

    This should help you access that table with an index scan.

    And drop that GENERATED column and all those extra indexes.