mysql, query-optimization, sql-delete

MySQL: optimize a DELETE SQL for a table without indexes


The Spring Batch metadata tables are not cleaned automatically, so I tried to clean them manually (note that other batch processes keep running normally during the cleanup).

When I tried to clean up the BATCH_JOB_EXECUTION_PARAMS table, I found that the table was locked.

The table structure is:

CREATE TABLE BATCH_JOB_EXECUTION_PARAMS  (
    JOB_EXECUTION_ID BIGINT NOT NULL ,
    PARAMETER_NAME VARCHAR(100) NOT NULL ,
    PARAMETER_TYPE VARCHAR(100) NOT NULL ,
    PARAMETER_VALUE VARCHAR(2500) ,
    IDENTIFYING CHAR(1) NOT NULL ,
    constraint JOB_EXEC_PARAMS_FK foreign key (JOB_EXECUTION_ID)
    references BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
) ENGINE=InnoDB;

The SQL statement is:

SELECT MAX(JOB_EXECUTION_ID) 
    FROM BATCH_JOB_EXECUTION 
    WHERE CREATE_TIME < <ANYTIME>;

DELETE FROM BATCH_JOB_EXECUTION_PARAMS 
    WHERE JOB_EXECUTION_ID <= <MAX_JOB_EXECUTION_ID>;

The Explain is:

           id: 1
  select_type: DELETE
        table: BATCH_JOB_EXECUTION_PARAMS
   partitions: NULL
         type: ALL
possible_keys: JOB_EXEC_PARAMS_FK
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 10062
     filtered: 100
        Extra: Using where

How can I optimize this SQL to avoid locking the table? I don't know how to prevent a table lock when the delete cannot use an index.

Best Regards...


Solution

  • You could try using this statement to do your deletion. (Note that MySQL
    does not allow LIMIT in a multiple-table DELETE, so the expired execution
    ids are gathered in a derived table instead.)

    DELETE FROM BATCH_JOB_EXECUTION_PARAMS
     WHERE JOB_EXECUTION_ID IN (
           SELECT JOB_EXECUTION_ID
             FROM (SELECT DISTINCT P.JOB_EXECUTION_ID
                     FROM BATCH_JOB_EXECUTION_PARAMS P
                     JOIN BATCH_JOB_EXECUTION E
                       ON E.JOB_EXECUTION_ID = P.JOB_EXECUTION_ID
                    WHERE E.CREATE_TIME < <ANYTIME>
                    LIMIT 100) AS expired);
    

    This deletes the parameter rows for up to 100 expired job executions per pass. Then, you repeat the statement until it deletes no rows.

    Why might this work? It limits how much work each delete transaction does, which in turn limits how long locks are held.

    According to the Spring Batch documentation I found, the BATCH_JOB_EXECUTION.CREATE_TIME and BATCH_JOB_EXECUTION_PARAMS.JOB_EXECUTION_ID columns already have indexes. That's good, because those are the indexes that help this operation. But it doesn't agree with the table definition you showed us.

    If your <ANYTIME> is long enough ago that nothing more is happening with the relevant rows of the database, you could try using a so-called dirty read to do the deletion. Precede your statements with

    SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    

    to get this (somewhat risky) optimization; it saves database overhead by ignoring uncommitted changes from other sessions. (You can't do this with other people's money or other transaction-sensitive data.)
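The repeat-until-no-rows loop can also be driven from application code. Below is a minimal sketch using Python's built-in sqlite3 module so the example is self-contained; against MySQL you would use a real driver and the DELETE statement from the answer. It batches with keyset pagination: fetch the next chunk of expired execution ids, delete their parameter rows, commit, and advance the cursor. The table names follow the question's schema; the cutoff date, batch size, and seed data are arbitrary illustration values.

```python
import sqlite3

BATCH_SIZE = 100            # arbitrary batch size for the demo
CUTOFF = "2021-01-01"       # stands in for <ANYTIME>

# In-memory stand-in for the Spring Batch metadata tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BATCH_JOB_EXECUTION (
        JOB_EXECUTION_ID INTEGER PRIMARY KEY,
        CREATE_TIME      TEXT NOT NULL
    );
    CREATE TABLE BATCH_JOB_EXECUTION_PARAMS (
        JOB_EXECUTION_ID INTEGER NOT NULL
            REFERENCES BATCH_JOB_EXECUTION(JOB_EXECUTION_ID),
        PARAMETER_NAME   TEXT NOT NULL,
        PARAMETER_VALUE  TEXT
    );
""")

# Seed 500 expired executions and 50 current ones, one parameter row each.
for i in range(550):
    created = "2020-01-01" if i < 500 else "2099-01-01"
    conn.execute("INSERT INTO BATCH_JOB_EXECUTION VALUES (?, ?)", (i, created))
    conn.execute("INSERT INTO BATCH_JOB_EXECUTION_PARAMS VALUES (?, 'run.id', ?)",
                 (i, str(i)))
conn.commit()

deleted_total = 0
last_id = -1                # keyset cursor: highest execution id handled so far
while True:
    ids = [row[0] for row in conn.execute(
        """SELECT JOB_EXECUTION_ID
             FROM BATCH_JOB_EXECUTION
            WHERE CREATE_TIME < ? AND JOB_EXECUTION_ID > ?
            ORDER BY JOB_EXECUTION_ID
            LIMIT ?""",
        (CUTOFF, last_id, BATCH_SIZE))]
    if not ids:
        break               # no expired executions left
    placeholders = ",".join("?" * len(ids))
    cur = conn.execute(
        f"DELETE FROM BATCH_JOB_EXECUTION_PARAMS "
        f"WHERE JOB_EXECUTION_ID IN ({placeholders})", ids)
    conn.commit()           # short transaction per batch => short lock times
    deleted_total += cur.rowcount
    last_id = ids[-1]
```

Each pass holds locks only for one small batch, and the keyset cursor (`JOB_EXECUTION_ID > last_id`) avoids rescanning executions whose parameter rows were already removed.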