sqldb2ibm-ifs

Assign multiple values to multiple variables with unique requirements with one SQL Statement


I am working within DB2 and the IBM i. I am attempting to pull strings out of log files in our IFS using the QSYS2.IFS_READ table function. These log files have 5 pieces of data that I want and each day there are roughly 25 log files that generate. So, I have a cursor that iterates through each log file and then runs 6 queries against each log file pulling the data that I want based off of the record COUNT minus the static position of each piece of data.

This is the quickest way I could think to do this, but it runs for 23 minutes, which isn't quick.

I am imagining a CASE statement may work well?

Either way, my SQL is below:

BEGIN

DECLARE @path_name VARCHAR(28);
DECLARE @job VARCHAR(256);
DECLARE @run_date TIMESTAMP;
DECLARE @warnings VARCHAR(4);
DECLARE @errors VARCHAR(4);
DECLARE @job_complete VARCHAR(20);
DECLARE @elapsed VARCHAR(8);
DECLARE @rec_count INTEGER;

FOR @V1 AS @C1 CURSOR FOR
    SELECT PATH_NAME, CREATE_TIMESTAMP FROM TABLE(QSYS2.IFS_OBJECT_STATISTICS(
        START_PATH_NAME => '/<PATH_NAME>',
        SUBTREE_DIRECTORIES => 'YES'))
    WHERE REGEXP_LIKE(PATH_NAME, '\d{8}.LOG')
    AND CREATE_TIMESTAMP >= CURRENT_TIMESTAMP - 1 DAY
    ORDER BY PATH_NAME
DO
SET @rec_count = (SELECT COUNT(*) FROM TABLE(QSYS2.IFS_READ(
                PATH_NAME => @V1.PATH_NAME)));

SET @vault_job = (SELECT REPLACE((SUBSTR(LINE, 46, 10)), '/', '') FROM TABLE(QSYS2.IFS_READ(
                    PATH_NAME => @V1.PATH_NAME))
                    WHERE LINE_NUMBER = 5);

SET @errors = (SELECT SUBSTR(LINE, 78, 4) FROM TABLE(QSYS2.IFS_READ(
                PATH_NAME => @V1.PATH_NAME))
                WHERE LINE_NUMBER = (@rec_count - 18));

SET @warnings = (SELECT SUBSTR(LINE, 78, 4) FROM TABLE(QSYS2.IFS_READ(
                PATH_NAME => @V1.PATH_NAME))
                WHERE LINE_NUMBER = (@rec_count - 17));

SET @job_complete = (SELECT SUBSTR(LINE, 53, 20) FROM TABLE(QSYS2.IFS_READ(
                    PATH_NAME => @V1.PATH_NAME))
                    WHERE LINE_NUMBER = (@rec_count - 2));

SET @elapsed = (SELECT SUBSTR(LINE, 49, 8) FROM TABLE(QSYS2.IFS_READ(
                PATH_NAME => @V1.PATH_NAME))
                WHERE LINE_NUMBER = (@rec_count - 1));                

CALL SYSTOOLS.LPRINTF('----------------------------');
CALL SYSTOOLS.LPRINTF('Job: ' || @job);
CALL SYSTOOLS.LPRINTF('Path name: ' || @V1.PATH_NAME);
CALL SYSTOOLS.LPRINTF('Create Time: ' || @V1.CREATE_TIMESTAMP);
CALL SYSTOOLS.LPRINTF('errors: ' || @errors);
CALL SYSTOOLS.LPRINTF('warnings: ' || @warnings);
CALL SYSTOOLS.LPRINTF('job complete at ' || @job_complete);
CALL SYSTOOLS.LPRINTF('elapsed: ' || @elapsed);
END FOR;

END


Solution

  • Read File Once into a CTE:

    WITH FileContent AS (
        SELECT LINE, LINE_NUMBER
        FROM TABLE(QSYS2.IFS_READ(PATH_NAME => @V1.PATH_NAME))
    )
    SELECT ...
    FROM FileContent
    WHERE LINE_NUMBER IN (5, @rec_count - 18, @rec_count - 17);
    

    Process Data in a Single Query: Use CASE to extract all needed data from FileContent in one pass. This minimizes redundant reads.

    --------------------------- Edited -----------------------

    To resolve the issue of using CASE in combination with variables in DB2 SQL, it’s important to remember that CASE statements cannot directly set variables. Instead, you can use a query to assign values to variables via a SELECT INTO construct.

    Here’s a refined approach to process your data more efficiently while correctly handling the use of variables:

    BEGIN
    -- Declare variables
    DECLARE @path_name VARCHAR(28);
    DECLARE @job VARCHAR(256);
    DECLARE @run_date TIMESTAMP;
    DECLARE @warnings VARCHAR(4);
    DECLARE @errors VARCHAR(4);
    DECLARE @job_complete VARCHAR(20);
    DECLARE @elapsed VARCHAR(8);
    DECLARE @rec_count INTEGER;
    
    -- Cursor to iterate through log files
    FOR @V1 AS @C1 CURSOR FOR
        SELECT PATH_NAME, CREATE_TIMESTAMP
        FROM TABLE(QSYS2.IFS_OBJECT_STATISTICS(
            START_PATH => '/path/to/logs', 
            INCLUDE_FILES => 'YES'
        ))
    DO
        -- Get record count for the current file
        SELECT MAX(LINE_NUMBER) 
        INTO @rec_count
        FROM TABLE(QSYS2.IFS_READ(PATH_NAME => @V1.PATH_NAME));
    
        -- Use a single query to extract the desired data
        SELECT 
            MAX(CASE WHEN LINE_NUMBER = 5 THEN LINE END),
            MAX(CASE WHEN LINE_NUMBER = @rec_count - 18 THEN LINE END),
            MAX(CASE WHEN LINE_NUMBER = @rec_count - 17 THEN LINE END),
            MAX(CASE WHEN LINE_NUMBER = @rec_count - 16 THEN LINE END),
            MAX(CASE WHEN LINE_NUMBER = @rec_count - 15 THEN LINE END)
        INTO @job, @run_date, @warnings, @errors, @job_complete
        FROM TABLE(QSYS2.IFS_READ(PATH_NAME => @V1.PATH_NAME));
        
        -- Process or log variables as needed
        -- Example: INSERT INTO a log table
        INSERT INTO LOG_TABLE (JOB, RUN_DATE, WARNINGS, ERRORS, JOB_COMPLETE)
        VALUES (@job, @run_date, @warnings, @errors, @job_complete);
    END FOR;
    

    END