mysqlsqlproceduresql-function

MySQL Cursor jumps out before it reaches the end


This is part of my mysql procedure:

declare myCursor cursor for select body,id from posts where body like '%Possible Duplicate%' limit 10 offset 11;
DECLARE CONTINUE HANDLER 
    FOR NOT FOUND SET finished = 1;

open myCursor;
myloop: loop
    fetch myCursor into mybody, myid;
    if finished = 1 then leave myloop;
    end if;
    set Dupdup = 1;
    set mytitle = regexp_substr(mybody,'.+?(?=</a>)');
    select id into Dupdup from posts where title like mytitle;

    update posts set dupicateId = Dupdup, isDuplicate = 1 where id = myid;
    
end loop myloop;
close myCursor;

I have very weird problem. Whenever select id into Dupdup from posts where title like mytitle; returns no id then it exits the loop on the next iteration. I suppose DECLARE CONTINUE HANDLER FOR NOT FOUND SET finished = 1; gets executed when select id into Dupdup from posts where title like mytitle; returns no id. I can't think of any other reason.

Is there anyway to solve this problem?


Solution

  • I had a similar issue, and it was due to the SELECT ... INTO query inside the loop finding no records for a certain iteration. For some reason, MySQL will break out of the loop when this occurs.

    Try this: Determine the last row that the cursor is on before it stops looping. You can add this query as the first line inside your loop:

    SELECT mybody, myid;
    

    And run it (you may want to comment out the UPDATE query). The last result printed will be the problematic row. Now, take the value from that row and try running your SELECT ... INTO query on its own with that value inserted. Remove the 'INTO' clause:

    SELECT id FROM posts WHERE title LIKE 'problematic value';
    

    I'd bet you get no results after running it.

    The solution? Use IFNULL to ensure that at least 1 record is always returned from the SELECT ... INTO. And use an IF statement to only run your UPDATE if the value returned is considered valid. Take a look:

    declare myCursor cursor for select body,id from posts where body like '%Possible Duplicate%' limit 10 offset 11;
    DECLARE CONTINUE HANDLER 
    FOR NOT FOUND SET finished = 1;
    
    open myCursor;
    myloop: loop
        fetch myCursor into mybody, myid;
        if finished = 1 then leave myloop;
        end if;
        set Dupdup = 1;
        set mytitle = regexp_substr(mybody,'.+?(?=</a>)');
    
        /* IMPORTANT PART */
        SELECT IFNULL((select id from posts where title like mytitle), -1)
        into Dupdup;
    
        IF Dupdup <> -1 THEN
            update posts set dupicateId = Dupdup, isDuplicate = 1 where id = myid;
        END IF;
        /* --- */
        
    end loop myloop;
    close myCursor;
    

    But maybe you have multiple columns that you want to select into multiple variables, and then all the IFNULLs and subqueries gets slow and messy. Then you could implement a query that first counts the results, and only runs the SELECT ... INTO if the count is not 0:

    declare myCursor cursor for select body,id from posts where body like '%Possible Duplicate%' limit 10 offset 11;
    DECLARE CONTINUE HANDLER 
    FOR NOT FOUND SET finished = 1;
    
    open myCursor;
    myloop: loop
        fetch myCursor into mybody, myid;
        if finished = 1 then leave myloop;
        end if;
        set Dupdup = 1;
        set mytitle = regexp_substr(mybody,'.+?(?=</a>)');
    
        /* IMPORTANT PART */
        select COUNT(*) from posts where title like mytitle
        into recordsfound;
    
        IF recordsfound <> 0 THEN
            select id, col1, col2 from posts where title like mytitle
            INTO Dupdup, col1value, col2value;
    
            update posts
            set
              dupicateId = Dupdup,
              col1 = col1value,
              col2 = col2value,
              isDuplicate = 1
            where id = myid;
        END IF;
        /* --- */
        
    end loop myloop;
    close myCursor;
    

    Disclaimer: I have not tested this code in OP's environment.