sqlsql-serversql-update

SQL Update affecting more rows than expected from a query


I worked on a SELECT query that returned 2 rows when I was testing on my db :

SELECT c.CandidateId
FROM Candidate c
INNER JOIN (
    SELECT CandidateId, MAX(UpdateDate) AS MaxDateStatut
    FROM StatusHistory
    GROUP BY CandidateId
) hs ON c.CandidateId = hs.CandidateId
INNER JOIN StatusHistory hs2 ON hs.CandidateId = hs2.CandidateId AND hs.MaxDateStatut = hs2.UpdateDate
INNER JOIN InstitutionCandidate ic ON ic.CandidateId = hs.CandidateId
WHERE hs2.StatusId = 2 AND ic.IsActive = 1

For the purpose of updating all of them with an UPDATE statement such as :

UPDATE InstitutionCandidate
SET IsActive = 0
WHERE CandidateId IN (
    SELECT c.CandidateId
    FROM Candidate c
    INNER JOIN (
        SELECT CandidateId, MAX(UpdateDate) AS MaxDateStatut
        FROM StatusHistory
        GROUP BY CandidateId
    ) hs ON c.CandidateId = hs.CandidateId
    INNER JOIN StatusHistory hs2 ON hs.CandidateId = hs2.CandidateId AND hs.MaxDateStatut = hs2.UpdateDate
    INNER JOIN InstitutionCandidate ic ON ic.CandidateId = hs.CandidateId
    WHERE hs2.StatusId = 2 AND ic.IsActive = 1
);

It said "20 rows affected". What could explain this? This is a local database so nobody else is using it. Did I somehow change the WHERE condition that I'm not realizing?

EDIT:

I'm testing this by Inserting a row such as this :

INSERT INTO InstitutionCandidate
VALUES (1, GETDATE(), GETDATE(), 74, 1)

after executing this, and then executing my UPDATE, I get a "21 rows affected" (then "0 rows affected" if I do it again), and then I execute the INSERT INTO again, then the UPDATE, and then I get "22 rows affected" and so on.


Solution

  • You can modify your initial select statement by just changing the SELECT clause to UPDATE ic SET ... to update the already-referenced InstitutionCandidate rows (aliased as ic). Something like:

    UPDATE ic
    SET IsActive = 0
    FROM Candidate c
    INNER JOIN (
        SELECT CandidateId, MAX(UpdateDate) AS MaxDateStatut
        FROM StatusHistory
        GROUP BY CandidateId
    ) hs ON c.CandidateId = hs.CandidateId
    INNER JOIN StatusHistory hs2 ON hs.CandidateId = hs2.CandidateId AND hs.MaxDateStatut = hs2.UpdateDate
    INNER JOIN InstitutionCandidate ic ON ic.CandidateId = hs.CandidateId
    WHERE hs2.StatusId = 2 AND ic.IsActive = 1
    

    Also, as an alternative to using the GROUP BY subquery and following JOIN to identify the latest status, you can use the CROSS APPLY (SELECT TOP 1 ... ORDER BY ...) pattern to get the latest status for each candidate.

    UPDATE ic
    SET IsActive = 0
    FROM Candidate c
    CROSS APPLY (
        SELECT TOP 1 sh.*
        FROM StatusHistory sh
        WHERE sh.CandidateId = c.CandidateId
        ORDER BY sh.UpdateDate DESC
    ) sh1
    INNER JOIN InstitutionCandidate ic ON ic.CandidateId = hs.CandidateId
    WHERE sh1.StatusId = 2 AND ic.IsActive = 1
    

    (Note that the sh1.StatusId = 2 condition needs to be outside of the CROSS APPLY because we want the latest status, regardless of value, and once we have it, filter the result set by StatusId = 2.)