sqlperformancepostgresqlcasesql-update

CASE vs Multiple UPDATE queries for large data sets - Performance


For performance what option would be better for large data sets that are to be updated?

Using a CASE statement or Individual update queries?

CASE Example:

UPDATE tbl_name SET field_name = 
CASE
    WHEN condition_1 THEN 'Blah'
    WHEN condition_2 THEN 'Foo'
    WHEN condition_x THEN 123
    ELSE 'bar'
END AS value

Individual Query Example:

UPDATE tbl_name SET field_name = 'Blah' WHERE field_name = condition_1
UPDATE tbl_name SET field_name = 'Foo' WHERE field_name = condition_2
UPDATE tbl_name SET field_name = 123 WHERE field_name = condition_x
UPDATE tbl_name SET field_name = 'bar' WHERE field_name = condition_y

NOTE: About 300,000 records are going to be updated and the CASE statement would have about 10,000 WHEN conditions. If using the individual queries it's about 10,000 as well


Solution

  • The CASE version.

    This is because there is a good chance you are altering the same row more than once with the individual statements. If row 10 has both condition_1 and condition_y then it will need to get read and altered twice. If you have a clustered index this means two clustered index updates on top of whatever the other field(s) that were modified were.

    If you can do it as a single statement, each row will be read only once and it should run much quicker.

    I changed a similar process about a year ago that used dozens of UPDATE statements in sequence to use a since UPDATE with CASE and processing time dropped about 80%.