sqlpostgresqlsql-update

Why does UPDATE FROM silently fail when referring to fieds with alias?


Yesterday I discovered something strange about SQL, or at least PostreSQL. Take a look below and explain why first query does nothing and the second one works just right:

-- this silently does nothing
update bodycontent 
    set body = replace(body, '~' || u.oldusername, '~' || u.newusername)
    from usermigration u;

-- this works as expected
update bodycontent 
    set body = replace(body, '~' || oldusername, '~' || newusername)
    from usermigration u;

Update: I think that everyone is missing the point on this question, the Cartesian product was an original intent: there are going to be N x M updates and this is by design.

I need to replace all the pairs of usernames existing in the migration table in each row from bodycontent.

And, I repeat, the second version works as expected, but the first one does no update. All I wanted to know was why.

| usermigration table    |
--------------------------
oldusername | newusersname
--------------------------
johndoe     | johnd
john.smith  | johnsmith

Is this a bug in PostgreSQL?


Solution

  • You are missing a WHERE clause (which cannot be written as direct JOIN condition). The additional table usermigration has to be joined to the table bodycontent to be updated in some way, or every row of bodycontent has as many update candidates as there are rows in usermigration - the Cartesian product between the two tables.

    There is no way to tell which one will be applied and persist. Both statements are wrong in this regard. With 1000 rows in usermigration and 1000 in bodycontent we have 1 000 000 update candidates before 1000 can be picked.

    Additional tables in the FROM clause hardly ever make sense without a WHERE clause connecting them to the main table.

    The manual:

    When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_item list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.

    The FROM clause in the UPDATE statement is a PostgreSQL extension to the SQL standard. Other DBMS use different syntax, for instance explicit joins to the table to be updated (in tSQL) that do not work for PostgreSQL.

    Answer to additional question in comment

    This query works, mostly1:

    UPDATE bodycontent b
    SET    body = replace(b.body, u.oldusername, u.newusername)
    FROM   usermigration u
    WHERE  b.body LIKE ('%' || u.oldusername || '%');
    

    1 The outcome is still ambiguous. Multiple matches can be found. It is uncertain which will be applied. The problem is that your requirements are inherently ambiguous. There can be multiple (overlapping) usernames that match and the order in which updates are applied is relevant (but undefined). The UPDATE statement perfectly reflects your flawed requirements.