sqlsql-serversql-updatemultiple-tables

Update values from a column of one table if they exist in another table's column if these values are unique


I have a column filled with values that need updating because they are too ambiguous and I want to update these values with values from another table columns.

Player Table:

ID Email FirstName
1 Bob.Lance@gmail.com Bob
2 Laura.Davis@gmail.com Laura
3 Josh.Stevens@gmail.com Josh
4 Alex.Wild@gmail.com Alex
5 Bob.Frank@gmail.com Bob
6 Alex.Summers@gmail.com Alex

Sports Table:

Sport Players
Golf Laura
Basketball Josh
Baseball Alex
Football Bob
Volleyball Laura
Swimming Alex
Driving Bob
Boxing Bob

In this example I am trying to replace values in the Players column with the player's Email because some players have the same first name. I don't believe that there is a way to find out which Bob and Alex are in the Sports Table so I want to do the minimum and change the column values for the names do not have duplicate first names. The updated Sports Table should look like below.

Updated Sports Table:

Sport Player
Golf Laura.Davis@gmail.com
Basketball Josh.Stevens@gmail.com
Baseball Alex
Football Bob
Volleyball Laura.Davis@gmail.com
Swimming Alex
Driving Bob
Boxing Bob

My first thought was to find all column values in the Player table that did not have any First Names that showed up more than once and I managed to get that far with the following query below.

SELECT Email, FirstName
FROM PlayerTable
GROUP BY Email, FirstName
HAVING COUNT(P.FirstName) = 1

However, I haven't been able to find a way to use this to update the Sports Table correctly. I tried the following lines of code, but that just outputted an incorrect table.

UPDATE SportsTable
SET Player = (
    SELECT P.Email, P.FirstName
    FROM PlayerTable P
    WHERE P.FirstName = Player
    GROUP BY P.Email, P.FirstName
    HAVING COUNT(P.FirstName) = 1
)

Solution

  • You are basically on the right track. This query will return all First Names and their email that appear with one email only:

    SELECT 
      FirstName,
      MIN(Email) AS Email
    FROM Player
    GROUP BY 
      FirstName
    HAVING 
      COUNT(DISTINCT Email) = 1;
    

    So you need to check the distinct count of emails per FirstName rather than count the FirstName itself.

    You can use above query in a CTE or in a pure subquery and join it to the Sports table. I think both ways are fine.

    Here the option with a CTE:

    WITH uniqueMails AS
    (SELECT 
      FirstName,
      MIN(Email) AS Email
    FROM Player
    GROUP BY 
      FirstName
    HAVING 
      COUNT(DISTINCT Email) = 1)
    UPDATE s
    SET s.Players = u.Email
    FROM Sport s
    JOIN uniqueMails u
    ON s.Players = u.FirstName;
    

    Verify it works as expected on this db<>fiddle with your sample data.

    And here the option without CTE:

    UPDATE s
    SET s.Players = uniqueMails.Email
    FROM Sport s
    JOIN 
    (SELECT 
      FirstName,
      MIN(Email) AS Email
    FROM Player
    GROUP BY 
      FirstName
    HAVING 
      COUNT(DISTINCT Email) = 1) uniqueMails
    ON s.Players = uniqueMails.FirstName;
    

    db<>fiddle for this option.