I have a column filled with values that need updating because they are too ambiguous and I want to update these values with values from another table columns.
Player Table:
ID | FirstName | |
---|---|---|
1 | Bob.Lance@gmail.com | Bob |
2 | Laura.Davis@gmail.com | Laura |
3 | Josh.Stevens@gmail.com | Josh |
4 | Alex.Wild@gmail.com | Alex |
5 | Bob.Frank@gmail.com | Bob |
6 | Alex.Summers@gmail.com | Alex |
Sports Table:
Sport | Players |
---|---|
Golf | Laura |
Basketball | Josh |
Baseball | Alex |
Football | Bob |
Volleyball | Laura |
Swimming | Alex |
Driving | Bob |
Boxing | Bob |
In this example I am trying to replace values in the Players column with the player's Email because some players have the same first name. I don't believe that there is a way to find out which Bob and Alex are in the Sports Table so I want to do the minimum and change the column values for the names do not have duplicate first names. The updated Sports Table should look like below.
Updated Sports Table:
Sport | Player |
---|---|
Golf | Laura.Davis@gmail.com |
Basketball | Josh.Stevens@gmail.com |
Baseball | Alex |
Football | Bob |
Volleyball | Laura.Davis@gmail.com |
Swimming | Alex |
Driving | Bob |
Boxing | Bob |
My first thought was to find all column values in the Player table that did not have any First Names that showed up more than once and I managed to get that far with the following query below.
SELECT Email, FirstName
FROM PlayerTable
GROUP BY Email, FirstName
HAVING COUNT(P.FirstName) = 1
However, I haven't been able to find a way to use this to update the Sports Table correctly. I tried the following lines of code, but that just outputted an incorrect table.
UPDATE SportsTable
SET Player = (
SELECT P.Email, P.FirstName
FROM PlayerTable P
WHERE P.FirstName = Player
GROUP BY P.Email, P.FirstName
HAVING COUNT(P.FirstName) = 1
)
You are basically on the right track. This query will return all First Names and their email that appear with one email only:
SELECT
FirstName,
MIN(Email) AS Email
FROM Player
GROUP BY
FirstName
HAVING
COUNT(DISTINCT Email) = 1;
So you need to check the distinct count of emails per FirstName rather than count the FirstName itself.
You can use above query in a CTE or in a pure subquery and join it to the Sports table. I think both ways are fine.
Here the option with a CTE:
WITH uniqueMails AS
(SELECT
FirstName,
MIN(Email) AS Email
FROM Player
GROUP BY
FirstName
HAVING
COUNT(DISTINCT Email) = 1)
UPDATE s
SET s.Players = u.Email
FROM Sport s
JOIN uniqueMails u
ON s.Players = u.FirstName;
Verify it works as expected on this db<>fiddle with your sample data.
And here the option without CTE:
UPDATE s
SET s.Players = uniqueMails.Email
FROM Sport s
JOIN
(SELECT
FirstName,
MIN(Email) AS Email
FROM Player
GROUP BY
FirstName
HAVING
COUNT(DISTINCT Email) = 1) uniqueMails
ON s.Players = uniqueMails.FirstName;
db<>fiddle for this option.