I have a procedure where I am trying to fetch the only on row of data per foreign key (FK) per (PK) in a table (turning a one-to-many relationship into a one-to-one table).
I have some tables that looks like below. NOTE: my tables do not actually look like this, these are just examples.
COMPANY
CompanyNo_PK | CompanyName |
---|---|
1 | Foo |
2 | Bar |
3 | FooBar |
JOB ROLE
JobRoleNo_PK | RoleName |
---|---|
1 | Manager |
2 | IT |
3 | Clerk |
EMPLOYEE
EmployeeNo_PK | EmployeeName | CompanyNo_FK | JobRoleNo_FK | DateHired |
---|---|---|---|---|
1 | Steve | 1 | 1 | 1/3/2018 |
2 | Bob | 2 | 1 | 1/9/2021 |
3 | Daisy | 3 | 1 | 3/7/2020 |
4 | John | 1 | 2 | 2/9/2019 |
5 | Sam | 1 | 3 | 5/17/2022 |
6 | Kathy | 1 | 1 | 6/25/2022 |
7 | Cait | 3 | 3 | 9/6/2020 |
8 | Adam | 3 | 1 | 8/3/2021 |
9 | Mario | 2 | 2 | 4/30/2020 |
My procedure should output a table using the tables above to only see the MOST CURRENT Manager for each Company
.
So the output should look something like this:
CURRENT COMPANY MANAGERS
CompanyName | EmployeeName | DateHired |
---|---|---|
Foo | Kathy | 6/25/2022 |
Bar | Bob | 1/9/2021 |
FooBar | Adam | 8/3/2021 |
I tried the following query below, but I did not get the result I wanted.
CREATE PROCEDURE [Current Company Managers]
AS
SELECT DISTINCT
c.[CompanyName], e.[EmployeeName], e.[DateHired]
FROM
[Company] c
LEFT JOIN
[Employee] e ON e.CompanyNo_FK = c.CompanyNoPK
AND JobRoleNo_FK = 1
AND (NOT EXISTS (SELECT 1
FROM [Employee] ee
WHERE c.CompanyNo_PK = ee.CompanyNo_FK
AND ee.DateHired > e.DateHired)
OR NOT EXISTS (SELECT 1
FROM [Employee] ee
WHERE c.CompanyNo_PK = ee.CompanyNo_FK
AND ee.DateHired < e.DateHired))
I included the OR NOT EXISTS
operation since there were cases where some current managers were not being listed in the view
when they should be or the managers being listed were incorrect (mainly the older entries).
However, even with that, the results would returns the wrong entries:
CompanyName | EmployeeName | DateHired |
---|---|---|
Foo | Kathy | 6/25/2022 |
Bar | Bob | 1/9/2021 |
FooBar | Daisy | 3/7/2020 |
I have multiple entries per company:
CompanyName | EmployeeName | DateHired |
---|---|---|
Foo | Kathy | 6/25/2022 |
Bar | Bob | 1/9/2021 |
FooBar | Adam | 8/3/2021 |
FooBar | Daisy | 3/7/2020, |
but I'm also missing some entries:
CompanyName | EmployeeName | DateHired |
---|---|---|
Bar | Bob | 1/9/2021 |
FooBar | Adam | 8/3/2021 |
or some combination of the three. So explanation for why this is happening will be much appreciated.
You want exactly one employee per company, and only the most recent manager hired. You can figure out the latter two using a TOP (1)
inside a correlated APPLY
.
SELECT c.CompanyName, e.* FROM Company AS c
OUTER APPLY
(SELECT TOP (1) EmployeeName, DateHired
FROM Employee AS e
INNER JOIN JobRole AS j
ON e.JobRoleNo_FK = j.JobRoleNo_PK
WHERE j.RoleName = N'Manager'
AND e.CompanyNo_FK = c.CompanyNo_PK
ORDER BY DateHired DESC) AS e;
Output:
CompanyName | EmployeeName | DateHired |
---|---|---|
Foo | Kathy | 2022-06-25 |
Bar | Bob | 2021-01-09 |
FooBar | Adam | 2021-08-03 |
This will include companies that don't have any managers (if you want to filter those out, use CROSS APPLY
instead of OUTER APPLY
).
If you know the job role number already, or have already passed it in from somewhere, you can simplify a little:
SELECT c.CompanyName, e.* FROM Company AS c
OUTER APPLY (SELECT TOP (1) EmployeeName, DateHired
FROM Employee AS e
WHERE e.JobRoleNo_FK = 1
AND e.CompanyNo_FK = c.CompanyNo_PK
ORDER BY DateHired DESC) AS e;
You can use a CTE approach too, I just find it a lot more complicated to write:
WITH cte AS
(
SELECT c.CompanyName, e.EmployeeName, e.DateHired,
LatestRow = ROW_NUMBER() OVER
(PARTITION BY c.CompanyNo_PK
ORDER BY e.DateHired DESC)
FROM Company AS c
LEFT OUTER JOIN Employee AS e
ON c.CompanyNo_PK = e.CompanyNo_FK
AND e.JobRoleNo_FK = 1 -- or @JobRoleNo param
)
SELECT CompanyName, EmployeeName, DateHired
FROM cte
WHERE LatestRow = 1;