sqlssmsdense-rank

DENSE_RANK() when ORDER BY column values are the same


I'm trying to rank a list of payments using DENSE_RANK() function and ordering the results by two data points [PAYROLL DATE] and [PROCESS DATE]. If the [PAYROLL DATE] is null, then the [PROCESS DATE] should be used. If there is more than one row with the same [PAYROLL DATE] they are ranked the same. However, if a [PROCESS DATE] row has the same value as [PAYROLL DATE] they are ranked separately.

I got some help earlier on stackoverflow, but as I was playing around with the data, I am not getting the desired results unless the duplicate [PROCESS DATE] is at the "end" of the data set.

Here is a snippet of what my query says.

DENSE_RANK() OVER (PARTITION BY [Plan_ID],ee_id,loan_number ORDER BY CASE WHEN payroll_date IS NULL THEN 1 ELSE 0 END, payroll_date, process_date ASC)
[Plan ID] [EE ID] [Loan Num] [PAYROLL DATE] [PROCESS DATE] (Desired) RANK (Actual) RANK
aaa 1234 1 11/26/2021 NULL 1 1
aaa 1234 1 NULL 11/26/2021 2 3
aaa 1234 1 02/18/2022 NULL 3 2
aaa 1234 1 02/18/2022 NULL 3 2
aaa 1234 1 NULL 02/18/2022 4 4

In the above example, I want the ranking to use the payroll_date and process_date. What I am seeing is the payroll_date values are being ranked first and then process_date. But I want both payroll_date and process_date to be ordered by date, not ranking all the payroll_date values first and then process_date.

create table table1 (
  plan_id varchar(10), 
  ee_id integer, 
  loan_number integer, 
  payroll_date date, 
  process_date date);

insert into table1 values 
('aaa', 1234, 1, '2021-11-26', null), 
('aaa', 1234, 1, null, '2021-11-26'), 
('aaa', 1234, 1, '2022-02-18', null), 
('aaa', 1234, 1, '2022-02-18', null), 
('aaa', 1234, 1, null, '2022-02-18'); 

SELECT 
*
,PayNum = 
DENSE_RANK() OVER (PARTITION BY [Plan_ID],ee_id,loan_number ORDER BY CASE WHEN payroll_date IS NULL THEN 1 ELSE 0 END, payroll_date, process_date ASC)
FROM table1
ORDER BY ISNULL(payroll_date,process_date)

I've tried different variations of DENSE_RANK() with different ORDER BY statements, CASE statements to no avail. I've also tried RANK() and ROW_NUMBER() and nothing is truly giving me what I'm looking for in the results.


Solution

  • You need to reverse the order by in the partitiion and "join" the columns

    SELECT 
    *
    ,PayNum = 
    DENSE_RANK() OVER (PARTITION BY [Plan_ID],ee_id,loan_number 
      ORDER BY  COALESCE(payroll_date, process_date) , CASE WHEN payroll_date IS NULL THEN 1 ELSE 0 END)
    FROM table1
    ORDER BY ISNULL(payroll_date,process_date)
    
    plan_id ee_id loan_number payroll_date process_date PayNum
    aaa 1234 1 2021-11-26 null 1
    aaa 1234 1 null 2021-11-26 2
    aaa 1234 1 2022-02-18 null 3
    aaa 1234 1 2022-02-18 null 3
    aaa 1234 1 null 2022-02-18 4

    fiddle