sqlssmsdense-rank

DENSE_RANK() with two order by columns with the same value


I'm trying to rank a list of payments using DENSE_RANK() function and ordering the results by two data points [PAYROLL DATE] and [PROCESS DATE]. However, when the two columns in the ORDER BY are the same, the DENSE_RANK() is treating them as the same rank.

Here is a snippet of what my query says.

DENSE_RANK() OVER (PARTITION BY [Plan ID], [EE ID], [Loan Number] 
                   ORDER BY ISNULL([PAYROLL DATE], [PROCESS DATE]) ASC)
[Plan ID] [EE ID] [Loan Num] [PAYROLL DATE] [PROCESS DATE] (Desired) RANK (Actual) RANK
ABC123 1234 1 11/26/2021 NULL 1 1
ABC123 1234 1 12/23/2021 NULL 2 2
ABC123 1234 1 02/18/2022 NULL 3 3
ABC123 1234 1 02/18/2022 NULL 3 3
ABC123 1234 1 NULL 02/18/2022 4 3

In the above example, I want the ranking to go from 1 to 4. What I am seeing is that all three records with dates of 2/18 are being lump together (all ranked as 3). Is there another function I should use to get the results I want?

Thank you in advance!

I've tried different variations of DENSE_RANK() to no avail. I've also tried RANK() and ROW_NUMBER() and nothing is truly giving me what I'm looking for in the results.


Solution

  • One way to using a case expression in your order by clause to manage those null values.

    create table table1 (
      plan_id varchar(10), 
      ee_id integer, 
      loan_number integer, 
      payroll_date date, 
      process_date date);
    
    insert into table1 values 
    ('aaa', 1234, 1, '2021-11-26', null), 
    ('aaa', 1234, 1, '2021-12-23', null), 
    ('aaa', 1234, 1, '2022-02-18', null), 
    ('aaa', 1234, 1, '2022-02-18', null), 
    ('aaa', 1234, 1, null, '2022-02-18'); 
    
    select plan_id, ee_id, loan_number, payroll_date, process_date, 
     DENSE_RANK() OVER 
      (PARTITION BY plan_id,ee_id,loan_number ORDER BY case when payroll_date is null then 1 else 0 end, payroll_date,process_date asc) as r  
    from table1
    
    plan_id ee_id loan_number payroll_date process_date r
    aaa 1234 1 2021-11-26 null 1
    aaa 1234 1 2021-12-23 null 2
    aaa 1234 1 2022-02-18 null 3
    aaa 1234 1 2022-02-18 null 3
    aaa 1234 1 null 2022-02-18 4

    fiddle

    EDIT

    Or, continue using your ISNULL expression but use a far-fetched future date instead of the process_date. HOWEVER, this alternate answer may not provide desired results if you have more than one null payroll_date row with the same partitioned_by columns, but different process_dates. This works fine if null appears just once. Therefore, I would probably stick with the case expression answer.

    select plan_id, ee_id, loan_number, payroll_date, process_date, 
     dense_RANK() OVER 
      (PARTITION BY plan_id,ee_id,loan_number ORDER BY isnull(payroll_date,'9999-12-31') asc) as r  
    from table1