sas

proc sort with nodupkey based on the value of a variable


suppose to have the following dataset:

data DB;
  input ID: $20. Admission :date09. Discharge :date09. Index;
  format Admission date09. Discharge date09.;
cards;
0001 13JAN2019 13JUL2019  1 
0001 13JAN2019 13JUL2019  0 
0002 11MAY2023 12JUN2023  1
0002 01DEC2023 22DEC2023  0
0002 01DEC2023 22DEC2023  0
0003 13JAN2014 17JAN2014  1
0003 23APR2015 28APR2015  0
0003 23APR2015 28APR2015  0
run;

Is there a way to get the following output?

data DB1;
  input ID: $20. Admission :date09. Discharge :date09. Index;
  format Admission date09. Discharge date09.;
cards;
0001 13JAN2019 13JUL2019  1 
0001 13JAN2019 13JUL2019  0 
0002 11MAY2023 12JUN2023  1
0002 01DEC2023 22DEC2023  0
0003 13JAN2014 17JAN2014  1
0003 23APR2015 28APR2015  0
run;


In other words I would like to retain one of the duplicated rows but only when Index = 0. For ID 0001 since Index is 1 but also 0 for duplicated rows nothing will happen. Instead, for ID 0002 for example I would like to retain one time the row corresponding to dates: 01DEC2023-22DEC2023. I tried with proc sort nodupkey but I don't know how to specify, for example with "where", the condition on the index value.

Thank you in advance


Solution

  • Try this

    proc sort data = db;
       by ID Admission Discharge;
    run;
    
    data want;
       set db;
       by ID Index notsorted;
       if first.Index;
    run;
    

    Result:

    ID    Admission  Discharge  Index
    0001  13JAN2019  13JUL2019  1
    0001  13JAN2019  13JUL2019  0
    0002  11MAY2023  12JUN2023  1
    0002  01DEC2023  22DEC2023  0
    0003  13JAN2014  17JAN2014  1
    0003  23APR2015  28APR2015  0