sas

Update actual records of a variable based on the date before


Suppose I have the following dataset:

data DB;
  input ID :$20. Admission :date09. Discharge :date09. morbidity;
  format Admission date9. Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  1 
0001  13JAN2017 25JAN2017  1
0001  22FEB2018 03MAR2018  0   
0001  30JAN2019 04MAR2019  0  
0002  01DEC2018 14DEC2018  0 
0002  25DEC2018 02JAN2019  1 
0002  25NOV2020 03DEC2020  0 
0003  09JAN2016 25JAN2016  0 
0003  29JAN2018 12FEB2018  0 
...;

The morbidity = 1 indicates that the morbidity occurred at that date. If dates are replicated they have both 1 if the morbidity occurred and 0 if not. Is there a way to get the following dataset?

data DB1;
  input ID :$20. Admission :date09. Discharge :date09. morbidity new_morbidity;
  format Admission date9. Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  1  1 
0001  13JAN2017 25JAN2017  1  1
0001  22FEB2018 03MAR2018  0  1   
0001  30JAN2019 04MAR2019  0  1  
0002  01DEC2018 14DEC2018  0  0 
0002  25DEC2018 02JAN2019  1  1 
0002  25NOV2020 03DEC2020  0  1 
0003  09JAN2016 25JAN2016  0  0 
0003  29JAN2018 12FEB2018  0  0 
...;

In other words, for each date, for each ID (patient), if the morbidity occurred before the actual date, the actual date should have new_morbidity = 1 if it has morbidity = 0. Otherwise, of course, it will have new_morbidity= 1 if morbidity is already = 1. This row by row, date by date through the history of the patient.

What can I try next? I don't know how to manage the comparison actual date-date before.


Solution

  • Let's assume your data is ordered by ID, admission. When a patient dies, new_morbidity should be 1 for all of their records thereafter unless they are Jesus. We can take advantage of by-group processing and the retain statement to handle this.

    We'll do this by having new_morbidity automatically retained. Since its value is not initialized at each step, we'll give it a value only under certain circumstances. That value will be retained until we decide it needs to be a new value.

    If it's the first ID in the group or if morbidity is 1, set new_morbidity to that value. This means if morbidity is a 0 or a 1 for the first value, 0 or 1 will always continue forward in new_morbidity. If morbidity flips from a 0 to a 1, we'll also continue keeping new_morbidity a 1 thereafter for that patient.

    data want;
        set db;
        by id;
        retain new_morbidity;
    
        if(first.id or morbidity=1) then new_morbidity = morbidity;
    run;
    
    ID      Admission   Discharge   morbidity   new_morbidity
    0001    13JAN2017   25JAN2017   1           1
    0001    13JAN2017   25JAN2017   1           1
    0001    22FEB2018   03MAR2018   0           1
    0001    30JAN2019   04MAR2019   0           1
    0002    01DEC2018   14DEC2018   0           0
    0002    25DEC2018   02JAN2019   1           1
    0002    25NOV2020   03DEC2020   0           1
    0003    09JAN2016   25JAN2016   0           0
    0003    29JAN2018   12FEB2018   0           0
    

    For the Jesus edge case, I would just do if(id = 'Jesus') then new_morbidity=0;