Suppose I have the following dataset:
data DB;
input ID :$20. Admission :date09. Discharge :date09. morbidity;
format Admission date9. Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 1
0001 13JAN2017 25JAN2017 1
0001 22FEB2018 03MAR2018 0
0001 30JAN2019 04MAR2019 0
0002 01DEC2018 14DEC2018 0
0002 25DEC2018 02JAN2019 1
0002 25NOV2020 03DEC2020 0
0003 09JAN2016 25JAN2016 0
0003 29JAN2018 12FEB2018 0
...;
The morbidity = 1 indicates that the morbidity occurred at that date. If dates are replicated they have both 1 if the morbidity occurred and 0 if not. Is there a way to get the following dataset?
data DB1;
input ID :$20. Admission :date09. Discharge :date09. morbidity new_morbidity;
format Admission date9. Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 1 1
0001 13JAN2017 25JAN2017 1 1
0001 22FEB2018 03MAR2018 0 1
0001 30JAN2019 04MAR2019 0 1
0002 01DEC2018 14DEC2018 0 0
0002 25DEC2018 02JAN2019 1 1
0002 25NOV2020 03DEC2020 0 1
0003 09JAN2016 25JAN2016 0 0
0003 29JAN2018 12FEB2018 0 0
...;
In other words, for each date, for each ID (patient), if the morbidity occurred before the actual date, the actual date should have new_morbidity = 1 if it has morbidity = 0. Otherwise, of course, it will have new_morbidity= 1 if morbidity is already = 1. This row by row, date by date through the history of the patient.
What can I try next? I don't know how to manage the comparison actual date-date before.
Let's assume your data is ordered by ID, admission
. When a patient dies, new_morbidity
should be 1 for all of their records thereafter unless they are Jesus. We can take advantage of by-group processing and the retain
statement to handle this.
We'll do this by having new_morbidity
automatically retained. Since its value is not initialized at each step, we'll give it a value only under certain circumstances. That value will be retained until we decide it needs to be a new value.
If it's the first ID in the group or if morbidity
is 1, set new_morbidity
to that value. This means if morbidity
is a 0 or a 1 for the first value, 0 or 1 will always continue forward in new_morbidity
. If morbidity
flips from a 0 to a 1, we'll also continue keeping new_morbidity
a 1 thereafter for that patient.
data want;
set db;
by id;
retain new_morbidity;
if(first.id or morbidity=1) then new_morbidity = morbidity;
run;
ID Admission Discharge morbidity new_morbidity
0001 13JAN2017 25JAN2017 1 1
0001 13JAN2017 25JAN2017 1 1
0001 22FEB2018 03MAR2018 0 1
0001 30JAN2019 04MAR2019 0 1
0002 01DEC2018 14DEC2018 0 0
0002 25DEC2018 02JAN2019 1 1
0002 25NOV2020 03DEC2020 0 1
0003 09JAN2016 25JAN2016 0 0
0003 29JAN2018 12FEB2018 0 0
For the Jesus edge case, I would just do if(id = 'Jesus') then new_morbidity=0;