sasdata-analysissas-macroproc-sqlenterprise-guide

Flag accounts on the basis of Previous entry


I'm doing a case study for which I need some help. A person is applying for Visa and can apply through any medium. We would like to understand which method did person opt for before coming to Office (Medium of application). I have the information here in the table below - [See attached Image]. There can be chances of multiple applications through other mediums before coming to office but we would like to track only the latest one i.e. method opted just before opting for Office. Example- For the case of A, this person applied 2 times before coming to Office but we would like to know the latest occurence and thus the entry respect to A3 comes next to A2. Please advise what should I do. I have info in yellow but want to get the info in blue.Table with example

Expecting results in blue to be populated. [See attached image]

 data have;
 format apply_date date9.;
 input MasterId $ Id $ Method $ Apply_date date9. Fees ;
 cards;
A A1 Kiosk 25-Feb-22 100
A A2 Office 31-Jan-22 150
A A3 Online 12-Dec-21 200
A A4 Online 03-Dec-21 200
B B1 Office 10-Jan-23 150
B B2 Kiosk 02-Jan-22 100
C C1 Online 09-Nov-22 200
C C2 Kiosk 28-Oct-22 100
C C3 Office 10-Sep-22 150;

Solution

  • You can use LAG to track the earlier method of an office application (when data is sorted ascending datewise).

    Example:

    data want;
      set have;
      by masterid;
    
      previous_method = lag(method);
      if first.masterid then call missing(previous_method);
    
      if method='office' then output;
    run;
    

    Q: What if a person has two consecutive office applications and you want the previous non-office method? If so, a different approach is needed.

    Also, looking more closely at the sample data and your image https://i.sstatic.net/qnmaY.jpg the data is sorted descending date within id group, so the carried information is from a later record in the data set. A lead, not a lag.

    So, within the group the first row with "office" is the latest of that method, and the first non-office after that is the previous.

    A double DOW loop is a good technique for handling this. A DOW loop uses a SET statement inside the loop. The first DOW does a groupwise computation and the second applies the computation. A subtle feature of a step with a DOW loop is that the worker variables do not need to be retained and are automatically reset to missing at the top of the implicit step loop.

    Example (not-tested):

    data want(keep=masterid id method apply_date fees previous:);
      * groupwise computation;
    
      do _n_ = 1 by 1 until (last.masterid);
        set have;
        by masterid;
    
        if method = 'office' and missing(targetid) then 
          targetid = id;
    
        if method ^= 'office' and not missing(targetid) and missing(previous_id) then do;
          p_id = id;
          p_method = method;
          p_date = apply_date;
          p_fees = fees;
        end;
      end;
    
      * apply the computation within the group;
    
      do _n_ = 1 to _n_;
        set have;
        if id = targetid then do;
          previous_id     = p_id;
          previous_method = p_method;
          previous_date   = p_date; format previous_date date9.;
          previous_fees   = p_fees;
    
          output;
          call missing (of previous_:)
        end;
        else
          output;
        end;
      end;
    run;