
Count runs of consecutive observations in a SAS dataset


I have a dataset that looks like:

  Hour    Flag
    1       1
    2       1
    3       .
    4       1
    5       1
    6       .
    7       1
    8       1 
    9       1
    10      . 
    11      1
    12      1
    13      1
    14      1

I want to have an output dataset like:

   Total_Hours   Count
        2          2
        3          1
        4          1

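For each run of consecutive 1s, I want the number of hours it covers (Total_Hours), and then the number of runs of each length (Count). A missing value ends a run. In the example, hours 1-2 and 4-5 are two runs of length 2, hours 7-9 are one run of length 3, and hours 11-14 are one run of length 4.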

How should I go about doing this? Thanks!


Solution

  • You'll need to do this in two steps. The first step is to make sure the data is sorted by hour and then to compute the number of hours in each consecutive period:

    PROC SORT DATA = <your dataset>;
      BY hour;
    RUN;
    
    DATA work.consecutive_hours;
      SET <your dataset> END = lastrec;
    
      RETAIN
        total_hours 0
      ;
    
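      /* Extend the current run while flag is 1 */
      IF flag = 1 THEN total_hours = total_hours + 1;
      ELSE
        DO;
          /* A missing flag ends the run: write its length, then reset */
          IF total_hours > 0 THEN output;
          total_hours = 0;
        END;
      /* The final run has no trailing missing flag, so write it at end of file */
      IF lastrec AND total_hours > 0 THEN output;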
    
      KEEP 
        total_hours
      ;
    RUN;
    

    The second step is a simple SQL query that counts how many periods have each length:

    PROC SQL;
      CREATE TABLE work.hour_summary AS
      SELECT
        total_hours
       ,COUNT(*) AS count
      FROM
        work.consecutive_hours
      GROUP BY
        total_hours
      ;
    QUIT;
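
For anyone who wants to try this end to end, here is a minimal sketch that loads the sample data from the question and computes the same summary another way. It is not the method above. It swaps in BY-group processing with NOTSORTED, so each run of identical flag values becomes its own BY group, and it uses PROC FREQ in place of the SQL step. The dataset names work.hours, work.runs and work.run_summary are made up for illustration, and the sketch assumes the rows are already in hour order.

```sas
/* Sample data from the question; work.hours is a hypothetical name */
DATA work.hours;
  INPUT hour flag;
  DATALINES;
1 1
2 1
3 .
4 1
5 1
6 .
7 1
8 1
9 1
10 .
11 1
12 1
13 1
14 1
;
RUN;

/* NOTSORTED treats each consecutive run of equal flag values as a BY group */
DATA work.runs;
  SET work.hours;
  BY flag NOTSORTED;

  /* Reset the counter at the start of every run */
  IF FIRST.flag THEN total_hours = 0;

  /* Sum statement: total_hours is retained across rows automatically */
  total_hours + 1;

  /* Keep only completed runs of 1s; runs of missing values are skipped */
  IF LAST.flag AND flag = 1 THEN OUTPUT;

  KEEP total_hours;
RUN;

/* Count how many runs have each length; OUT= writes COUNT and PERCENT */
PROC FREQ DATA = work.runs NOPRINT;
  TABLES total_hours / OUT = work.run_summary (DROP = percent);
RUN;
```

On the sample data, work.run_summary should contain the same three rows as the desired output in the question. Because the last run ends when its BY group ends, this variant does not need the END= check for the final record.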