sasproc-report

Sorting variables and proc report


I have a dataset of the following form:

GROUP 1  GROUP 2   TOTAL
  A                 400
  A        a1       100
  A        a2       100
  A        a3       300
  B                 300
  B        b1       400
  B        b2       200
  C                 350
  C        c1       100
  C        c2       500

GROUP 1 and GROUP 2 are character variables and TOTAL is a numeric variable. Character variables are sorted alphabetically but not by the variable TOTAL.

I would like to have it sorted within groups (GROUP 1 first) by decreasing frequency (TOTAL variable). If the same groups have the same frequency, then alphabetical order applies. So the output should look like this:

GROUP 1  GROUP 2   TOTAL
  A                 400
  A        a3       300
  A        a1       100
  A        a2       100
  C                 350
  C        c2       500
  C        c1       100
  B                 300
  B        b1       400
  B        b2       200

Is there a quick way of doing this inside proc report procedure without messing with the initial dataset? Or even if this is not possible, is there a quick method to sort it appropriately in an efficient way? The only way coming up to my mind is to sort it separately for every group and then merge the sorted datasets, it takes too much time.


Solution

  • You just need to make sure you have all of the things to sort by on every row. In this case it's just two things you need to add: drop down that total or whatever that is that is on group2=' ' onto every other row for that Group1, and then identify those top rows to keep them up top. Then you can sort it properly.

    PROC REPORT might be able to do this as well with the same want dataset, but without code showing what you're doing it's hard to provide that - but the concept is basically identical.

    data have;
    input GROUP1  $ GROUP2   $ TOTAL;
    datalines;
      A         .        400
      A        a1       100
      A        a2       100
      A        a3       300
      B         .       300
      B        b1       400
      B        b2       200
      C         .       350
      C        c1       100
      C        c2       500
    ;;;;
    run;
    
    data for_sort;
      set have;
      retain total_group;
      if missing(group2) then total_group=total;
      if missing(group2) then topgroup = 1;
      else topgroup = 2;
    run;
    
    proc sort data=for_sort out=want;
      by descending total_group group1 topgroup descending total group2;
    run;