sqlsas

SAS generate list of nCr text values


Suppose I have a SAS dataset name_list which contains a unique list of names

name
A
B
C
D
E
F

Some code which could generate the above example dataset:

data name_list;
name = "A";
output;
name = "B";
output;
name = "C";
output;
name = "D";
output;
name = "E";
output;
name = "F";
output;
run;

I am looking to create a dataset which contains a list of all unique possible combinations of some number of names.

%macro create_name_combination(names = , choose_num = );

 data temp;
 set &names.;
 INDEX = _N_;
 run;
 proc sql noprint;
 select 
 quit;

 proc sql noprint;
 select max(INDEX)
 into: Name_Count
 from temp;
 quit;

 %if &choose_num <= &Name_Count. %then %do;
  /* This is where I am stuck - I need some help writing code which will generate a dataset which outputs a list of all combinations of choose_num names, without repetition, from names. */
 %end;

%mend;

%create_name_combination(names = name_list, choose_num = 3);

I am looking for a dataset output looking something like the below

want:

 Combination
 A , B , C
 A , B , D
 A , B , E
 A , B , F
 B , C , D
 B , C , E
 B , C , F
 C , D , E
 C , D , F
 ...

Can somebody please assist?


Solution

  • SAS has a function for generating combinations, CALL ALLCOMB(). And another for knowing how many combinations there will be, COMB().

    Here is an easier way to create sample data, using the INPUT statement and in-line lines of data.

    data have;
      input name $ @@;
    cards;
    A B C D E F
    ;
    

    First transpose the data so that each value is in a separate variable.

    proc transpose data=have out=wide prefix=name;
     var name;
    run;
    

    Now you are ready to use CALL ALLCOMB() to make the combinations.

    %let choose_num=3 ;
    data want ;
      set wide ;
      array names name: ;
      do i=1 to comb(dim(names), &choose_num);
        call allcomb(i, &choose_num, of names[*]);
        output;
      end;
      keep name1-name&choose_num;
    run;
    

    If you really want the values in a single long character variable instead then use the CATX() function.

    data want ;
      set wide ;
      array names name: ;
      do i=1 to comb(dim(names), &choose_num);
        call allcomb(i, &choose_num, of names[*]);
        length combinations $30;
        combinations=catx(',',of name1-name&choose_num);
        output;
      end;
      keep combinations;
    run;