oracle-databasestored-proceduresmacrossas

Efficient macro looping in SAS to get to Oracle Stored Procedure


I'm using SAS to access an Oracle database. The problem is that the function / stored procedure lives on one server in Oracle - which is fine when my data lives there too - but when the data is on a different server I still want to use that function. So I loaded some macros with the personal id's to pass them to the function in a loop. It works, but it's painfully slow. I don't need 'optimal', just 'reasonable'...my datasets will max around 100,000 rows. I've read that creating a dataset is one of the most resource intensive jobs in SAS, so I'm experimenting with creating an empty table and insert into, but I haven't noticed much gain yet.

So the question is - can I use the Oracle stored procedures for data on a different server in a reasonable amount of time within SAS? (Either by improving my existing approach or something completely different)

My first attempt (around 25 minutes for 13,000 personal id's):

%MACRO STATE() ; 
options nosource nonotes; 
%* 2. get macro max loop n; 
proc sql noprint; 
select left(put(count(distinct pidm),10.)) into :loopn from examp
;quit;

%* 3. load macros with the pidms of interest;
proc sql noprint; 
select distinct pidm into :pidm1 - :pidm&loopn from examp order by pidm; 
quit;


%Do i = 1 %TO &loopn ;                            /*build em */
%* %put **************LOOP &i OF &loopn *********************;
proc sql noprint; 
connect to oracle as mycon(user=xxxxxx password=xxxxxxx path='PROD') ; 
create table subsetdat&i as
select * from connection to mycon 
  (select %quote(&&pidm&i) as pidm ,UILIB.ADDR.STATE(&&pidm&i, 'MA') as state  
from dual); 
  disconnect from mycon ; 
 ; quit;

%END;
data state; set subsetdat1-subsetdat&loopn  ;             /*stack 'em */

%Do j = 1 %TO &loopn ;                          /*drop 'em */
proc sql ; 
drop table subsetdat&j
;
%END;
options source notes; 
%MEND STATE ; 

options nomprint; 

%STATE() ; 

Solution

  • Move to loop inside the proc sql, thereby removing the overhead of creating multiple datasets from multiple pass-through queries, and use a union all to 'stack' the individual query results together.

    %MACRO STATE() ; 
    options nosource nonotes; 
    /* 2. get macro max loop n; */
    proc sql noprint; 
    select left(put(count(distinct pidm),10.)) into :loopn from examp
    ;quit;
    
    /* 3. load macros with the pidms of interest; */
    proc sql noprint; 
    select distinct pidm into :pidm1 - :pidm&loopn from examp order by pidm; 
    quit;
    
    /* Build single pass-thru query with multiple select ... union all select ... etc */
    proc sql noprint; 
      connect to oracle as mycon(user=xxxxxx password=xxxxxxx path='PROD') ; 
      create table state as
      select * from connection to mycon 
        (%DO I = 1 %TO &loopn ;                            /*build em */
         select %quote(&&pidm&i) as pidm ,UILIB.ADDR.STATE(&&pidm&i, 'MA') as state from dual
           %IF &I lt &LOOPN %THEN %DO ; /* if not last iteration do a `union all` */
           union all
           %END ;
         %END ;
        ) ;
      disconnect from mycon ; 
    quit;
    options source notes; 
    %MEND STATE ; 
    
    options nomprint; 
    
    %STATE() ;