In SAS, I have a large table that I want to increment with information from multiple small tables by performing left joins (or equivalent). My logic requires many steps (i.e. can't join everything at the same time). After each join, I want to keep large_table's existing index, making the best use of it. How can I rewrite the following code to accomplish this?
/*Join 1*/
proc sql;
create table large_table as
select a.*, b.newinfo1
from large_table a
left join small_table1 b on a.id = b.id;
quit;
/*some logic*/
/*Join 2*/
proc sql;
create table large_table as
select a.*, b.newinfo2
from large_table a
left join small_table2 b on a.id = b.id;
quit;
/*...*/
Better would certainly be to perform one query. But if you don't have that ability, you have a few options.
The most SAS-like is not a SQL query but a MODIFY statement. This performs a left join, and modifies the master dataset - doesn't replace it. You do have to have all of the variables pre-defined for this to work.
data class(index=(name));
set sashelp.class;
call missing(predict); *define PREDICT so it is available to be updated;
where sex='F';
run;
data classfit(index=(name));
set sashelp.classfit;
run;
data class;
modify class classfit; *only PREDICT will be appended here;
by name;
select (_IORC_); *this processes the 'left' join;
when (%sysrc(_sok)) replace; *if in master then replace;
when (%sysrc(_dsenmr)) delete; *if not in master then delete;
otherwise abort;
end;
run;
proc contents data=class;
run;
You could do something similar in SQL using an UPDATE statement.
proc sql;
update class
set predict = (
select predict from classfit
where class.name=classfit.name
);
quit;
proc contents data=class;
run;