I have this table:
data MyTable;
input Household age gender $ flag;
datalines;
1 45 male married
1 35 female married
2 50 female married
2 52 male married
3 32 male child
3 50 male married
3 60 female married
;
I have married couples and other persons in households; no intergenerational or same sex couples. I need to join a husband value to the wife and a wife value to the husband to get this output:
data MyTable2;
input Household age gender $ flag $ age_partner;
datalines;
1 45 male married 35
1 35 female married 45
2 50 female married 52
2 52 male married 50
3 32 male child
3 50 male married 60
3 60 female married 50
;
That does not work; it executes only the t2 part.
proc sql;
create table MyTable2 as
select t1.*, t2.age as age_partner, t3.age as age_partner
from MyTable as t1
left join (select household, age from MyTable where gender = 'male' & flag = 'married') as t2
on t1.household=t2.household & t1.gender = 'female' & t1.flag = 'married'
left join (select household, age from MyTable where gender = 'female' & flag = 'married') as t3
on t1.household=t3.household & t1.gender = 'male' & t1.flag = 'married';
quit;
input Household age gender $ flag $ age_partner;
datalines;
1 45 male married
1 35 female married 45
2 50 female married 52
2 52 male married
3 32 male child
3 50 male married
3 60 female married 50
;
This is the output. It joins the values only to the females.
If I separate these two joins, the last self join rewrites the previous join and in the output only the husbands will have the wife ages and the wives won't have the husband ages.
I can do it in separate columns and after that in the third column I make the needed output.
How do I do it in one proc?
Assuming your logic will hold (I suspect this will not hold for various data quality reasons).
proc sql;
create table want as
select t1.*, case when t1.flag='married' then t2.age else . end as age_partner
from mytable as t1
left join (select * from mytable where flag='married') as t2
on t1.household=t2.household and t1.gender ne t2.gender;
quit;