I have dyadic (mother/infant) repeated-measures data in long format.
I have three ID variables: individual ID, dyad ID, and "status" (0 = mother, 1 = infant).
ID | DYAD | Status | date | infant weight |
---|---|---|---|---|
001 | 01 | 0 | 01/01 | |
101 | 01 | 1 | 01/01 | 10 |
001 | 01 | 0 | 02/02 | |
101 | 01 | 1 | 02/02 | 20 |
002 | 02 | 0 | 01/01 | |
102 | 02 | 1 | 01/01 | 11 |
002 | 02 | 0 | 02/02 | |
102 | 02 | 1 | 02/02 | 21 |
I want to copy the infant's weight onto the mother's rows, matching on the key variables date and DYAD ID. The final result should look like this:
ID | DYAD | Status | date | infant weight |
---|---|---|---|---|
001 | 01 | 0 | 01/01 | 10 |
101 | 01 | 1 | 01/01 | 10 |
Normally I do this entirely through the GUI: I create a new mini-dataset that contains only the infants (status == 1) and only the key variables plus the variables of interest, delete infant weight from the original dataset, and then merge the two datasets (add variables, matching on the key values).
This works fine, but I know there must be a way to do this with syntax.
You can do this by aggregating:

    aggregate /outfile=* mode=addvariables overwritevars=yes
      /break=DYAD date /infant_weight=max(infant_weight).

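Within each combination of DYAD and date there is one row with a value in infant_weight (the infant's row) and one row where it is empty (the mother's row). The maximum is computed over the non-missing values only, so for each pair it is simply the weight from the infant's row, and because the aggregate adds its result back to the active dataset and overwrites the existing variable, the blank row gets filled with that value.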
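
To make the logic concrete, here is a minimal sketch in Python (not SPSS) that mimics what the aggregate does on the sample rows from the question. The function name `fill_from_group_max` and the use of `None` for a blank weight are assumptions made for illustration only; this is not how SPSS implements it.

```python
from collections import defaultdict

# Sample rows from the question; None stands for a blank (missing) weight.
rows = [
    {"ID": "001", "DYAD": "01", "Status": 0, "date": "01/01", "infant_weight": None},
    {"ID": "101", "DYAD": "01", "Status": 1, "date": "01/01", "infant_weight": 10},
    {"ID": "001", "DYAD": "01", "Status": 0, "date": "02/02", "infant_weight": None},
    {"ID": "101", "DYAD": "01", "Status": 1, "date": "02/02", "infant_weight": 20},
    {"ID": "002", "DYAD": "02", "Status": 0, "date": "01/01", "infant_weight": None},
    {"ID": "102", "DYAD": "02", "Status": 1, "date": "01/01", "infant_weight": 11},
    {"ID": "002", "DYAD": "02", "Status": 0, "date": "02/02", "infant_weight": None},
    {"ID": "102", "DYAD": "02", "Status": 1, "date": "02/02", "infant_weight": 21},
]


def fill_from_group_max(rows, break_keys, target):
    """Return copies of rows with `target` set to the max non-missing value of each break group."""
    # Collect the non-missing values per break group (here: DYAD + date).
    groups = defaultdict(list)
    for row in rows:
        if row[target] is not None:
            groups[tuple(row[k] for k in break_keys)].append(row[target])

    # Write the group maximum back to every row of the group.
    filled = []
    for row in rows:
        values = groups.get(tuple(row[k] for k in break_keys))
        filled.append({**row, target: max(values) if values else None})
    return filled


result = fill_from_group_max(rows, ("DYAD", "date"), "infant_weight")
for row in result:
    print(row["ID"], row["DYAD"], row["Status"], row["date"], row["infant_weight"])
```

Note that this only works because each DYAD/date group holds a single non-missing weight. If a group could contain more than one infant measurement, the maximum would no longer be "the" infant's weight.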