I'm using census data across multiple years. For each year, the dataset is laid out identically in structure and content (i.e. the same questions are asked of respondents and their answers are laid out in the same way across years). I want to import multiple datasets, apply changes to each, and then combine them.
To illustrate what I mean, I imported the 2018 data (the frame is titled 'cps18') and removed three specific rows. I then used frame change and imported the 2019 data, called 'cps19,' and applied similar changes. When I use append using cps18, the console returns "file cps18 not found."
From what I can find online, it seems that the append command is used for combining datasets on the disk with the frame in memory. But what if I have two or more frames in memory? Is there a way to combine them?
past self! You can write a loop to perform the same actions across multiple files. Let's say you have cps17.csv, cps18.csv, and cps19.csv. We can write a loop to import each file, make the necessary changes, and then append:
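cwf default // revert to default frame
clear
tempfile cpsdata
save `cpsdata', replace emptyok // start from an empty dataset to append to
forval y = 17/19 {
    import delimited cps`y'.csv, clear // replace the data in memory with this year's file
    g newvar1 = oldvar1 + oldvar2 // sample change
    destring year, replace // sample change
    append using `cpsdata', force // add the years accumulated so far
    save `cpsdata', replace // store the running total for the next pass
}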
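
If the two frames are already in memory, as in the question, one possible workaround is to write one frame out to a temporary file and append that file to the other. This is only a sketch: it assumes Stata 16 or later (where frames exist), that frames named cps18 and cps19 have already been created and cleaned, and the tempfile name cps18tmp is made up for illustration.

```stata
* Assumes frames cps18 and cps19 already exist in memory (Stata 16+)
tempfile cps18tmp                  // hypothetical name for the temporary file
frame cps18: save `cps18tmp'       // write the cps18 frame to disk without switching frames
frame change cps19                 // make cps19 the current frame
append using `cps18tmp'            // append reads from disk, so it now finds the data
```

The "file cps18 not found" message appears because append using cps18 looks for a dataset file on disk rather than a frame called cps18, which is why the data has to pass through a file first.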