I have a series of wide panel datasets. In each of these, I want to generate a series of new variables. E.g., in Dataset1, I have variables Car2009 Car2010 Car2011
in a dataset. Using this, I want to create a variable HadCar2009
, which is 1 if Car2009
is non-missing, and 0 if missing, similarly HadCar2010
, and so on. Of course, this is simple to do but I want to do it for multiple datasets which could have different ranges in terms of time. E.g., Dataset2 has variables Car2005
, Car2006
, Car2008
.
These are all very large datasets (I have about 60 such datasets), so I wouldn't want to convert them to long either.
For now, this is what I tried:
forval j = 1/2{
use Dataset`j', clear
forval i=2005/2011{
capture gen HadCar`i' = .
capture replace HadCar`i' = 1 if !missing(Car`i')
capture replace HadCar`i' = 0 if missing(Car`i')
}
save Dataset`j', replace
}
This works, but I am reluctant to use capture
, because perhaps some datasets have a variable called car2008
instead of Car2008
, and this would be an error I would like the program to stop at.
Also, the ranges of years across my 60-odd datasets are different. Ideally, I would like to somehow get this range in a local (perhaps somehow using describe
? I'm not sure) and then just generate these variables using that local with a simple for loop.
But I'm not sure I can do this in Stata.
Your inner loop could be rewritten from
forval i=2005/2011{
capture gen HadCar`i' = .
capture replace HadCar`i' = 1 if !missing(Car`i')
capture replace HadCar`i' = 0 if missing(Car`i')
}
to
foreach v of var Car???? {
gen Had`v' = !missing(`v')
}
noting the fact in Stata that true or false expressions evaluate to 1 or 0 directly.
https://www.stata-journal.com/article.html?article=dm0099
https://www.stata-journal.com/article.html?article=dm0087
https://www.stata.com/support/faqs/data-management/true-and-false/
This code is going to ignore variables beginning with car
. There are other ways to check for their existence. However, if there are no variables Car????
the loop will trigger an error message. A loop over ?ar????
would catch car????
and Car????
(but just possibly other variables too).