I have a panel dataset (country-year) in Stata. For instance, I have GDP for 1990, 1991, ..., 2010 for many countries.
I want to define a variable "GDP in 2006" which exists for all years and contains the 2006 value of GDP.
The way I am doing it now works but is a bit clumsy, so I was hoping someone would have a better idea:
qui gen gdp2006=.
replace gdp2006=gdp if year==2006
forval t=2007/2010 {
sort country year
qui replace gdp2006=gdp2006[_n-1] if year==`t'&country[_n-1]==country
}
forval t=2005(-1)1990 {
sort country year
qui replace gdp2006=gdp2006[_n+1] if year==`t'&country[_n+1]==country
}
Thanks!
You can do this in one line:
egen gdp2006 = mean(gdp / (year == 2006)), by(country)
Here (year == 2006) evaluates as 1 or 0, so the expression gdp / (year == 2006) evaluates as gdp when year is 2006 and as missing otherwise, because division by zero yields missing in Stata. Missings are ignored in calculating the mean for each country, so the mean is just the 2006 value.
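To see the division trick at work, here is a minimal sketch on made-up data. The country codes, GDP figures, and the intermediate variable ratio are hypothetical and only there for illustration; the sketch also assumes at most one observation per country in 2006.

```stata
clear
* toy panel: two hypothetical countries, three years each
input str3 country int year double gdp
"AAA" 2005 100
"AAA" 2006 110
"AAA" 2007 120
"BBB" 2005 200
"BBB" 2006 210
"BBB" 2007 220
end

* gdp / 1 in 2006, gdp / 0 (missing) in every other year
gen double ratio = gdp / (year == 2006)

* the mean over each country ignores missings, leaving the 2006 value
egen double gdp2006 = mean(ratio), by(country)

list, sepby(country)
```

The intermediate variable is only there to make the two steps visible; the one-line version puts the expression directly inside mean().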
For a wider and more systematic discussion see http://www.stata-journal.com/article.html?article=dm0055
P.S. The techniques you know permit shortening of your code:
gen gdp2006 = gdp if year == 2006
bysort country (gdp2006): replace gdp2006 = gdp2006[_n-1] if _n > 1
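The two-line version relies on missing values sorting after any number, so the 2006 value comes first within each country. A small variation, sketched here on made-up data (country codes and figures are hypothetical), copies from the first observation of each group directly and then restores the panel order, which bysort changes. A country with no 2006 observation simply keeps missing values.

```stata
clear
* toy panel: "CCC" deliberately has no 2006 observation
input str3 country int year double gdp
"AAA" 2005 100
"AAA" 2006 110
"AAA" 2007 120
"CCC" 2005 300
"CCC" 2007 320
end

gen double gdp2006 = gdp if year == 2006

* missing sorts last, so gdp2006[1] is the 2006 value within each country
bysort country (gdp2006): replace gdp2006 = gdp2006[1]

* put the data back in country-year order
sort country year
list, sepby(country)
```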