In Stata, I computed the average grade in course i for each class in each school at time t (using bysort and egen). The class-level average is therefore repeated across every student observation in that class. How do I reduce this to a smaller dataset in which each observation is a class in a given school at a given t, rather than a student in a given class in a given school at a given t? I want to avoid the repeated information, i.e. keep just the average grade of each course mapped to each class.
More concretely, my current data look something like this (I'm leaving out the time variable to keep it simple):
studentid classid avegrade
1 1 14.4
2 1 14.4
3 1 14.4
4 2 16
5 2 16
6 2 16
7 3 13
8 3 13
And I need the following output structure:
classid avegrade
1 14.4
2 16
3 13
Here is what I have tried so far:
sort classid
//This command creates a new variable newid that is 1 for the first observation for each class and missing otherwise.
by classid: gen newid = 1 if _n==1
//replace newid = sum(newid) could be an option in this particular case but under the dynamic timeframe t it won't work
keep if newid == 1
The issue is that now all classes are called 1 under the newid.
Following up on Nick's suggestion b/c I know the Stata docs aren't always intuitive for the inexperienced.
This should do the trick:
collapse (mean) avegrade, by(classid)
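To see what that does, here is a minimal sketch that rebuilds the example data from the question and collapses it. Nothing here comes from the real dataset beyond the three variables shown above, so treat it as an illustration rather than a drop-in script.

```stata
* Rebuild the example data from the question
clear
input studentid classid avegrade
1 1 14.4
2 1 14.4
3 1 14.4
4 2 16
5 2 16
6 2 16
7 3 13
8 3 13
end

* Keep one row per class. avegrade is constant within a class,
* so taking its mean simply returns that same value.
* Variables not named here (studentid) are dropped by collapse.
collapse (mean) avegrade, by(classid)

* The data now hold one observation per classid
list, noobs
```

With the school and time dimensions back in, every grouping variable goes into by(), for example collapse (mean) avegrade, by(schoolid classid t), where schoolid and t are assumed names for your school and time variables. Since avegrade is already constant within each group, something like bysort classid: keep if _n == 1 followed by dropping studentid should give the same rows; in the code from the question, classid is still in the data after the keep, so newid can just be dropped.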