How do I use the groupby operation in SFrame without installing GraphLab?
I would like to do some aggregation, but in every example I have found online, the aggregation functions come from GraphLab.
For example:
import graphlab.aggregate as agg
user_rating_stats = sf.groupby(key_columns='user_id',
                               operations={
                                   'mean_rating': agg.MEAN('rating'),
                                   'std_rating': agg.STD('rating')
                               })
How can I use, say, numpy.mean instead of agg.MEAN in the example above?
The sframe package contains the same aggregation module as the graphlab package, so you shouldn't need to resort to numpy.
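import sframe
import sframe.aggregate as agg  # same aggregators as graphlab.aggregate

# Small example frame: two ratings for user 1, one for user 2
sf = sframe.SFrame({'user_id': [1, 1, 2],
                    'rating': [3.3, 3.6, 4.1]})

# Group by user and compute the mean and standard deviation of the ratings
grp = sf.groupby('user_id', {'mean_rating': agg.MEAN('rating'),
                             'std_rating': agg.STD('rating')})
print(grp)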
+---------+---------------------+-------------+
| user_id |      std_rating     | mean_rating |
+---------+---------------------+-------------+
|    2    |         0.0         |     4.1     |
|    1    | 0.15000000000000024 |     3.45    |
+---------+---------------------+-------------+
[2 rows x 3 columns]
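If you want to sanity-check those numbers without sframe or graphlab, here is a minimal sketch that does the same grouping with only the Python standard library. The names `rows`, `groups`, and `stats` are just for illustration. It assumes that agg.STD is the population standard deviation (dividing by n), which is what the output above suggests: for the ratings 3.3 and 3.6 the population value is about 0.15, whereas the sample value would be about 0.21.

```python
from collections import defaultdict
import statistics

# Same data as the SFrame above, as (user_id, rating) pairs
rows = [(1, 3.3), (1, 3.6), (2, 4.1)]

# Collect the ratings belonging to each user_id
groups = defaultdict(list)
for user_id, rating in rows:
    groups[user_id].append(rating)

# Per-user mean and population standard deviation.
# Assumption: pstdev (divide by n) is the counterpart of agg.STD here.
stats = {
    user_id: {
        'mean_rating': statistics.mean(ratings),
        'std_rating': statistics.pstdev(ratings),
    }
    for user_id, ratings in groups.items()
}

for user_id, values in stats.items():
    print(user_id, values)
```

This is only meant for checking small examples by hand. For real data the built-in aggregators are the better choice, since they run inside the SFrame engine rather than in a Python loop.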