How can I use the groupby operation in SFrame without installing graphlab?
I would like to do some aggregation, but in every example I have seen on the internet the aggregation functions come from GraphLab.
For example:
import graphlab.aggregate as agg
user_rating_stats = sf.groupby(key_columns='user_id',
                               operations={
                                   'mean_rating': agg.MEAN('rating'),
                                   'std_rating': agg.STD('rating')
                               })
How can I use, say, numpy.mean instead of agg.MEAN in the above example?
The sframe package contains the same aggregation module as the graphlab package, so you shouldn't need to resort to numpy.
import sframe
import sframe.aggregate as agg
sf = sframe.SFrame({'user_id': [1, 1, 2],
                    'rating': [3.3, 3.6, 4.1]})
grp = sf.groupby('user_id', {'mean_rating': agg.MEAN('rating'),
                             'std_rating': agg.STD('rating')})
print(grp)
+---------+---------------------+-------------+
| user_id |      std_rating     | mean_rating |
+---------+---------------------+-------------+
|    2    |         0.0         |     4.1     |
|    1    | 0.15000000000000024 |     3.45    |
+---------+---------------------+-------------+
[2 rows x 3 columns]
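If you want to sanity-check those numbers without sframe or graphlab, here is a minimal sketch in plain Python that groups the same toy data by hand. It is not how sframe computes the result internally, and the names `rows`, `ratings_by_user` and `stats` are made up for illustration. The 0.15 in the table is consistent with a population standard deviation (divide by n, not n - 1), so the sketch uses `statistics.pstdev`. `numpy.mean` and `numpy.std` (whose default is `ddof=0`) could presumably be swapped in for the two stdlib functions.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Same toy data as in the SFrame example above.
rows = [
    {"user_id": 1, "rating": 3.3},
    {"user_id": 1, "rating": 3.6},
    {"user_id": 2, "rating": 4.1},
]

# Bucket the ratings by user_id (the "group by" step).
ratings_by_user = defaultdict(list)
for row in rows:
    ratings_by_user[row["user_id"]].append(row["rating"])

# Aggregate each bucket: mean and population standard deviation.
stats = {
    user_id: {"mean_rating": mean(vals), "std_rating": pstdev(vals)}
    for user_id, vals in ratings_by_user.items()
}

for user_id, agg_values in stats.items():
    print(user_id, agg_values)
```

This only makes sense for data that fits comfortably in memory; for anything larger, the built-in aggregators in sframe.aggregate are the better fit.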