I have a bunch of grouped DataFrames, gdf, that I want to combine. For each one, I want the mean of var1, which is a Float, and the first element of var2, which is a String.
I tried
combine(gdf, :var1 .=> mean, :var2 .=> first(:var2))
But I get the error ERROR: MethodError: no method matching iterate(::Symbol)
I also tried first(:var2, 1).
Thanks for any help.
This is the way to do it with DataFrames.jl:
julia> using DataFrames
julia> using Statistics
julia> df = DataFrame(id=[1,2,1,2,1,2], var1=1.5:1:6.5, var2=string.(1:6))
6×3 DataFrame
 Row │ id     var1     var2
     │ Int64  Float64  String
─────┼────────────────────────
   1 │     1      1.5  1
   2 │     2      2.5  2
   3 │     1      3.5  3
   4 │     2      4.5  4
   5 │     1      5.5  5
   6 │     2      6.5  6
julia> gdf = groupby(df, :id)
GroupedDataFrame with 2 groups based on key: id
First Group (3 rows): id = 1
 Row │ id     var1     var2
     │ Int64  Float64  String
─────┼────────────────────────
   1 │     1      1.5  1
   2 │     1      3.5  3
   3 │     1      5.5  5
⋮
Last Group (3 rows): id = 2
 Row │ id     var1     var2
     │ Int64  Float64  String
─────┼────────────────────────
   1 │     2      2.5  2
   2 │     2      4.5  4
   3 │     2      6.5  6
julia> combine(gdf, :var1 => mean, :var2 => first)
2×3 DataFrame
 Row │ id     var1_mean  var2_first
     │ Int64  Float64    String
─────┼──────────────────────────────
   1 │     1        3.5  1
   2 │     2        4.5  2
(There is no need for a . before =>, and no need to pass an argument to first explicitly: combine applies the function to each group's column.)
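To see where the error in the question comes from, here is a minimal sketch that reuses the example data above. first(:var2) is evaluated on the Symbol itself before combine ever runs, which is what triggers the iterate(::Symbol) MethodError. The variable names err, res and res_bcast are only for illustration.

```julia
using DataFrames
using Statistics

df = DataFrame(id=[1, 2, 1, 2, 1, 2], var1=1.5:1:6.5, var2=string.(1:6))
gdf = groupby(df, :id)

# first(:var2) is an ordinary call on a Symbol; it fails before combine is reached.
err = try
    first(:var2)
catch e
    e
end
println(err isa MethodError)

# Passing the function itself lets combine call it on each group's column.
res = combine(gdf, :var1 => mean, :var2 => first)

# Broadcasting with .=> becomes useful when one function goes to several columns.
res_bcast = combine(gdf, [:var1, :var2] .=> first)
```

So the dot is not wrong as such; it just adds nothing when there is a single column on the left-hand side.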
If you prefer assignment style (instead of the functional style with => pairs), use DataFramesMeta.jl:
julia> using DataFramesMeta
julia> @combine(gdf, :var1_mean=mean(:var1), :var2_first=first(:var2))
2×3 DataFrame
 Row │ id     var1_mean  var2_first
     │ Int64  Float64    String
─────┼──────────────────────────────
   1 │     1        3.5  1
   2 │     2        4.5  2
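If the main appeal of the assignment style is choosing the output column names, the => form can do that as well by adding a target name as a third element. A small sketch on the same example data, where the names avg_var1 and first_var2 are made up for illustration:

```julia
using DataFrames
using Statistics

df = DataFrame(id=[1, 2, 1, 2, 1, 2], var1=1.5:1:6.5, var2=string.(1:6))
gdf = groupby(df, :id)

# source column => function => output column name
res = combine(gdf, :var1 => mean => :avg_var1, :var2 => first => :first_var2)
println(names(res))
```

Which of the two styles to use is mostly a matter of taste; the pair form needs no extra package.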