Using Miller, I would like to find the maximum value in a column (which is easy with stats1 -a max), but I would also like to get the whole row containing that maximum.
Let's say I have this data in a data.csv file:
year,value,country
2000,13,ES
2001,18,IT
2002,16,TZ
2003,14,TZ
2004,10,ES
I would like a Miller command to get the max row for each country (so something based on stats1 -a max -f value -g country):
year,value_max,country
2000,13,ES
2001,18,IT
2002,16,TZ
However, mlr --csv stats1 -a max -f value -g country would only return the value and country columns, not the year one.
I would like to do this in a single pass, as my data is quite large.
Thanks!
You could use the top verb:
mlr --csv top -f value -a -g country input.csv >output.csv
to get
year,value,country
2000,13,ES
2001,18,IT
2002,16,TZ
There is also the --min option, which returns the smallest values instead of the largest.