I have a file containing these lines:
"RedfishVersion":"1.6.0"
"RedfishVersion":"1.6.0"
"RedfishVersion":"1.6.0"
"RedfishVersion":"1.6.0"
"RedfishVersion":"1.6.0"
"RedfishVersion":"1.6.0"
"RedfishVersion":"1.15.0"
"RedfishVersion":"1.15.0"
"RedfishVersion":"1.15.0"
"RedfishVersion":"1.15.0"
"RedfishVersion":"1.15.0"
"RedfishVersion":"1.15.0"
"RedfishVersion":"1.15.0"
I was wondering whether there is a Unix way to get a percentage histogram of these lines, based on how many times each one is repeated. This is my attempt:
sort bmc-versions.txt | uniq -cd
321 "RedfishVersion":"1.0.0"
19 "RedfishVersion":"1.0.2"
I want output like this:
"1.0.0" 50%
"1.0.2" 40%
With GNU awk (PROCINFO["sorted_in"] is a gawk extension, so this will not sort in other awks):
awk 'BEGIN{FS=":"; PROCINFO["sorted_in"] = "@val_num_desc"} {a[$2]++} END{for (i in a) {print i " " int(a[i] / NR * 100 + 0.5) "%"}}' test.txt
"1.15.0" 54%
"1.6.0" 46%
Nicer formatting:
awk 'BEGIN {
    FS = ":"
    PROCINFO["sorted_in"] = "@val_num_desc"
}
{
    a[$2]++
}
END {
    for (i in a) {
        print i " " int(a[i] / NR * 100 + 0.5) "%"
    }
}' test.txt
"1.15.0" 54%
"1.6.0" 46%
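The int(x + 0.5) part is there to round to the nearest whole percent; plain int() only truncates. A small sketch of the arithmetic, using the counts from the sample file (7 and 6 out of 13 lines). The variable names rounded and truncated are just for illustration:

```shell
# 7 of the 13 sample lines are 1.15.0 and 6 of 13 are 1.6.0.
# Adding 0.5 before int() rounds to the nearest integer (for non-negative values).
rounded=$(awk 'BEGIN { print int(7 / 13 * 100 + 0.5), int(6 / 13 * 100 + 0.5) }')

# Without the + 0.5, int() just drops the fractional part.
truncated=$(awk 'BEGIN { print int(7 / 13 * 100), int(6 / 13 * 100) }')

echo "rounded:   $rounded"     # rounded:   54 46
echo "truncated: $truncated"   # truncated: 53 46
```

Keep in mind that rounded percentages are not guaranteed to add up to exactly 100 for every input.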
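A variant that does not rely on PROCINFO and should therefore work in any POSIX awk. It walks the possible counts from NR downwards and prints the keys whose count matches:
awk 'BEGIN{FS=":"} {a[$2]++} END{for (i=NR; i>=0; i--) {for (h in a) {if(a[h] == i) {print h, int(a[h] / NR * 100 + 0.5), "%"}}}}' test.txt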
"1.15.0" 54 %
"1.6.0" 46 %
Nicer formatting:
awk 'BEGIN {
    FS = ":"
}
{
    a[$2]++
}
END {
    for (i = NR; i >= 0; i--) {
        for (h in a) {
            if (a[h] == i) {
                print h, int(a[h] / NR * 100 + 0.5), "%"
            }
        }
    }
}' test.txt
"1.15.0" 54 %
"1.6.0" 46 %
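If you would rather build on the sort | uniq -c attempt from the question, something along these lines might also work. This is only a sketch: it recreates the 13 sample lines in a temporary file (the names tmpfile and histogram are made up for the example), lets sort -rn do the ordering, and uses awk only to turn the counts into percentages. It assumes the lines contain no whitespace, so that uniq -c output splits cleanly into a count and the original line.

```shell
# Hypothetical sample file reproducing the 6 + 7 lines shown in the question.
tmpfile=$(mktemp)
for i in 1 2 3 4 5 6; do echo '"RedfishVersion":"1.6.0"'; done > "$tmpfile"
for i in 1 2 3 4 5 6 7; do echo '"RedfishVersion":"1.15.0"'; done >> "$tmpfile"

# uniq -c prefixes each distinct line with its count; sort -rn puts the
# largest count first. awk then sums the counts and prints percentages.
histogram=$(sort "$tmpfile" | uniq -c | sort -rn |
  awk '{
         count[NR] = $1            # number of occurrences
         split($2, kv, ":")        # kv[2] is the quoted version string
         label[NR] = kv[2]
         total += $1
       }
       END {
         for (i = 1; i <= NR; i++)
           printf "%s %d%%\n", label[i], int(count[i] / total * 100 + 0.5)
       }')

echo "$histogram"
rm -f "$tmpfile"
```

Note that it uses uniq -c rather than uniq -cd: the -d flag hides lines that occur only once, so they would be missing from the total and the percentages would be skewed.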