I have a file that is generated using command | sort | uniq -c
city.txt
2 mumbaiXa
3 mumbaiXb
1 mumbaiXp
5 delhiXn
4 delhiXz
1 parisXs
7 parisXt
1 parisXa
9 parisXe
I am trying to split on X and get the count of each city:
expected output:
mumbai 6
delhi 9
paris 18
I tried this, but it counts the lines per city instead of summing the counts, so it did not return the expected result:
grep 'X' city.txt | awk '{print $2}' | awk -F 'X' '{print $1}' | sort | uniq -c
Update:
The data file looks like this...
1904 mumbaiXa
1167 mumbaiXa
830 mumbaiXb
565 mumbaiXp
424 delhiXn
423 delhiXz
I gave a simplified version and changed the text.
If you can run command again and it gives exactly the same output, you can get the desired totals by dropping X and everything after it before piping the result into sort | uniq -c, for example:
command | awk 'BEGIN{FS="X"}{print $1}' | sort | uniq -c
Otherwise, if you want to keep using ... | sort | uniq -c on the file you already have, print each city name as many times as its count says. Let the content of city.txt be
2 mumbaiXa
3 mumbaiXb
1 mumbaiXp
5 delhiXn
4 delhiXz
1 parisXs
7 parisXt
1 parisXa
9 parisXe
then
awk 'sub(/X.*/,""){for(i=1;i<=$1;i+=1){print $2}}' city.txt | sort | uniq -c
gives output
9 delhi
6 mumbai
18 paris
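Explanation: sub(/X.*/,"") removes X and everything after it from the line and returns 1 if a substitution was made, so the action runs only for lines that contain X. Because sub changes $0, awk re-splits the fields, so the 2nd field is now just the city name. The for loop then prints the 2nd field the number of times specified in the 1st field.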
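Printing each city once per count can produce a lot of lines when the counts are large, as in the updated data (1904, 1167, ...). As an alternative that is not part of the approach above, the counts can be summed directly in awk. This is only a minimal sketch: the function name sum_by_city is made up for illustration, and it assumes that the count is in the 1st field and that city names never contain X.

```shell
# Hypothetical helper (the name is an assumption): sum the uniq -c counts per city.
sum_by_city() {
    awk '{
        city = $2
        sub(/X.*/, "", city)    # keep only the text before the first X
        total[city] += $1       # add the count from the 1st field
    }
    END {
        # awk does not guarantee any order here, so sort afterwards if needed
        for (city in total) print city, total[city]
    }' "$@"
}

# Sample data from the question, fed through stdin
printf '%s\n' '2 mumbaiXa' '3 mumbaiXb' '1 mumbaiXp' \
              '5 delhiXn' '4 delhiXz' \
              '1 parisXs' '7 parisXt' '1 parisXa' '9 parisXe' |
    sum_by_city | sort
```

With the sample data this prints delhi 9, mumbai 6 and paris 18, one per line, in the "city count" layout asked for in the question. To read the file instead of stdin, call sum_by_city city.txt.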