I have come close to counting all occurrences of punctuation; however, punctuation characters that are right next to each other get counted as one.
Like so:
cat filename.txt |
tr -sc '[:punct:]' '\n' |
sort |
uniq -c |
sort -bnr
Which prints something like this:
15 ,
9 !
5 .
2 ;
2 !"
2 '
1 -
1 --
1 :
1 ?
It is clearly only counting punctuation, but how would I separate those that are right next to each other?
The problem is this step:
tr -sc '[:punct:]' '\n'
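What this does is replace every non-punctuation character with \n (-c takes the complement of the set, and -s squeezes each run of repeated newlines into one). So when there is no such character between two punctuation characters, they stay next to each other on the same line and are counted as a single item.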
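You want something like this, which deletes everything except punctuation and then puts each remaining character on its own line:
cat filename.txt | tr -cd '[:punct:]' | fold -w 1 | sort | uniq -c | sort -bnr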
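To check the behaviour on a small input, here is a minimal sketch of the same pipeline wrapped in a function. The name count_punct and the sample string are made up for illustration, and the sketch assumes ASCII input (with multibyte text, fold -w 1 may not split on character boundaries).

```shell
# Hypothetical helper (not from the question): count each punctuation
# character read from stdin, most frequent first.
count_punct() {
    tr -cd '[:punct:]' |   # delete every character that is NOT punctuation
        fold -w 1 |        # break the remaining stream into one character per line
        sort |             # group identical characters so uniq can count them
        uniq -c |          # prefix each distinct character with its count
        sort -bnr          # order by count, highest first
}

# Sample input containing adjacent punctuation: !" and --
printf 'Hi!" -- ok.\n' | count_punct
```

With this input the two dashes should be reported as a single `-` entry with a count of 2, rather than as one `--` entry, and `!` and `"` should each get their own line.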