c++sortingvectorcharacterfrequency

Sorting characters in a string first by frequency and then alphabetically


Given a string, I'm trying to count the occurrence of each letter in the string and then sort their frequency from highest to lowest. Then, for letters that have similar number of occurrences, I have to sort them alphabetically.

Here is what I have been able to do so far:

In displaying the frequency count, I just used a for loop starting from the last index to display the result from highest to lowest. I am having problems, however, with regard to those letters having similar frequencies, because I need them displayed in alphabetical order. I tried using a nested for loop with the inner loop starting with the lowest index and using a conditional statement to check if its frequency is the same as the outer loop. This seemed to work, but my problem is that I can't seem to figure out how to control these loops so that redundant outputs will be avoided. To understand what I'm saying, please see this example output:

Enter a string: hello world

Pushing the array into a vector pair v:
d = 1
e = 1
h = 1
l = 3
o = 2
r = 1
w = 1


Sorted first according to frequency then alphabetically:
l = 3
o = 2
d = 1
e = 1
h = 1
r = 1
w = 1
d = 1
e = 1
h = 1
r = 1
d = 1
e = 1
h = 1
d = 1
e = 1
d = 1
Press any key to continue . . .

As you can see, it would have been fine if it wasn't for the redundant outputs brought about by the incorrect for loops.

If you can suggest more efficient or better implementations with regard to my concern, then I would highly appreciate it as long as they're not too complicated or too advanced as I am just a C++ beginner.

If you need to see my code, here it is:

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

int main() {
    cout<<"Enter a string: ";
    string input;
    getline(cin, input);

    int letters[26]= {0};

    for (int x = 0; x < input.length(); x++) {
        if (isalpha(input[x])) {
            int c = tolower(input[x] - 'a');
            letters[c]++;
        }
    }

    cout<<"\nPushing the array into a vector pair v: \n";
    vector<pair<int, char> > v;

    for (int x = 0; x < 26; x++) {
        if (letters[x] > 0) {
            char c = x + 'a';
            cout << c << " = " << letters[x] << "\n";
            v.push_back(std::make_pair(letters[x], c));
        }
    }

    // Sort the vector of pairs.
    std::sort(v.begin(), v.end());

    // I need help here!
    cout<<"\n\nSorted first according to frequency then alphabetically: \n";
    for (int x = v.size() - 1 ; x >= 0; x--) {
        for (int y = 0; y < x; y++) {
            if (v[x].first == v[y].first) {
                cout << v[y].second<< " = " << v[y].first<<endl;
            }
        }
        cout << v[x].second<< " = " << v[x].first<<endl;
    }

    system("pause");
    return 0;
}

Solution

  • If you want highest frequency then lowest letter, an easy way would be to store negative values for frequency, then negate it after you sort. A more efficient way would be to change the function used for sorting, but that is a touch trickier:

    struct sort_helper {
       bool operator()(std::pair<int,char> lhs, std::pair<int,char> rhs) const{
         return std::make_pair(-lhs.first,lhs.second)<std::make_pair(-rhs.first,rhs.second);
       }
    };
    std::sort(vec.begin(),vec.end(),sort_helper());