c++arrayssortinginsertion-sortinversion

Inversion Checker using Insertion Sort


I need to output a group of letters that are that are out of order with respect to the number of inversions of each other.

For example, the sequence “AACEDGG” has only 1 inversion (E and D) while the sequence “ZWQM” has 6 inversions. I don't actually have to sort it out but I have to output them based on the number of inversions they have.

Ex:

Input: AACATGAAGG TTTTGGCCAA TTTGGCCAAA GATCAGATTT CCCGGGGGGA ATCGATGCAT

Output: CCCGGGGGGA AACATGAAGG GATCAGATTT ATCGATGCAT TTTTGGCCAA TTTGGCCAAA

I am trying to use insertion sort as a template as required by my teacher.

void inversionChecker(string dna[], int n)
{
    int j,k,m;
    int tempCount;
    for(int i=0; i < n; i++){
        int count=0;
        for(j=0;j < n; j++){
            for(k=j+1; k <= n; k++){
                if(dna[i][j] > dna[i][k]){
                    count++;
                    tempCount = count;
                }
            }
        }
        if(i != 0 && tempCount > count)
            dna[i].swap(dna[i-1]);
    }
}

I am having issues because I am not too familiar using 2D arrays to compare the letters in each string. When I try to output the array it ends up being blank, seg faults, or errors resulting from my use trying to swap the positions of the strings in the array.

Any help would be appreciated


Solution

  • Here you access the dna array out-of-bounds:

    for(j=0;j < n; j++){
        for(k=j+1; k <= n; k++){        // when k == n you have undefined behavior
            if(dna[i][j] > dna[i][k])
    

    it should be:

    for(j=0;j < n-1; j++){
        for(k=j+1; k < n; k++){
            if(dna[i][j] > dna[i][k])
    

    An alternative approach using misc. standard classes and algorithms, like std::vector and std::sort.

    #include <algorithm> // copy, sort
    #include <cstddef>   // size_t
    #include <iterator>  // istream_iterator, back_inserter
    #include <sstream>   // istringstream
    #include <string>    // string
    #include <tuple>     // tie
    #include <utility>   // swap
    #include <vector>    // vector
    
    #include <iostream>
    
    // count inversions in a sequence
    unsigned count_inversions(std::string sequence) {
        unsigned res = 0;
    
        // assuming "inversions" are defined as the number of swaps needed in bubblesort
        for(size_t i = 0; i < sequence.size() - 1; ++i) {
            for(size_t j = i + 1; j < sequence.size(); ++j) {
                if(sequence[j] < sequence[i]) {
                    std::swap(sequence[i], sequence[j]);
                    ++res;
                }
            }
        }
        return res;
    }
    
    // a class to store a sequence and its inversion count
    struct sequence_t {
        sequence_t() = default;
    
        explicit sequence_t(const std::string& Seq) :
            seq(Seq), inversions(count_inversions(seq)) {}
    
        // "less than" operator to compare two "sequence_t"s (used in std::sort)
        bool operator<(const sequence_t& rhs) const {
            // assuming lexicographical order if inversions are equal
            return std::tie(inversions, seq) < std::tie(rhs.inversions, rhs.seq);
        }
    
        std::string seq;
        unsigned inversions;
    };
    
    // read one sequence_t from an istream
    std::istream& operator>>(std::istream& is, sequence_t& s) {
        std::string tmp;
        if(is >> tmp) s = sequence_t(tmp);
        return is;
    }
    
    // read "sequence_t"s from an istream and put in a vector<sequence_t>
    auto read_sequences(std::istream& is) {
        std::vector<sequence_t> rv;
    
        std::copy(std::istream_iterator<sequence_t>(is),
                  std::istream_iterator<sequence_t>{}, std::back_inserter(rv));
        return rv;
    }
    
    int main() {
        std::istringstream input(
            "AACATGAAGG TTTTGGCCAA TTTGGCCAAA GATCAGATTT CCCGGGGGGA ATCGATGCAT");
    
        auto sequences = read_sequences(input);
    
        std::sort(sequences.begin(), sequences.end());
    
        // print result
        for(const auto& [seq, inversions] : sequences) {
            std::cout << seq << '(' << inversions << ')' << ' ';
        }
        std::cout << '\n';
    }
    

    Output (including the inversions):

    CCCGGGGGGA(2) AACATGAAGG(10) GATCAGATTT(10) ATCGATGCAT(11) TTTTGGCCAA(12) TTTGGCCAAA(15)