Tags: c++, file, operating-system, hamming-code

I wrote this Hamming Encoding code for class. Why is it so slow?


I wrote this for my OS class:

#include <iostream>
#include <fstream>

//encodes a file using the (8,4) Hamming Code.
//usage : HammingEncode.out < inputFile > outputFile 
int main() {
    unsigned char const codebook[] = {0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78, 0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF};
    unsigned char in, nextByte;
    unsigned char const leftMask = 0xF0, rightMask = 0x0F;

    in = std::cin.get();
    while (!std::cin.eof()) {
        nextByte = (in & leftMask) >> 4;
        std::cout << codebook[nextByte];
        nextByte = in & rightMask;
        std::cout << codebook[nextByte];
        in = std::cin.get();
    }
}

I then decided to test its speed (just to see) on the Old Testament from the King James Bible. This was our standard test file for a data structures class taught in Java, where we could sort it and Huffman-encode it in essentially no time, yet this program takes quite a long time to encode it. What is going on?


Solution

  • std::cin is opened in text mode, and as such it is constantly on the lookout for all kinds of things (like newlines, etc.).

    Given the constant character sniffing by the std::cin input stream, I'm not surprised it takes longer, but it does seem a little excessive. The following, which bypasses iostream and uses FILE streams directly, will probably do what you were expecting:

    #include <cstdlib>
    #include <cstdio>
    
    int main(int argc, char *argv[])
    {
        static unsigned char const codebook[] =
        {
            0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
            0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
        };
    
        for (int c = std::fgetc(stdin); c != EOF; c = std::fgetc(stdin))
        {
            std::fputc(codebook[c >> 4], stdout);   // high nibble
            std::fputc(codebook[c & 0x0F], stdout); // low nibble
        }
    
        return EXIT_SUCCESS;
    }
    

    I tested the exact code above on a 10 MB random file loaded with chars ranging from a to z, and the run times were ridiculously long when using std::cin and std::cout. Using the FILE streams directly, the difference was enormous. All of the code in this answer was tested with Apple LLVM version 5.1 (clang-503.0.38, based on LLVM 3.4svn) using -O3 optimization.
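
    If you want to reproduce the test: one way to generate such a file (my assumption; any method of producing ~10 MB of a-z characters works) is

    LC_ALL=C tr -dc 'a-z' < /dev/urandom | head -c 10485760 > bigfile.txt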

    Using FILE streams

    time ./hamming < bigfile.txt > bigfile.ham 
    real 0m1.855s
    user 0m1.812s
    sys  0m0.041s
    

    Using std::cin and std::cout

    time ./hamming < bigfile.txt > bigfile.ham
    real 0m23.819s
    user 0m7.416s
    sys  0m16.377s
    

    Using std::cin and std::cout with std::cout.sync_with_stdio(false);

    time ./hamming < bigfile.txt > bigfile.ham
    real 0m24.867s
    user 0m7.705s
    sys  0m17.118s
    

    In summary, ouch. Of note is the time spent in system. If I get a chance to update this with the std::istream::get() and put() methods I will, but honestly I don't expect any miracles on that. Unless there is some magic (to me, not to others) way of turning off io xlat from std::cin, the FILE streams may be a reasonable alternative. I haven't yet investigated whether slurping std::cin's rdbuf() is a viable option either, but it too may have promise; a sketch of that idea follows.
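
    A minimal sketch of what that rdbuf() slurp might look like (untested on my end, and it assumes the entire input fits comfortably in memory), trading memory for one bulk read and one bulk write:

    #include <iostream>
    #include <sstream>
    #include <string>
    #include <cstdlib>

    int main()
    {
        static unsigned char const codebook[] =
        {
            0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
            0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
        };

        // slurp the entire input through the stream buffer in one shot
        std::ostringstream oss;
        oss << std::cin.rdbuf();
        std::string const input = oss.str();

        // encode into a single output buffer, then write it once
        std::string output;
        output.reserve(input.size() * 2);
        for (unsigned char c : input)
        {
            output.push_back(static_cast<char>(codebook[c >> 4]));
            output.push_back(static_cast<char>(codebook[c & 0x0F]));
        }
        std::cout.write(output.data(), static_cast<std::streamsize>(output.size()));

        return EXIT_SUCCESS;
    }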


    Edit: Using std::istreambuf_iterator<char>

    Using the streambuf iterator class brings a significant improvement, as it essentially bypasses all the inline xlat junk, but it still isn't as efficient as the FILE streams:

    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <cstdlib>
    
    int main(int argc, char *argv[])
    {
        static unsigned char const codebook[] =
        {
            0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
            0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
        };
    
        std::istreambuf_iterator<char> cin_it(std::cin), cin_eof;
        std::for_each(cin_it, cin_eof, [](char c)
            {
              std::cout.put(static_cast<char>(codebook[static_cast<unsigned char>(c) >> 4]));
              std::cout.put(static_cast<char>(codebook[static_cast<unsigned char>(c) & 0x0F]));
            });
    
        return EXIT_SUCCESS;
    }
    

    Results:

    time ./hamming < bigfile.txt > bigfile.ham
    
    real 0m6.062s
    user 0m5.795s
    sys  0m0.053s
    

    Note that system time is now comparable to the FILE stream results, but the overhead from the rest of the iostream templates showing up in user time seems a sore point (though still better than the other iostream attempts). You win some, you lose some =P
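
    In the same spirit, a sketch (not benchmarked here) that drives both sides through the stream buffers directly with sbumpc() and sputc(), avoiding the sentry object that get() and put() construct on every call:

    #include <iostream>
    #include <streambuf>
    #include <cstdlib>

    int main()
    {
        static unsigned char const codebook[] =
        {
            0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
            0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
        };

        std::streambuf *inbuf  = std::cin.rdbuf();
        std::streambuf *outbuf = std::cout.rdbuf();

        // sbumpc() returns the current character and advances; no sentry, no flag checks
        for (int c = inbuf->sbumpc(); c != std::char_traits<char>::eof(); c = inbuf->sbumpc())
        {
            outbuf->sputc(static_cast<char>(codebook[c >> 4]));
            outbuf->sputc(static_cast<char>(codebook[c & 0x0F]));
        }

        return EXIT_SUCCESS;
    }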


    Edit: Unbuffered System IO

    In an effort to be completely fair, bypassing all runtime buffering and relying solely on the system calls to do this madness, the following is also worth noting:

    #include <cstdlib>
    #include <cstdio>
    #include <unistd.h>
    
    int main(int argc, char *argv[])
    {
        static unsigned char const codebook[] =
        {
            0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
            0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
        };
    
        unsigned char c;
        while (read(STDIN_FILENO, &c, 1) > 0)  // one system call per input byte
        {
            unsigned char duo[2] =
            {
                codebook[ c >> 4 ],
                codebook[ c & 0x0F ]
            };
            write(STDOUT_FILENO, duo, sizeof(duo));
        }
    
        return EXIT_SUCCESS;
    }
    

    The results, as you would expect, were dreadful:

    time ./hamming < bigfile.txt > bigfile.ham
    
    real 0m26.509s
    user 0m2.370s
    sys  0m24.087s
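
    Those numbers are a syscall-per-byte problem rather than a read()/write() problem per se; doing our own buffering (a sketch with an arbitrary 64 KiB buffer, not benchmarked here) should amortize the system-call cost back down:

    #include <cstdlib>
    #include <unistd.h>

    int main()
    {
        static unsigned char const codebook[] =
        {
            0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
            0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
        };

        static unsigned char inbuf[65536];   // arbitrary 64 KiB read buffer
        static unsigned char outbuf[131072]; // output is exactly twice the input

        ssize_t n;
        while ((n = read(STDIN_FILENO, inbuf, sizeof inbuf)) > 0)
        {
            for (ssize_t i = 0; i < n; ++i)
            {
                outbuf[2 * i]     = codebook[inbuf[i] >> 4];
                outbuf[2 * i + 1] = codebook[inbuf[i] & 0x0F];
            }
            // one write per chunk; a real program would also handle short writes
            write(STDOUT_FILENO, outbuf, static_cast<size_t>(2 * n));
        }

        return EXIT_SUCCESS;
    }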