c++ security obfuscation defensive-programming

Techniques for obscuring sensitive strings in C++


I need to store sensitive information (a symmetric encryption key that I want to keep private) in my C++ application. The simple approach is to do this:

std::string myKey = "mysupersupersecretpasswordthatyouwillneverguess";

However, running the application through the strings utility (or any other tool that extracts strings from a binary) will reveal the above string.

What techniques should be used to obscure such sensitive data?

Edit:

OK, so pretty much all of you have said "your executable can be reverse engineered" - of course! This is a pet peeve of mine, so I'm going to rant a bit here:

Why is it that 99% (OK, so perhaps I exaggerate a little) of all security-related questions on this site are answered with a torrent of "there is no possible way to create a perfectly secure program" - that is not a helpful answer! Security is a sliding scale between perfect usability and no security at one end, and perfect security but no usability at the other.

The point is that you pick your position on that sliding scale depending on what you're trying to do and the environment in which your software will run. I'm not writing an app for a military installation, I'm writing an app for a home PC. I need to encrypt data across an untrusted network with a pre-known encryption key. In these cases, "security through obscurity" is probably good enough! Sure, someone with enough time, energy and skill could reverse-engineer the binary and find the password, but guess what? I don't care:

The time it takes me to implement a top-notch secure system is more expensive than the loss of sales due to the cracked versions (not that I'm actually selling this, but you get my point). This blue-sky "let's do it the absolute best way possible" trend in programming amongst new programmers is foolish to say the least.

Thank you for taking the time to answer this question - the answers were most helpful. Unfortunately I can only accept one answer, but I've up-voted all the useful answers.


Solution

  • Basically, anyone with access to your program and a debugger can and will find the key in the application if they want to.

    But if you just want to make sure the key doesn't show up when running strings on your binary, you could, for instance, make sure that the key is not stored as a run of printable characters.

    Obscuring key with XOR

    For instance, you could use XOR to split the key into two byte arrays:

    key = key1 XOR key2
    

    If you create key1 with the same byte-length as key you can use (completely) random byte values and then compute key2:

    key1[n] = crypto_grade_random_number(0..255)
    key2[n] = key[n] XOR key1[n]
    

    You can do this in your build environment, and then store only key1 and key2 in your application, XORing them together at run time to recover the key.
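
    The split and recombine steps above could be sketched in C++ roughly as follows. The names split_key and recombine_key are hypothetical, and std::random_device is only a stand-in for the "crypto grade" random source mentioned above (the standard does not guarantee it is cryptographically secure), so treat this as an illustration under those assumptions rather than a finished implementation.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <random>
    #include <string>
    #include <utility>
    #include <vector>

    using Bytes = std::vector<std::uint8_t>;

    // Build-time step (hypothetical helper): split `key` into two shares
    // such that key[n] == key1[n] XOR key2[n] for every n.
    // ASSUMPTION: std::random_device stands in for a real CSPRNG here.
    std::pair<Bytes, Bytes> split_key(const std::string& key) {
        std::random_device rd;
        Bytes key1(key.size());
        Bytes key2(key.size());
        for (std::size_t n = 0; n < key.size(); ++n) {
            key1[n] = static_cast<std::uint8_t>(rd() & 0xFFu);
            key2[n] = static_cast<std::uint8_t>(
                static_cast<std::uint8_t>(key[n]) ^ key1[n]);
        }
        return {key1, key2};
    }

    // Run-time step (hypothetical helper): XOR the two shares back together.
    std::string recombine_key(const Bytes& key1, const Bytes& key2) {
        assert(key1.size() == key2.size());
        std::string key(key1.size(), '\0');
        for (std::size_t n = 0; n < key1.size(); ++n) {
            key[n] = static_cast<char>(key1[n] ^ key2[n]);
        }
        return key;
    }
    ```

    In practice only the build tool would call split_key; the shipped program would contain just the two generated byte arrays plus recombine_key. Note that random bytes can still happen to include short printable runs, and the recovered key is in plain form in memory once recombined, so this only defeats casual inspection with strings.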

    Protecting your binary

    Another approach is to use a tool to protect your binary. For instance, there are several security tools that obfuscate your binary and run it inside a virtual machine of their own. This makes it hard(er) to debug, and is also the conventional way many commercial-grade secure applications (and, alas, malware) are protected.

    One of the premier tools is Themida, which does an awesome job of protecting your binaries. It is often used by well-known programs, such as Spotify, to protect against reverse engineering. It has features to prevent debugging in programs such as OllyDbg and IDA Pro.

    There is also a larger list, maybe somewhat outdated, of tools to protect your binary.
    Some of them are free.

    Password matching

    Someone here discussed hashing password+salt.

    If you need to store the key to match it against some kind of user-submitted password, you should use a one-way hashing function, preferably by combining username, password and a salt. The problem with this, though, is that your application has to know the salt to be able to compute the one-way hash and compare the resulting hashes. So you still need to store the salt somewhere in your application. But, as @Edward points out in the comments below, this will effectively protect against a dictionary attack using, e.g., rainbow tables.

    Finally, you can use a combination of all the techniques above.