mysql security encryption pii

Protect PII in web app database by encrypting with public key paired with private key protected by users' own passwords?


Goal:

I'd like to allow users to create questions and collect information from other users in a custom web app (PHP/MySQL in a shared hosting environment) and protect the data collected.

Background:

The default questions that all users answer are general enough that they cannot be construed as personally identifiable information (PII), which limits my liability to protect them. Users who create their own questions, however, will likely ask for PII, which then becomes a liability.

What I would like to do is protect this information in such a way that if either the hosting account or the database were compromised (or both!), the PII would not be recoverable without a significant amount of work, and even then, only a small portion would theoretically be recoverable.

Proposed solution:

Assuming MySQL's built-in AES_ENCRYPT()/AES_DECRYPT() functions are used to encrypt the PII table, the passphrase would need to be stored in the hosting account, so if the hosting account were compromised, the data could easily be read.

Since the users' passwords are well protected (hashed with salt), I'm thinking of capturing their plaintext password during authentication, encrypting it, and storing it in the PHP session until the user logs out.

A public/private key combo will be created for each user with the private key being password protected with the user's password + salt.

Then, when PII data based on that user's custom questions is added to the DB, the user's public key would be used to encrypt the PII that they collected through the app. When the data is read (only when the user is logged in), it would be decrypted with the user's private key (which is unlocked with their password + salt).

The benefits I see are:

  1. in the worst-case scenario, where the servers are completely compromised, the app code is read to find encryption keys, PHP session files are decrypted to find users' passwords, and entries in the PII table associated with those users are decrypted, only the PII collected from questions of currently logged-in users could be recovered. Any users not logged in would be safe.
  2. even the DBA or similar wouldn't be able to read the PII.

The drawbacks I see are:

  1. user passwords are stored in a recoverable form while they are logged in.
  2. users who forget their passwords would lose access to their data.
  3. each relatively small bit of data will take up much more space in the DB due to encryption.

My question: Is there a better way to do this?


Solution

  • I see a number of problems with this design from a security perspective. First of all, passwords must never be stored in a recoverable (encrypted) form; this is the vulnerability identified by CWE-257.

    Furthermore, MySQL's AES_ENCRYPT() is complete garbage for more than one reason. It uses ECB mode, in which identical plaintext blocks always encrypt to identical ciphertext blocks, so patterns in the data survive encryption. Here is a good example of why this is crap:

    Original image:

    [image]

    ECB mode (which is what MySQL's AES_ENCRYPT() uses):

    [image]

    But if the database were compromised, the attacker is going to defeat AES_ENCRYPT() by enabling the query log: the key is passed as an argument in the SQL statement, so it would show up in the log.
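
    To make the ECB pattern leak described above concrete, here is a minimal sketch in Python using only the standard library. Python has no built-in AES, so toy_block() is a hypothetical stand-in (a truncated HMAC) that only mimics the one property that matters here: under a fixed key, equal input blocks give equal output blocks. It is not real encryption and cannot be decrypted. The function names ecb_like() and cbc_like() are likewise made up for this illustration.

```python
import hashlib
import hmac
import os

BLOCK = 16  # AES block size in bytes


def toy_block(key, block):
    """Stand-in for a block cipher: a keyed, deterministic 16-byte function.

    NOT real encryption (it cannot be reversed). It only mimics the fact
    that a block cipher maps equal inputs to equal outputs under one key.
    """
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ecb_like(key, data):
    # ECB structure: every block is transformed independently.
    return [toy_block(key, data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]


def cbc_like(key, iv, data):
    # CBC structure: each block is XORed with the previous output
    # (or the IV for the first block) before being transformed.
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        mixed = bytes(a ^ b for a, b in zip(data[i:i + BLOCK], prev))
        prev = toy_block(key, mixed)
        out.append(prev)
    return out


key = os.urandom(16)
message = b"A" * BLOCK * 3  # three identical plaintext blocks

ecb_blocks = ecb_like(key, message)
cbc_blocks = cbc_like(key, os.urandom(BLOCK), message)

# Count distinct output blocks: ECB collapses the repeats into one value,
# while chaining with a random IV makes the three outputs differ.
print("distinct ECB-style blocks:", len(set(ecb_blocks)))
print("distinct CBC-style blocks:", len(set(cbc_blocks)))
```

    An attacker who sees the ECB-style output learns that the three blocks were identical without knowing the key, which is the same effect the image comparison shows.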

    Using the user's password for encryption should be avoided; you should be using a cryptographic nonce. If you do use a password, make sure you use a String2Key function. You must also use CBC or CMAC mode with a random IV. I don't really see how asymmetric cryptography can help. Asymmetric cryptography is very slow and memory intensive. The data that it protects is made less secure when the attacker controls the message, because ciphertext messages can be compared. This is why a random IV is important, and in the asymmetric world you don't have this level of protection.

    Key generation should look something like: $key = string2key($base_nonce . $salt . $user_password)

    Make sure the output of your string2key function is the same size as your key. So AES-128 needs a 128-bit key. Each password should have its own $salt, and $base_nonce is a cryptographic nonce stored in a text file. (An attacker would have to read this file before he can crack the key; if this value is large, like 128 bits, then it's a moot point.) Each message needs its own $iv, and this value must also be a cryptographic nonce (similar to a salt). I would generate the $salt, $iv and $base_nonce from /dev/urandom. The IV can be stored in plain text in a column in your database along with the ciphertext.
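
    The key generation described above could be sketched as follows, in Python with the standard library only. The answer does not name a specific string2key function, so PBKDF2-HMAC-SHA256, the iteration count, and the 16-byte nonce sizes are all assumptions made for this illustration. In PHP, hash_pbkdf2() and random_bytes() play the same roles as the calls used here.

```python
import hashlib
import os

KEY_BYTES = 16        # AES-128 needs a 128-bit key
ITERATIONS = 100_000  # assumed work factor, tune for your hardware


def string2key(base_nonce, salt, password):
    """Derive a fixed-size key from a password.

    PBKDF2 is an assumed choice of string2key function. The base nonce
    and the per-user salt are concatenated to form the KDF salt, which
    mirrors $base_nonce . $salt in the formula above.
    """
    return hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        base_nonce + salt,
        ITERATIONS,
        dklen=KEY_BYTES,
    )


# os.urandom reads from the OS CSPRNG (/dev/urandom on Linux).
base_nonce = os.urandom(16)  # one per application, kept in a file outside the DB
salt = os.urandom(16)        # one per user password
iv = os.urandom(16)          # one per encrypted message, stored next to the ciphertext

key = string2key(base_nonce, salt, "correct horse battery staple")
same_key = string2key(base_nonce, salt, "correct horse battery staple")
other_key = string2key(base_nonce, os.urandom(16), "correct horse battery staple")

print(len(key) * 8, "bit key")
```

    The same password, salt and base nonce always give the same key, so nothing but the salt and IV needs to be stored per user and per message; a different salt gives an unrelated key even for an identical password.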

    From a legal standpoint, even if you build a secure cryptographic system, you still have problems with insider threats, and if the server is completely compromised, all of the data will still be compromised. This really isn't an engineering problem.

    The best defense against a legal threat is a strong Terms and Conditions written by a skilled lawyer.