[SOLVED] OpenSSL aes-256-gcm tag

OpenSSL aes-256-gcm tag_length and aad?

In OpenSSL encryption in PHP using aes-256-gcm, is the tag_length a value that the coder chooses, or is it chosen by the method and returned by reference, as is done with the tag? The documentation says it can be between 4 and 16. Is the tag_length passed to the encryption function guaranteed to be the length of the tag returned?

Also, how is the aad ("Additional authenticated data") used and why would it be used?

Thank you.

Solution

GCM is an authenticated encryption mode and provides both confidentiality and authenticity. For the latter, an authentication tag is used.

The assurance of authenticity means that the data (ciphertext, IV, AAD) cannot be changed without this being noticed (by the authenticity check via the authentication tag during decryption).

The authentication tag is generated automatically during encryption and is returned by reference in $tag. The tag length is chosen by the caller via $tag_length and defaults to 16 bytes, see openssl_encrypt(); the returned tag has exactly the requested length. According to the GCM specification (see NIST SP 800-38D, sec. 5.2.1.2 Output Data), tag sizes of 16, 15, 14, 13, 12 and in special cases 8 and 4 bytes are permitted (deviating from this, PHP/OpenSSL supports all sizes between 4 and 16 bytes). The longer the tag, the lower the probability that a forged tag is accepted, and thus the greater the security.
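The relationship between $tag_length and the returned tag can be checked directly. The following is a minimal sketch, assuming PHP 7.1 or later with the OpenSSL extension; the key and IV are random demo values and the $lengths array is only there for illustration:

```php
<?php
// Demo values only: real code needs a secret key and a unique IV per encryption.
$key = random_bytes(32); // 32 bytes key for aes-256-gcm
$iv  = random_bytes(12); // 12 bytes IV

$lengths = [];
foreach ([4, 8, 12, 16] as $tagLength) {
    $tag = null;
    // The last argument is the requested tag length in bytes.
    openssl_encrypt('my secret data', 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag, '', $tagLength);
    $lengths[$tagLength] = strlen($tag);
    print("requested: $tagLength bytes, returned: " . strlen($tag) . ' bytes' . PHP_EOL);
}
```

In each iteration the length of $tag equals the requested $tag_length.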

During decryption, the tag must be specified, see openssl_decrypt(). Neither the tag nor the IV is secret, so both are passed to the decrypting side along with the ciphertext, usually concatenated, e.g. IV|ciphertext|tag.
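The IV|ciphertext|tag layout could be implemented roughly as follows. This is a sketch under stated assumptions: encrypt_packed() and decrypt_packed() are hypothetical helper names, and the 12 bytes IV and 16 bytes tag are fixed values that both sides have agreed on. Fixing the tag length on the decrypting side matters because, according to the PHP manual, openssl_decrypt() does not check the length of the tag itself.

```php
<?php
const IV_LEN  = 12; // IV length used in this sketch
const TAG_LEN = 16; // fixed tag length agreed on by both sides

// Hypothetical helper: returns IV|ciphertext|tag as one binary string.
function encrypt_packed(string $plaintext, string $key, string $aad = ''): string
{
    $iv = random_bytes(IV_LEN); // fresh random IV for every encryption
    $ct = openssl_encrypt($plaintext, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag, $aad, TAG_LEN);
    if ($ct === false) {
        throw new RuntimeException('encryption failed');
    }
    return $iv . $ct . $tag;
}

// Hypothetical helper: splits IV|ciphertext|tag again.
// Returns the plaintext, or false if the input is too short or authentication fails.
function decrypt_packed(string $packed, string $key, string $aad = '')
{
    if (strlen($packed) < IV_LEN + TAG_LEN) {
        return false; // too short to contain IV and tag
    }
    $iv  = substr($packed, 0, IV_LEN);
    $tag = substr($packed, -TAG_LEN);
    $ct  = substr($packed, IV_LEN, -TAG_LEN);
    return openssl_decrypt($ct, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag, $aad);
}

$key    = random_bytes(32); // demo key
$packed = encrypt_packed('my secret data', $key, 'my aad');
$plain  = decrypt_packed($packed, $key, 'my aad');
print('Decrypted: ' . $plain . PHP_EOL);
```

Since OPENSSL_RAW_DATA is used, $packed is binary; it can be Base64 encoded with base64_encode() if it has to be stored or transmitted as text.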

AAD (additional authenticated data) is data that is authenticated but not encrypted (this could be any information for which you want to ensure that it is not changed, but which is not secret and therefore does not need to be encrypted).

Example (GCM encryption/decryption with 12 bytes tag and with AAD):

$key = '0123456789012345'; // 16 bytes key for AES-128 (demo value only)
$iv = '012345678901'; // 12 bytes IV (demo value only, must differ for each encryption with the same key)
$ct = openssl_encrypt('my secret data', 'aes-128-gcm', $key, 0, $iv, $tag, "my aad", 12); // 12 bytes tag
print('Tag: ' . bin2hex($tag) . PHP_EOL); // Tag: 095c111ecb13b0d411878dfd
$dt = openssl_decrypt($ct, 'aes-128-gcm', $key, 0, $iv, $tag, "my aad");
print('Decrypted: ' . $dt); // Decrypted: my secret data

Edit - Regarding the questions from the comments:

Why is the tag not secret?

The authentication tag (or MAC) is ultimately a kind of checksum that is generated during encryption from the ciphertext, IV and AAD using a specific algorithm and the key. During decryption, the tag is recalculated and compared with the tag from encryption (see NIST SP 800-38D, sec. 7 GCM Specification).
If the tags from encryption and decryption are identical, authentication is successful. In contrast, if an attacker has manipulated the data (ciphertext and/or IV and/or AAD), a different tag is generated during decryption which reveals the manipulation of the data.
The tag does not need to be secret: without the key, an attacker cannot compute a valid tag for modified data. Therefore, the tag can be passed along with the ciphertext to the decrypting side.
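The effect of a manipulation can be illustrated as follows. This is a minimal sketch with random demo values for key and IV; the variable names are chosen for illustration only. Each modified input (ciphertext, IV, AAD) makes openssl_decrypt() return false, while the unmodified input decrypts successfully:

```php
<?php
$key = random_bytes(32); // demo key
$iv  = random_bytes(12); // demo IV
$aad = 'my aad';

$ct = openssl_encrypt('my secret data', 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag, $aad);

// Unmodified data: the recalculated tag matches and the plaintext is returned.
$ok = openssl_decrypt($ct, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag, $aad);

// Flip one bit in the first byte of the ciphertext and of the IV.
$badCt = $ct;
$badCt[0] = chr(ord($badCt[0]) ^ 1);
$badIv = $iv;
$badIv[0] = chr(ord($badIv[0]) ^ 1);

// One manipulation at a time; the tag itself stays untouched.
$r1 = openssl_decrypt($badCt, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag, $aad);
$r2 = openssl_decrypt($ct, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $badIv, $tag, $aad);
$r3 = openssl_decrypt($ct, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag, 'other aad');

var_dump($ok, $r1, $r2, $r3);
```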

Why is the IV not secret, and why is it random?

The purpose of the IV is to prevent encryptions of identical plaintexts from generating identical ciphertexts (ECB mode, which has no IV, lacks this property and is therefore not semantically secure). This means that a different key/IV pair must be used for each encryption.
With a fixed key (which is the normal case), a different IV must therefore be used for each encryption. Otherwise, as already mentioned, identical plaintexts generate identical ciphertexts, and depending on the mode, more or less serious vulnerabilities can arise (for instance, the reuse of a key/IV pair in CTR-based modes such as GCM is a serious vulnerability). Generating a random IV for each encryption is a common way to obtain a different key/IV pair each time.
Since the IV is required for decryption, the algorithms are designed so that the IV does not have to be secret, or, in other words, a known IV does not reduce the security of the algorithm. Therefore, the IV can be passed to the decryption side along with the ciphertext.
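Both points can be made visible with a short experiment. This is a sketch with a random demo key; the second part deliberately reuses a key/IV pair, which must never be done in real code, and the plaintexts are arbitrary illustration values of equal length:

```php
<?php
$key = random_bytes(32); // demo key

// Fresh random IV per encryption: identical plaintexts give different ciphertexts.
$iv1 = random_bytes(12);
$iv2 = random_bytes(12);
$c1 = openssl_encrypt('my secret data', 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv1, $tag1);
$c2 = openssl_encrypt('my secret data', 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv2, $tag2);
$differs = ($c1 !== $c2);

// Reused key/IV pair (insecure, for demonstration only): both messages are
// XORed with the same key stream, so the XOR of the ciphertexts equals the
// XOR of the plaintexts, even though the attacker does not know the key.
$p1 = 'attack at dawn';
$p2 = 'retreat at six';
$r1 = openssl_encrypt($p1, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv1, $t1);
$r2 = openssl_encrypt($p2, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv1, $t2);
$leaks = (($r1 ^ $r2) === ($p1 ^ $p2)); // ^ on two strings is a bytewise XOR

var_dump($differs, $leaks);
```

In the second part, anyone who knows one of the two plaintexts can recover the other from the two ciphertexts alone.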