python cryptography tink

How to pass in valid values into cleartext_keyset_json to create a Tink key


In Tink, it is possible to read and write cleartext keysets as JSON. A non-working example is shown below:

{
  "primaryKeyId": 2800579,
  "key": [
    {
      "keyData": {
        "typeUrl": "type.googleapis.com/google.crypto.tink.AesGcmKey",
        "value": "ODA9eJX9wcAGwZocL0Jym==",
        "keyMaterialType": "SYMMETRIC"
      },
      "status": "ENABLED",
      "keyId": 2800579,
      "outputPrefixType": "TINK"
    }
  ]
}

My question is: is it possible to insert your own values into the various key/value pairs to get another valid keyset? I have experimented with this without much success, mainly because of the "value" field, which fails with INVALID_ARGUMENT: Could not parse key_data.value as key type 'type.googleapis.com/google.crypto.tink.AesGcmKey'. Any idea what a valid "value" would be?


Solution

  • First of all, the Base64 string of the value field in the posted code snippet is invalid, possibly a copy/paste error.
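
    The problem can be seen without Tink: the posted string has 23 characters, whereas padded Base64 always has a length that is a multiple of 4. A minimal check using only Python's standard library, comparing it with the value from the generated keyset shown further below (the helper name is_valid_base64 is made up for illustration):

```python
import base64
import binascii

def is_valid_base64(text: str) -> bool:
    """Return True if text is well-formed, padded standard Base64."""
    try:
        base64.b64decode(text, validate=True)
        return True
    except binascii.Error:
        return False

posted_value = "ODA9eJX9wcAGwZocL0Jym=="  # value from the question
generated_value = "GiD5ojApaIM2MRpPhGf5sVMhxeA6NE5KjdzUxsJ0ChH/JA=="  # value written by Tink

# Print the length and the validity of each string
print(len(posted_value), is_valid_base64(posted_value))
print(len(generated_value), is_valid_base64(generated_value))
```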

    The following Python code uses Tink version 1.5.0 and creates and displays a keyset for AES-256/GCM as JSON:

    import io
    from tink import aead
    from tink import tink_config
    from tink import JsonKeysetWriter
    from tink import new_keyset_handle
    from tink import cleartext_keyset_handle
    
    # Register all key managers shipped with Tink
    tink_config.register()
    
    # Generate a fresh keyset containing a single AES-256/GCM key
    key_template = aead.aead_key_templates.AES256_GCM
    keyset_handle = new_keyset_handle(key_template)
    
    # Serialize the keyset as cleartext JSON into an in-memory buffer
    string_out = io.StringIO()
    writer = JsonKeysetWriter(string_out)
    cleartext_keyset_handle.write(writer, keyset_handle)
    
    serialized_keyset = string_out.getvalue()
    print(serialized_keyset)
    

    The result resembles the keyset you posted, e.g.:

    {
      "primaryKeyId": 1794775293,
      "key": [
        {
          "keyData": {
            "typeUrl": "type.googleapis.com/google.crypto.tink.AesGcmKey",
            "value": "GiD5ojApaIM2MRpPhGf5sVMhxeA6NE5KjdzUxsJ0ChH/JA==",
            "keyMaterialType": "SYMMETRIC"
          },
          "status": "ENABLED",
          "keyId": 1794775293,
          "outputPrefixType": "TINK"
        }
      ]
    }   
    

    I haven't found documentation that describes the structure in general or the value field in particular, but comparing the keysets generated for different algorithms allows some conclusions. Base64 decoding the value and hex encoding the result gives:

    1a20f9a23029688336311a4f8467f9b15321c5e03a344e4a8ddcd4c6c2740a11ff24
    

    For AES-256/GCM the value has 34 bytes, of which the last 32 bytes are the actual key. The first byte is characteristic of the algorithm and the second byte gives the key size in bytes, so the value starts with 0x1a10 for AES-128/GCM, 0x1a20 for AES-256/GCM or 0x1220 for ChaCha20Poly1305 (the structure can be more complex for other algorithms).
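
    This reading of the value field can be checked with a few lines of standard-library Python. The sketch below decodes the value from the generated keyset above and splits it into the two leading bytes and the key. The interpretation of those two bytes follows the comparison described here, not any official documentation.

```python
import base64

# "value" field copied from the generated AES-256/GCM keyset
value = "GiD5ojApaIM2MRpPhGf5sVMhxeA6NE5KjdzUxsJ0ChH/JA=="
raw = base64.b64decode(value)

# The first two bytes are the prefix, the remainder is the raw AES key
prefix, key = raw[:2], raw[2:]

print(raw.hex())             # hex form of the whole value
print(len(raw))              # total length in bytes
print(prefix.hex())          # algorithm marker and key size
print(prefix[1], len(key))   # declared key size and actual key length
```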

    To use a self-defined key for AES-256/GCM, e.g.

    000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f 
    

    prepend 0x1a20 and Base64 encode the result:

    GiAAAQIDBAUGBwgJCgsMDQ4PEBESExQVFhcYGRobHB0eHw==
    

    and apply this value instead of the old value in the above KeySet.
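
    These steps can be scripted. The following is a minimal sketch for AES-256/GCM only, using just the standard library. The helper name aes256_gcm_value is made up here, and the fixed 0x1a20 prefix comes from the observation above, so it should not be reused for other key types or key sizes.

```python
import base64
import json

def aes256_gcm_value(key: bytes) -> str:
    """Prepend the 0x1a20 prefix to a 32-byte key and Base64 encode it."""
    if len(key) != 32:
        raise ValueError("AES-256 requires a 32-byte key")
    return base64.b64encode(bytes.fromhex("1a20") + key).decode("ascii")

# Self-defined example key from the text
key = bytes.fromhex(
    "000102030405060708090a0b0c0d0e0f"
    "101112131415161718191a1b1c1d1e1f"
)
key_id = 1794775293  # key ID reused from the generated keyset

# Same structure as the generated keyset, with only the value replaced
keyset = {
    "primaryKeyId": key_id,
    "key": [{
        "keyData": {
            "typeUrl": "type.googleapis.com/google.crypto.tink.AesGcmKey",
            "value": aes256_gcm_value(key),
            "keyMaterialType": "SYMMETRIC",
        },
        "status": "ENABLED",
        "keyId": key_id,
        "outputPrefixType": "TINK",
    }],
}
serialized_keyset = json.dumps(keyset, indent=2)
print(serialized_keyset)
```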

    The modified KeySet can be loaded and used for encryption as follows:

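    from tink import aead
    from tink import tink_config
    from tink import JsonKeysetReader
    from tink import cleartext_keyset_handle
    
    # Register the key managers so that the AesGcmKey type can be parsed
    tink_config.register()
    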
    serialized_keyset = '''
    {
      "primaryKeyId": 1794775293,
      "key": [
        {
          "keyData": {
            "typeUrl": "type.googleapis.com/google.crypto.tink.AesGcmKey",
            "value": "GiAAAQIDBAUGBwgJCgsMDQ4PEBESExQVFhcYGRobHB0eHw==",
            "keyMaterialType": "SYMMETRIC"
          },
          "status": "ENABLED",
          "keyId": 1794775293,
          "outputPrefixType": "TINK"
        }
      ]
    }   
    '''
    reader = JsonKeysetReader(serialized_keyset)
    keyset_handle = cleartext_keyset_handle.read(reader)
    
    plaintext = b'The quick brown fox jumps over the lazy dog'
    aead_primitive = keyset_handle.primitive(aead.Aead)
    tink_ciphertext = aead_primitive.encrypt(plaintext, b'')
    

    The relationship between KeySet and the example key 0001...1e1f can be verified by decrypting the generated ciphertext using the example key without Tink, e.g. with PyCryptodome.

    The format of the Tink ciphertext is described in Tink Wire Format, Crypto Formats. The first byte specifies the version, the next 4 bytes the key ID, followed by the actual data.
    For GCM the actual data has the format nonce (12 bytes) || ciphertext || tag (16 bytes). Decryption is then possible with (using PyCryptodome):

    from Crypto.Cipher import AES
    
    key = bytes.fromhex('000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f')
    
    # Tink prefix: 1 version byte followed by the 4-byte key ID
    prefix = tink_ciphertext[:5]
    nonce = tink_ciphertext[5:5 + 12]
    ciphertext = tink_ciphertext[5 + 12:-16]
    tag = tink_ciphertext[-16:]
    
    cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
    cipher.update(b'')  # empty associated data, as passed to encrypt() above
    decrypted_text = cipher.decrypt_and_verify(ciphertext, tag)
    
    print(decrypted_text.decode('utf-8')) # The quick brown fox jumps over the lazy dog
    

    which proves that the example key 0001...1e1f was correctly integrated into the KeySet.
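
    The slicing used above can also be wrapped in a small helper that parses the prefix. This is a sketch under the layout described above (1 version byte, 4-byte key ID, 12-byte nonce, 16-byte tag). The function name split_tink_aes_gcm is made up, the key ID is assumed to be stored big-endian, and the sample input consists of placeholder bytes, not a real ciphertext.

```python
import struct

def split_tink_aes_gcm(tink_ciphertext: bytes):
    """Split a TINK-prefixed AES-GCM ciphertext into its parts."""
    if len(tink_ciphertext) < 5 + 12 + 16:
        raise ValueError("ciphertext too short")
    # 1 version byte followed by a 4-byte key ID (assumed big-endian)
    version, key_id = struct.unpack(">BI", tink_ciphertext[:5])
    nonce = tink_ciphertext[5:17]
    ciphertext = tink_ciphertext[17:-16]
    tag = tink_ciphertext[-16:]
    return version, key_id, nonce, ciphertext, tag

# Placeholder input: realistic prefix, dummy nonce, ciphertext and tag
sample = (b"\x01" + (1794775293).to_bytes(4, "big")
          + bytes(12) + b"dummy" + bytes(16))
version, key_id, nonce, ciphertext, tag = split_tink_aes_gcm(sample)
print(version, key_id, len(nonce), ciphertext, len(tag))
```

    Comparing the parsed key ID with the keyId in the keyset is a quick way to confirm which key produced a given ciphertext before attempting decryption.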