Tags: python, encryption, pycryptodome

Python Encryption - Reproducing Error: Data must be padded to 16 byte boundary in CBC mode


I'm having trouble reproducing the following error:

Data must be padded to 16 byte boundary in CBC mode

When I encrypt my data, I use the following padding:

BS = 16
pad = lambda s: s + (BS - len(s) % BS) * chr(BS - len(s) % BS)

and the encryption is done with the following code:

encrypt = base64.b64encode(iv + cipher.encrypt(pad(raw).encode('utf8'))).decode()

I have had no issues with the padding or encryption for a long time, but I recently received this error (only once) and I'm not sure how to reproduce it. Can the actual data being passed be the issue? What would trigger this error with 16-byte padding in place?


Solution

  • Padding must be performed on the bytes to be encrypted, not on the characters. With UTF-8 encoding, some characters are encoded to multiple bytes. For example, consider the two strings:

    s1 = chr(0x30)
    s2 = chr(0x80)
    

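    Both strings have length 1, but s1.encode('utf-8') is 1 byte long while s2.encode('utf-8') is 2 bytes long. Your pad function computes the padding from the character count, so it appends 15 one-byte characters to s2; after encoding, the result is 17 bytes, which is not a multiple of 16.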

    Here is a modified pad function that operates on bytes and is less cryptic. You can turn it into a one-liner if you want.

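    def pad(s: bytes) -> bytes:
        block_size = 16
        size_of_last_block = len(s) % block_size
        # Always in the range 1..16: a full final block gets a whole extra block of padding.
        padding_amount = block_size - size_of_last_block
        # Every padding byte holds the padding length.
        pad_bytes = bytes([padding_amount] * padding_amount)
        return s + pad_bytes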
    

    Note, however, that PyCryptodome already includes pad() and unpad() functions that should normally be used in preference to something home-grown. Example:

    import base64
    import secrets
    
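    # The Cryptodome namespace belongs to the pycryptodomex package;
    # with the pycryptodome package, import from Crypto instead.
    from Cryptodome.Cipher import AES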
    from Cryptodome.Util.Padding import pad, unpad
    
    
    def example():
        key = secrets.token_bytes(32)
        cipher = AES.new(key, AES.MODE_CBC)  # pycryptodome will generate the random IV
        pt = 'Hello World, the secret to success is: Python'.encode('utf-8')
        padded_pt = pad(pt, cipher.block_size, style='pkcs7')
        ct = cipher.encrypt(padded_pt)
        result = base64.b64encode(cipher.iv + ct).decode('utf-8')
        return result, key
    
    def decrypt(key: bytes, encrypted_blob_b64: str):
        encrypted_blob = base64.b64decode(encrypted_blob_b64)
        iv, ct = encrypted_blob[:AES.block_size], encrypted_blob[AES.block_size:]
        cipher = AES.new(key, mode=AES.MODE_CBC, iv=iv)
        padded_pt = cipher.decrypt(ct)
        pt = unpad(padded_pt, cipher.block_size, style='pkcs7')
        return pt.decode('utf-8')
    
    result, key = example()
    print(f'len = {len(result)}, result = {result}')
    print(decrypt(key, result))
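
To reproduce the error from the question, it should be enough to pass any string containing a non-ASCII character. The following is a minimal sketch using only the standard library; pad_chars and pad_bytes are illustrative names (not from the original code), and the sample strings are assumptions chosen for the demonstration. It compares padding computed from the character count with padding computed from the encoded byte count.

```python
BS = 16


def pad_chars(s: str) -> str:
    """The question's approach: padding length is derived from the character count."""
    n = BS - len(s) % BS
    return s + n * chr(n)


def pad_bytes(b: bytes) -> bytes:
    """Padding length is derived from the encoded byte count."""
    n = BS - len(b) % BS
    return b + bytes([n]) * n


ascii_text = "hello"
non_ascii_text = "h\u00e9llo"  # 'é' occupies two bytes in UTF-8

ok = pad_chars(ascii_text).encode("utf-8")
broken = pad_chars(non_ascii_text).encode("utf-8")
fixed = pad_bytes(non_ascii_text.encode("utf-8"))

print(len(ok) % BS)      # 0: pure ASCII input happens to work
print(len(broken) % BS)  # 1: one byte past the block boundary
print(len(fixed) % BS)   # 0: padding the bytes restores alignment
```

Passing a value like broken to cipher.encrypt() in CBC mode is what triggers "Data must be padded to 16 byte boundary in CBC mode". This would also explain why the error appeared only once: it depends on the input containing a character outside the ASCII range.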