The documents just don't seem to provide an answer.
Microsoft tried to explain the subject clearly, but it is still ambiguous, at least in our case.
We have an encrypted MP4 stream. It contains "SampleEncryptionBox"es or "PIFF" boxes, which contain 8-byte (64-bit) Initialization Vectors for encrypted blocks. But the actual "counter block" for decrypting the "AES-128 Counter Mode"-encrypted video data is 128 bits long, and I don't know where exactly to put the IV in it!
The PIFF document says that a 16-byte IV is the entire counter block (obviously) for AES-CTR mode. It also says that an 8-byte IV is put at the beginning of the counter block for AES-ECB mode (page 17). But for an 8-byte IV in AES-CTR mode, it says nothing!
This RFC document says that the 128-bit counter block should comprise a 4-byte nonce + 8-byte IV + 4-byte counter, and that the nonce value should be taken from the 4 extra bytes supplied along with the main 128-bit AES key. I can only obtain the 128-bit key from the Protection Header, so where should I get the 4-byte nonce?
Any bit of extra knowledge will be highly appreciated.
OK, I found the explanation. It is written clearly in the "ISO/IEC JTC 1/SC 29 N" document:
If the IV_size field is 8, then its value is copied to bytes 0 to 7 of the InitializationVector and bytes 8 to 15 of the InitializationVector are set to zero. The IV_size field shall not be 0 when the IsEncrypted flag is 0x1.
AES-ECB Mode has nothing to do with it.
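For anyone landing here, the rule quoted above can be sketched in a few lines of Python. This is only an illustration, not code from any of the specifications: the function name `build_counter_block` and its `block_index` parameter are made up for this example. It also assumes that bytes 8 to 15 then serve as a big-endian block counter that goes up by one per 16-byte AES block, which is the usual AES-CTR convention but is not stated in the passage quoted above.

```python
def build_counter_block(iv: bytes, block_index: int = 0) -> bytes:
    """Return a 16-byte AES-CTR counter block for an 8-byte IV.

    Hypothetical helper for illustration only.
    Bytes 0-7  : the 8-byte IV, copied as-is (per the quoted rule).
    Bytes 8-15 : zero for the first block; assumed here to be a
                 big-endian counter for subsequent 16-byte blocks.
    """
    if len(iv) != 8:
        raise ValueError("this sketch only handles 8-byte IVs")
    if not 0 <= block_index < 2**64:
        raise ValueError("block_index must fit in 64 bits")
    return iv + block_index.to_bytes(8, "big")


# Example with a made-up IV value.
iv = bytes.fromhex("0102030405060708")
first_block = build_counter_block(iv)
print(first_block.hex())
```

The resulting 16 bytes are what you would hand to your AES-CTR implementation as the initial counter value, together with the 128-bit key from the Protection Header. No separate 4-byte nonce is involved in this layout.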