compression pkzip

Why does PKZIP accept two passwords?


I'm working on this homework challenge, https://www.root-me.org/en/Challenges/Cryptanalysis/File-PKZIP, and wrote a script to crack the password:

import subprocess
from time import sleep

file = open('/home/begood/Downloads/SecLists-master/Passwords/'
            'rockyou-75.txt', 'r')
lines = file.readlines()
file.close()
for line in lines:
    command = 'unzip -P ' + line.strip() + ' /home/begood/Downloads/ch5.zip'
    print command
    p = subprocess.Popen(
        command,
        stdout=subprocess.PIPE, shell=True).communicate()[0]
    if 'replace' in p:

        print 'y\n'
    sleep(1)

It stops at the password scooter:

unzip -P scooter /home/begood/Downloads/ch5.zip
replace readme.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename:

But when I use that password to unzip the archive, it says:

inflating: /home/begood/readme.txt  
  error:  invalid compressed data to inflate

The real password is 14535. Why does PKZIP accept two passwords?


Solution

  • I presume that the encryption being used is the old, very weak, encryption that was part of the original PKZIP format.

    That encryption method has a 12-byte salt header before the compressed data. From the PKWare specification:

    After the header is decrypted, the last 1 or 2 bytes in Buffer should be the high-order word/byte of the CRC for the file being decrypted, stored in Intel low-byte/high-byte order. Versions of PKZIP prior to 2.0 used a 2 byte CRC check; a 1 byte CRC check is used on versions after 2.0. This can be used to test if the password supplied is correct or not.

    It was originally two bytes in the 1.0 specification, but in the 2.0 specification, and in the associated version of PKZIP, the check value was changed to one byte in order to make password searches like the one you are running more difficult. The result is that about one out of every 256 random passwords passes that first check. The program then goes on to decompress the incorrectly decrypted data, and only then runs into an error.

    So it's far, far more than two passwords that will be "accepted". However, it won't take many bytes of decompressed data to detect that the password was nevertheless incorrect. That is the "invalid compressed data to inflate" error you saw with scooter.
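
To make the one-in-256 figure concrete, here is a rough Python 3 sketch of the check-byte test in the traditional PKZIP encryption. The key constants and update rule follow the PKWARE specification as I understand it. The names `ZipCryptoKeys`, `passes_check_byte` and `demo` are mine, and the 12-byte header, the `0xAB` CRC byte and the candidate list of decimal strings are made up for illustration; nothing here is read from ch5.zip.

```python
def _make_crc_table():
    # Standard reflected CRC-32 table (polynomial 0xEDB88320).
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = _make_crc_table()


class ZipCryptoKeys:
    """The three 32-bit keys of the traditional PKZIP stream cipher."""

    def __init__(self, password: bytes):
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self._update(b)

    @staticmethod
    def _crc32(crc, byte):
        return CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)

    def _update(self, plain_byte):
        # The keys are always advanced with the *plaintext* byte.
        self.k0 = self._crc32(self.k0, plain_byte)
        self.k1 = (self.k1 + (self.k0 & 0xFF)) & 0xFFFFFFFF
        self.k1 = (self.k1 * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = self._crc32(self.k2, self.k1 >> 24)

    def _stream_byte(self):
        t = (self.k2 | 2) & 0xFFFF
        return ((t * (t ^ 1)) >> 8) & 0xFF

    def decrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for c in data:
            p = c ^ self._stream_byte()
            self._update(p)
            out.append(p)
        return bytes(out)

    def encrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for p in data:
            out.append(p ^ self._stream_byte())
            self._update(p)
        return bytes(out)


def passes_check_byte(password: bytes, enc_header: bytes, crc_high_byte: int) -> bool:
    """True if the last decrypted header byte matches the stored CRC byte."""
    header = ZipCryptoKeys(password).decrypt(enc_header)
    return header[11] == crc_high_byte


def demo():
    real_password = b"14535"
    crc_high_byte = 0xAB  # hypothetical value, not taken from a real archive
    # Hypothetical header: 11 filler bytes followed by the check byte.
    plain_header = bytes(range(11)) + bytes([crc_high_byte])
    enc_header = ZipCryptoKeys(real_password).encrypt(plain_header)

    candidates = [str(i).encode() for i in range(20000)]
    hits = [pw for pw in candidates
            if passes_check_byte(pw, enc_header, crc_high_byte)]
    return real_password in hits, len(hits), len(candidates)


if __name__ == "__main__":
    found, n_hits, n_total = demo()
    print("real password passes:", found)
    print("passwords passing the 1-byte check: %d of %d" % (n_hits, n_total))
```

The hit count should land near `n_total / 256`, and every one of those hits would get as far as the "replace" prompt in your script. If you would rather not shell out to unzip, Python 3's `zipfile.ZipFile.read(name, pwd=...)` does this same check and raises `RuntimeError` when it fails. As far as I know, a wrong password that slips past it is normally caught afterwards by a `zlib.error` or a bad-CRC `BadZipFile` while the data is being read, so treat any exception as "wrong password" and only accept a clean read.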