python sha

Is there a way to prevent repetition of similar blocks of code in these hashing functions?


I created this program to calculate the sha256 or sha512 hash of a given file and convert the digest to hex.

It consists of 5 files: 4 custom modules and 1 main file.

I have two functions in different modules but the only difference in these functions is one variable. See below:

From sha256.py

def get_hash_sha256():
    global sha256_hash
    filename = input("Enter the file name: ")
    sha256_hash = hashlib.sha256()
    with open(filename, "rb") as f:
        for byte_block in iter(lambda: f.read(4096),b""):
            sha256_hash.update(byte_block)
#       print("sha256 valule: \n" + Color.GREEN + sha256_hash.hexdigest())
        print(Color.DARKCYAN + "sha256 value has been calculated")
        color_reset()

From sha512.py

def get_hash_sha512():
    global sha512_hash
    filename = input("Enter the file name: ")
    sha512_hash = hashlib.sha512()
    with open(filename, "rb") as f:
        for byte_block in iter(lambda: f.read(4096),b""):
            sha512_hash.update(byte_block)
#       print("sha512 valule: \n" + Color.GREEN + sha512_hash.hexdigest())
        print(Color.DARKCYAN + "sha512 value has been calculated")
        color_reset()

These functions are called in my simple_sha_find.py file:

def which_hash():
    sha256_or_sha512 = input("Which hash do you want to calculate: sha256 or sha512? \n")
    if sha256_or_sha512 == "sha256":
        get_hash_sha256()
        verify_checksum_sha256()
    elif sha256_or_sha512 == "sha512":
        get_hash_sha512()
        verify_checksum_sha512()
    else:
        print("Type either sha256 or sha512. If you type anything else the program will close...like this.")
        sys.exit()

if __name__ == "__main__":
    which_hash()

As you can see, the functions that are called depend on the user's input. If the user types sha256, the functions from sha256.py are triggered; if they type sha512, the functions from sha512.py are triggered.

The application works, but I know it could be less redundant; I just do not know how.

How can I define the get_hash_sha---() and verify_checksum_sha---() functions once and have them perform the appropriate calculations based on whether the user chooses sha256 or sha512?

I have performed a few variations of coding this program.

I have created it as one single file as well as creating different modules and calling functions from these modules.

In either case I've had the repetition, which I know tends to defeat the purpose of automation.


Solution

  • You could merge these two functions into a single one:

    import hashlib
    import sys
    
    # Color, color_reset() and verify_checksum() come from the asker's own modules.
    def get_hash(hash_type):
        if hash_type == 'sha256':
            hash_obj = hashlib.sha256()
        elif hash_type == 'sha512':
            hash_obj = hashlib.sha512()
        else:
            print("Invalid hash type. Please choose 'sha256' or 'sha512'.")
            return
    
        filename = input("Enter the file name: ")
        try:
            with open(filename, "rb") as f:
                # Read in 4096-byte blocks so large files are never loaded whole.
                for byte_block in iter(lambda: f.read(4096), b""):
                    hash_obj.update(byte_block)
            print(Color.DARKCYAN + f"{hash_type} value has been calculated")
            color_reset()
        except FileNotFoundError:
            print(f"File '{filename}' not found.")
    
    def which_hash():
        sha_type = input("Which hash do you want to calculate: sha256 or sha512? \n").lower()
        if sha_type in ['sha256', 'sha512']:
            get_hash(sha_type)
            verify_checksum(sha_type)
        else:
            print("Type sha256 or sha512. If you type anything else the program will close.")
            sys.exit()
    
    if __name__ == "__main__":
        which_hash()
    

    It's also good practice to use an Enum instead of plain strings:

    from enum import Enum
    
    class HashType(Enum):
        SHA256 = 'sha256'
        SHA512 = 'sha512'
    

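    So you could change the check in get_hash to the following (use hash_type.value wherever the name is printed, since formatting the Enum member itself gives "HashType.SHA256"):

    if hash_type == HashType.SHA256:
        hash_obj = hashlib.sha256()
    elif hash_type == HashType.SHA512:
        hash_obj = hashlib.sha512()
    
    and which_hash to:

    def which_hash():
        sha_type_input = input("Which hash do you want to calculate: sha256 or sha512? \n").lower()
        
        try:
            # HashType(...) raises ValueError for anything other than 'sha256' or 'sha512'.
            sha_type = HashType(sha_type_input)
            get_hash(sha_type)
            verify_checksum(sha_type)
        except ValueError:
            print("Type either sha256 or sha512. If you type anything else the program will close.")
            sys.exit()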
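
As a further option, the if/elif chain can be avoided altogether, because hashlib.new() builds a hash object from its name. The sketch below is only one possible arrangement, not the asker's actual code: hash_file and this version of verify_checksum are hypothetical helpers, and it assumes the file name and the expected checksum are passed in as arguments rather than read with input() inside the function. Returning the hex digest also removes the need for the global variables used in the question.

```python
import hashlib
import hmac


def hash_file(path, algorithm="sha256", chunk_size=4096):
    """Return the hex digest of the file at `path` (hypothetical helper)."""
    # hashlib.new() looks the algorithm up by name and raises ValueError
    # if the name is not supported, so one function covers sha256 and sha512.
    hash_obj = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in fixed-size blocks so large files are never loaded whole.
        for block in iter(lambda: f.read(chunk_size), b""):
            hash_obj.update(block)
    return hash_obj.hexdigest()


def verify_checksum(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest equals `expected_hex` (hypothetical helper)."""
    actual = hash_file(path, algorithm)
    # Normalise whitespace and case before comparing the two hex strings.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

With helpers like these, which_hash() would only need to validate the user's choice against ("sha256", "sha512"), ask for the file name, and pass both along; all printing and colouring can stay in the calling code.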