I have a set of files and a SHA256SUMS digest file that contains a sha256() hash for each of the files.
What's the best way to verify the integrity of my files with Python?
For example, here's how I would download the Debian 10 net installer SHA256SUMS digest file, then download and verify its MANIFEST file, in Bash:
user@host:~$ wget http://ftp.nl.debian.org/debian/dists/buster/main/installer-amd64/current/images/SHA256SUMS
--2020-08-25 02:11:20-- http://ftp.nl.debian.org/debian/dists/buster/main/installer-amd64/current/images/SHA256SUMS
Resolving ftp.nl.debian.org (ftp.nl.debian.org)... 130.89.149.21, 2001:67c:2564:a120::21
Connecting to ftp.nl.debian.org (ftp.nl.debian.org)|130.89.149.21|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75295 (74K)
Saving to: ‘SHA256SUMS’
SHA256SUMS 100%[===================>] 73.53K 71.7KB/s in 1.0s
2020-08-25 02:11:22 (71.7 KB/s) - ‘SHA256SUMS’ saved [75295/75295]
user@host:~$ wget http://ftp.nl.debian.org/debian/dists/buster/main/installer-amd64/current/images/MANIFEST
--2020-08-25 02:11:27-- http://ftp.nl.debian.org/debian/dists/buster/main/installer-amd64/current/images/MANIFEST
Resolving ftp.nl.debian.org (ftp.nl.debian.org)... 130.89.149.21, 2001:67c:2564:a120::21
Connecting to ftp.nl.debian.org (ftp.nl.debian.org)|130.89.149.21|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1709 (1.7K)
Saving to: ‘MANIFEST’
MANIFEST 100%[===================>] 1.67K --.-KB/s in 0s
2020-08-25 02:11:28 (128 MB/s) - ‘MANIFEST’ saved [1709/1709]
user@host:~$ sha256sum --check --ignore-missing SHA256SUMS
./MANIFEST: OK
user@host:~$
What is the best way to do this same operation (download the Debian 10 MANIFEST file and verify its integrity using the SHA256SUMS file) in Python?
The following Python script implements a function named integrity_is_ok() that takes the path to a SHA256SUMS file and a list of files to be verified. It returns False if any of the files couldn't be verified and True otherwise.
#!/usr/bin/env python3
from hashlib import sha256
import os

# Takes the path (as a string) to a SHA256SUMS file and a list of paths to
# local files. Returns True only if all files' checksums are present in the
# SHA256SUMS file and their checksums match
def integrity_is_ok( sha256sums_filepath, local_filepaths ):

    # first we parse the SHA256SUMS file and convert it into a dictionary
    # that maps each file's base name to its expected checksum
    sha256sums = dict()
    with open( sha256sums_filepath ) as fd:
        for line in fd:
            # sha256 hashes are exactly 64 hex characters long
            checksum = line[0:64]

            # there is one space followed by one metadata character between the
            # checksum and the filename in the `sha256sum` command output
            filename = os.path.split( line[66:] )[1].strip()
            sha256sums[filename] = checksum

    # now loop through each file that we were asked to check and confirm its
    # checksum matches what was listed in the SHA256SUMS file
    for local_file in local_filepaths:
        local_filename = os.path.split( local_file )[1]

        # hash the file in small chunks so it never has to fit in memory
        sha256sum = sha256()
        with open( local_file, 'rb' ) as fd:
            data_chunk = fd.read(1024)
            while data_chunk:
                sha256sum.update(data_chunk)
                data_chunk = fd.read(1024)

        checksum = sha256sum.hexdigest()
        # .get() returns None for a file that isn't listed, so a missing
        # entry fails the check instead of raising a KeyError
        if checksum != sha256sums.get( local_filename ):
            return False

    return True

if __name__ == '__main__':

    script_dir = os.path.split( os.path.realpath(__file__) )[0]
    sha256sums_filepath = script_dir + '/SHA256SUMS'
    local_filepaths = [ script_dir + '/MANIFEST' ]

    if integrity_is_ok( sha256sums_filepath, local_filepaths ):
        print( "INFO: Checksum OK" )
    else:
        print( "ERROR: Checksum Invalid" )
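One limitation of keying the dictionary by base name is that, if a SHA256SUMS file lists the same file name under more than one directory, later entries overwrite earlier ones. A possible variant is sketched below. It keys on the normalized relative path instead and reports a status per file, loosely like `sha256sum --check` does. The helper names (parse_sha256sums, file_sha256, check_files) and the status strings are my own placeholders, not part of any library, and the 64 KiB chunk size is an arbitrary choice.

```python
import hashlib
import os


def parse_sha256sums(text):
    """Map each listed path (normalized) to its lowercase hex digest."""
    sums = {}
    for line in text.splitlines():
        # Each entry is: 64 hex chars, a space, a mode character (' ' or '*'),
        # then the path. Anything shorter (e.g. a blank line) is skipped.
        if len(line) < 67:
            continue
        digest, path = line[:64], line[66:]
        sums[os.path.normpath(path)] = digest.lower()
    return sums


def file_sha256(path, chunk_size=65536):
    """Hash a file in fixed-size chunks so large files need little memory."""
    digest = hashlib.sha256()
    with open(path, 'rb') as fd:
        for chunk in iter(lambda: fd.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def check_files(sums, base_dir, relative_paths):
    """Return a dict mapping each path to 'OK', 'FAILED' or 'NOT LISTED'."""
    results = {}
    for rel in relative_paths:
        expected = sums.get(os.path.normpath(rel))
        if expected is None:
            # No entry for this path in the digest file
            results[rel] = 'NOT LISTED'
        elif file_sha256(os.path.join(base_dir, rel)) == expected:
            results[rel] = 'OK'
        else:
            results[rel] = 'FAILED'
    return results
```

Assuming SHA256SUMS and MANIFEST sit in the current directory, this could be called as `check_files(parse_sha256sums(open('SHA256SUMS').read()), '.', ['MANIFEST'])`. Treating an unlisted file as its own status, rather than silently passing it, keeps the "present in the SHA256SUMS file" requirement explicit.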
Here is an example execution of the integrity_is_ok() script, saved as sha256sums_python.py:
user@host:~$ wget http://ftp.nl.debian.org/debian/dists/buster/main/installer-amd64/current/images/SHA256SUMS
--2020-08-25 22:40:16-- http://ftp.nl.debian.org/debian/dists/buster/main/installer-amd64/current/images/SHA256SUMS
Resolving ftp.nl.debian.org (ftp.nl.debian.org)... 130.89.149.21, 2001:67c:2564:a120::21
Connecting to ftp.nl.debian.org (ftp.nl.debian.org)|130.89.149.21|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75295 (74K)
Saving to: ‘SHA256SUMS’
SHA256SUMS 100%[===================>] 73.53K 201KB/s in 0.4s
2020-08-25 22:40:17 (201 KB/s) - ‘SHA256SUMS’ saved [75295/75295]
user@host:~$ wget http://ftp.nl.debian.org/debian/dists/buster/main/installer-amd64/current/images/MANIFEST
--2020-08-25 22:40:32-- http://ftp.nl.debian.org/debian/dists/buster/main/installer-amd64/current/images/MANIFEST
Resolving ftp.nl.debian.org (ftp.nl.debian.org)... 130.89.149.21, 2001:67c:2564:a120::21
Connecting to ftp.nl.debian.org (ftp.nl.debian.org)|130.89.149.21|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1709 (1.7K)
Saving to: ‘MANIFEST’
MANIFEST 100%[===================>] 1.67K --.-KB/s in 0s
2020-08-25 22:40:32 (13.0 MB/s) - ‘MANIFEST’ saved [1709/1709]
user@host:~$ ./sha256sums_python.py
INFO: Checksum OK
user@host:~$
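The two wget steps in that session could also be done from Python, which would cover the "download" half of the question. The following is only a minimal sketch using urllib.request and shutil from the standard library. The names download() and download_all() are hypothetical helpers of my own, and the sketch does no retrying or error handling.

```python
import os
import shutil
import urllib.request


def download(url, dest_path):
    """Stream the resource at `url` into `dest_path` and return that path."""
    with urllib.request.urlopen(url) as response, open(dest_path, 'wb') as out:
        # copyfileobj copies in chunks, so the body is never held in memory
        shutil.copyfileobj(response, out)
    return dest_path


def download_all(base_url, names, dest_dir):
    """Download base_url + name for each name into dest_dir."""
    # base_url is expected to end with '/', names are plain file names
    return [download(base_url + name, os.path.join(dest_dir, name))
            for name in names]
```

With the mirror used above, the call would look like `download_all('http://ftp.nl.debian.org/debian/dists/buster/main/installer-amd64/current/images/', ['SHA256SUMS', 'MANIFEST'], '.')`, after which the two files can be passed to the verification function.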
Parts of the integrity_is_ok() script were adapted from the following answer on Ask Ubuntu: