hash

Are the hashes of the empty string and the null character not the same?


I'm using sha256 from Python's hashlib to compare the hashes of two inputs.

My hypothesis was that a null character and an empty string would give the same hash.

Here's my code.

from hashlib import sha256
print(sha256(b'\x00').hexdigest(),end='\n\n')
print(sha256(b'').hexdigest())

And it gave this output:

6e340b9cffb37a989ca544e6bb780a2c78901d3fb33738768511a30617afa01d

e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

Why did they not give the same result?

Is this related to the C string format, in which a string always ends with a null byte? If so, when I hash a null character, does it actually hash two nulls?


Solution

  • An empty string is (or, strictly speaking, "encodes to") a byte array of length zero, containing no bytes. You can observe this e.g. as follows, using Python:

    >>> list(bytes("", 'ascii'))
    []
    

    A string consisting of a single zero byte, on the other hand, is a byte array of length one, containing a single byte of value zero:

    >>> list(bytes("\x00", 'ascii'))
    [0]
    

    As such these two inputs are different, and will hash to different values.

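    As was mentioned in comments above, this has nothing to do with how some languages, such as C, represent strings with a terminating zero byte. A Python bytes object stores its length explicitly, and sha256 hashes exactly the bytes it is given, so no extra null byte is appended.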
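To check the "double null" idea from the question directly, here is a small sketch that hashes zero, one, and two zero bytes. The variable names are only illustrative. If hashing a single null secretly hashed two nulls, the second and third digests would match, and they do not.

```python
from hashlib import sha256

# Three distinct inputs: no bytes, one zero byte, two zero bytes.
empty = b''
one_null = b'\x00'
two_nulls = b'\x00\x00'

# The lengths differ, so these are different messages.
print(len(empty), len(one_null), len(two_nulls))  # 0 1 2

digest_empty = sha256(empty).hexdigest()
digest_one_null = sha256(one_null).hexdigest()
digest_two_nulls = sha256(two_nulls).hexdigest()

print(digest_empty)
print(digest_one_null)
print(digest_two_nulls)

# No hidden terminator is added: one null does not hash like two nulls.
print(digest_one_null == digest_two_nulls)  # False
```

The first two digests are the ones shown in the question; the third differs from both.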