
Getting Hash160 bitcoin address in python


tl;dr How should one perform Hash160, using most basic python tools?

====================================================

Hi,

I'm trying to figure out how transactions work in bitcoin.

When I choose inputs for a new tx, I want to make sure they belong to a specific address. However, existing txs do not specify the addresses of previous outputs; instead they contain the address's hash.

e.g.:

>> bx fetch-tx 11a1b7ac0a65bd50b7094c720aecd77cfd83d84b1707960fd00dd82a888aab5c --config /home/theo/Desktop/bx-testnet.cfg

{
    hash 11a1b7ac0a65bd50b7094c720aecd77cfd83d84b1707960fd00dd82a888aab5c
    inputs
    {
        input
        {
            address_hash f3b7278583827a049d6be894bf7f516178a0c8e6
            previous_output
            {
                hash 4a3532061d43086299ae9b2409a456bb9638dff32e0858c4ccda27203fb2e4f6
                index 1
            }
            script "[30440220146b8b5b014245a9e27e21122d4dded04c3f39c3a49ac2494743d6f6ae8efff602206d417a4be9c7431ea69699132438510ade1cf8d746607f77d114907762ed1eb301] [023dd2e892290e41bb78efce6ea30a97015ef13eaaa9ebb7b0514485fc365cc391]"
            sequence 4294967295
        }
    }
    lock_time 0
    outputs
    {
        output
        {
            address_hash a73706385fffbf18855f2aee2a6168f29dbb597e
            script "dup hash160 [a73706385fffbf18855f2aee2a6168f29dbb597e] equalverify checksig"
            value 130000000
        }
        output
        {
            address_hash ad6e80394af99ece5d7701bf2f457480b93965b7
            script "dup hash160 [ad6e80394af99ece5d7701bf2f457480b93965b7] equalverify checksig"
            value 49525957813
        }
    }
    version 1
}

Say I want to check which of the outputs can be spent from address mvm74FACaagz94rjWbNmW2EmhJdmEGcxpa. So I take its Hash160 in Python:

>> hashlib.new('ripemd160', hashlib.sha256("mvm74FACaagz94rjWbNmW2EmhJdmEGcxpa".encode('utf-8')).digest()).hexdigest()
'748598cd9b004aecf8a2d97464fb1f2a90562ffe'

That is not the result I expected: a73706385fffbf18855f2aee2a6168f29dbb597e

Meanwhile, this online service calculates the hash correctly.

How do I Hash160 a bitcoin address in Python?


Solution

  • Finally I've made it. Some revelations in my answer may look obvious and basic to you, but I hope they'll be helpful to people who are completely new to bitcoin (such as me).

    ========

    The wiki says that I can get the Hash160 by reversing the last step of address generation

    [Image: address generation steps, with Hash160 highlighted]

    This step encodes a byte string with the base58 alphabet:

    b58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
    

    This alphabet lacks 0, I, l, O, as these symbols are easy to mix up. And that's the last thing you wanna do when one wrong symbol may lead to losing a fat amount of money.
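
The claim about the missing symbols is easy to check. A minimal sketch, assuming only the standard string module; the names candidates and missing are made up for this illustration:

```python
import string

b58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

# All ASCII digits and letters, in the order digits, uppercase, lowercase
candidates = string.digits + string.ascii_uppercase + string.ascii_lowercase

# Characters that were left out of the base58 alphabet
missing = [c for c in candidates if c not in b58]

print(len(b58))   # 58
print(missing)    # ['0', 'I', 'O', 'l']
```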

    Thus, we need to turn mvm74FACaagz94rjWbNmW2EmhJdmEGcxpa into a byte string. Each byte is written as two hexadecimal digits and may range from 0x00 (0 decimal) to 0xff (255 decimal). And mind that we have a special b58 alphabet to deal with: encoding the address as utf-8 or any other text encoding standard will yield nonsense.

    At first I thought that I could easily decode the address with this function:

    def decode(addr):
        b58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
        decoded = ''
        for i in addr:
            temp = hex(b58.index(i))
            if len(temp) == 3:
                temp = '0' + temp[-1]
            else:
                temp = temp[2:]
            decoded += (temp)
        return (decoded)
    
    decode('mvm74FACaagz94rjWbNmW2EmhJdmEGcxpa')
    
    >> '2c352c06030e090b212127390803312a1d22152c1d010d2c2811242c0d0f23372f21'
    

    But the result is nothing like the hash I looked up in the transaction (a73706385fffbf18855f2aee2a6168f29dbb597e). This way I learned I had no idea how decoding is done. What if the Hash160 contains 0xff? There's no such symbol in b58, as 58 in hex is just 0x3a. While decoding b58 we can't treat each symbol independently. The whole address makes up one giant number written in the base58 numeral system (the address has 34 characters, so its first digit corresponds to 58**33).

    To get the byte string, I first turned this number into a decimal integer and only then into a byte string.
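
To see why symbols cannot be decoded one by one, here is a small sketch of base58 as a positional system. The helpers to_base58 and from_base58 are names invented for this illustration, not part of any library:

```python
B58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def to_base58(n):
    """Write a non-negative integer with base58 digits (illustrative helper)."""
    digits = ''
    while n > 0:
        n, r = divmod(n, 58)       # peel off the lowest base58 digit
        digits = B58[r] + digits
    return digits or B58[0]

def from_base58(s):
    """Read a base58 string back into an integer."""
    n = 0
    for ch in s:
        n = n * 58 + B58.index(ch)  # shift left by one base58 digit, add the next
    return n

print(to_base58(0xff))     # 5Q, because 255 = 4*58 + 23
print(from_base58('5Q'))   # 255
```

A single byte 0xff already needs two base58 symbols, so there is no fixed symbol-to-byte mapping.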

    If you know how to avoid this detour and get bytes directly -- please comment

    def decode(addr):
    
        b58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
    
        def base58_to_dec(addr):
            dec = 0
            for i in range(len(addr)):
                dec = int(dec * 58 + b58.index(addr[i]))
            print('Decimal representation')
            print(dec)
            return(dec)
    
        def dec_to_byte(dec):
            out = ''
            while dec != 0:
                remn = dec % 256
                dec = int((dec - remn) / 256)
                temp = hex(remn)
                if len(temp) == 3:
                    temp = '0' + temp[-1]
                else:
                    temp = temp[2:]
                out = temp + out
            return(out)
    
        dec = base58_to_dec(addr)
        out = dec_to_byte(dec)
        return (out)
    
    decode("mvm74FACaagz94rjWbNmW2EmhJdmEGcxpa")
    >> Decimal representation
    >> 700858390993795610399098743129153130886272689085970101576715
    >> '6fa7370638600000000000000000000000000000000000000b'
    

    That output looks somewhat like what I need (a7370638...) but has way too many zeroes. Don't worry that the first byte (6f) doesn't match: it is not part of the Hash160 we need, just the protocol version byte (0x6f marks a testnet pay-to-pubkey-hash address).

    This is a precision error: in Python 3 the / operator returns a float, which cannot hold an integer of this size exactly. To deal with it, I used mpmath, which lets you operate on large numbers precisely.

    from mpmath import *
    mp.dps = 1000
    
    def decode(addr):
    
        b58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
    
        def base58_to_dec(addr):
            dec = 0
            for i in range(len(addr)):
                dec = int(dec * 58 + b58.index(addr[i]))
            return(dec)
    
        def dec_to_byte(dec):
            out = ''
            while dec != 0:
                remn = mpf(dec % 256)
                dec = mpf((dec - remn) / 256)
                temp = hex(int(remn))
                if len(temp) == 3:
                    temp = '0' + temp[-1]
                else:
                    temp = temp[2:]
                out = temp + out
    
            return (out)
    
        dec = base58_to_dec(addr)
        out = dec_to_byte(dec)
        return (out)
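
As a side note, mpmath is probably not required here. Python's built-in int is arbitrary precision, and the zeroes in the earlier attempt seem to come from the / operator alone. The sketch below is an illustration under that assumption; lossy, exact and dec_to_hex are names made up for it, and the number is the full 25-byte value of the address:

```python
# The full 25-byte value of the address, written as one big integer
n = int('6fa73706385fffbf18855f2aee2a6168f29dbb597ef59c240b', 16)

lossy = int(n / 256)   # true division goes through a 53-bit float
exact = n // 256       # floor division stays in exact integer arithmetic

print(hex(lossy))      # only the leading digits survive, the rest are zeroes
print(hex(exact))

def dec_to_hex(dec):
    """Integer-only variant of dec_to_byte: no float, no mpmath."""
    out = ''
    while dec != 0:
        dec, remn = divmod(dec, 256)       # exact quotient and remainder
        out = format(remn, '02x') + out    # always two hex digits per byte
    return out

print(dec_to_hex(n))
```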
    

    Apply the precise modulo operation and we can finally get the Hash160. Just make sure you trim the first byte (the version) and the last 4 bytes (the checksum that serves as a fat-finger check).

    x = decode('mvm74FACaagz94rjWbNmW2EmhJdmEGcxpa')
    print(x)
    >> 6fa73706385fffbf18855f2aee2a6168f29dbb597ef59c240b
    
    print(x[2:-8])
    >> a73706385fffbf18855f2aee2a6168f29dbb597e
    

    Yay! Just like in the transaction!
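
For comparison, here is a more compact sketch that goes from the address to bytes directly, using only the standard library. It is one possible approach rather than the method used above: b58decode and checksum_ok are hypothetical helper names, and the checksum rule (first 4 bytes of a double SHA-256 over version + Hash160) is the usual Base58Check convention.

```python
import hashlib

B58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def b58decode(addr):
    """Decode a base58 string to bytes using integer arithmetic only."""
    n = 0
    for ch in addr:
        n = n * 58 + B58.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, 'big')
    # Each leading '1' in the string stands for one leading zero byte
    pad = len(addr) - len(addr.lstrip('1'))
    return b'\x00' * pad + body

def checksum_ok(raw):
    """Check the trailing 4 bytes against a double SHA-256 of the rest."""
    payload, checksum = raw[:-4], raw[-4:]
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return digest[:4] == checksum

raw = b58decode('mvm74FACaagz94rjWbNmW2EmhJdmEGcxpa')
version, hash160 = raw[0], raw[1:-4]   # drop 1 version byte and 4 checksum bytes

print(hex(version))
print(hash160.hex())
print(checksum_ok(raw))
```

The Hash160 printed here should match the address_hash of the first output in the transaction. Slicing bytes instead of a hex string avoids the x[2:-8] index arithmetic.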