There is a problem when pandas reads a bytes column via read_sql.
If I look at the query result through DBeaver, the bytes column displays differently, but through read_sql
it seems that pandas returns the value as a raw bytes object.
For example, pandas shows the column value as
b'\x80\xbc\x10`K\xa8\x95\xd8\x11\xe5K\xf9\xe7\xd7\x8cq'
I need -
0x80BC10604BA895D811E54BF9E7D78C71
If I use, in SQL,
CONVERT(varchar(max), guid, 1)
then pandas gives the correct values. But I need to convert the column in Python, not in SQL.
It looks like the column (which I'll call 'col'
below) contains bytes objects. bytes has a .hex()
method that you can map over each item in the column to convert it into a hexadecimal string.
df['col'] = df['col'].map(lambda e: e.hex())
This produces
80bc10604ba895d811e54bf9e7d78c71
It seems the specific output you want has a "0x" prefix and upper-case letters for the digits above 9, which you can get by calling .upper()
on the hex string and prepending '0x'.
df['col'] = df['col'].map(lambda e: '0x' + e.hex().upper())
This produces
0x80BC10604BA895D811E54BF9E7D78C71
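As a quick self-contained check, here is a sketch using a hypothetical one-column DataFrame built from the example bytes (standing in for what read_sql would return):

```python
import pandas as pd

# Hypothetical sample: a single-column DataFrame holding raw bytes,
# as read_sql would return for a binary/uniqueidentifier column.
df = pd.DataFrame({'col': [bytes.fromhex('80bc10604ba895d811e54bf9e7d78c71')]})

# Prefix with '0x' and upper-case the hex digits.
df['col'] = df['col'].map(lambda e: '0x' + e.hex().upper())

print(df['col'].iloc[0])  # 0x80BC10604BA895D811E54BF9E7D78C71
```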
You can also convert to int first and format that into a string afterwards (note that int.from_bytes only defaults to byteorder='big' on Python 3.11+, so pass it explicitly on older versions):
df['col'] = df['col'].map(lambda e: int.from_bytes(e, 'big')).map('0x{:X}'.format)
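A minimal sketch of the int route, again on a hypothetical single-value column. One design caveat: unlike .hex(), integer formatting drops any leading zero bytes, so only use it if the values never start with 0x00.

```python
import pandas as pd

# Hypothetical sample column of raw bytes.
df = pd.DataFrame({'col': [bytes.fromhex('80bc10604ba895d811e54bf9e7d78c71')]})

# Interpret the 16 bytes as one big-endian integer, then format it as
# upper-case hex. Note: leading zero bytes would not appear in the output.
df['col'] = df['col'].map(lambda e: int.from_bytes(e, 'big')).map('0x{:X}'.format)

print(df['col'].iloc[0])  # 0x80BC10604BA895D811E54BF9E7D78C71
```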
If the column has unexpected values (such as None or a string), you can modify the mapped function to handle such cases. For example, if there could be None values, you can safeguard against them like this:
df['col'] = df['col'].map(lambda e: '0x' + e.hex().upper() if e else e)
If there could be strings mixed in, you can encode them to bytes first:
df['col'] = df['col'].map(lambda x: x.encode().hex() if isinstance(x, str) else x.hex())
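These guards can be combined into one helper, sketched here under the assumption that the column may mix bytes, strings, and None (the helper name and sample data are hypothetical):

```python
import pandas as pd

def to_hex_string(value):
    """Hypothetical helper: normalize bytes/str/None to a '0x' hex string."""
    if value is None:
        return None                 # pass missing values through unchanged
    if isinstance(value, str):
        value = value.encode()      # treat strings as their UTF-8 bytes
    return '0x' + value.hex().upper()

# Hypothetical mixed-type column.
df = pd.DataFrame({'col': [bytes.fromhex('80bc10604ba895d811e54bf9e7d78c71'),
                           None,
                           'abc']})
df['col'] = df['col'].map(to_hex_string)
```

A named function also reads better than a lambda once the logic has more than one branch.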