I have a MS SQL database that I want to query with Python. I use the following snippet:
cnxn = pypyodbc.connect("Driver={SQL Server Native Client 11.0};"
"Server=xxxx;"
"Database=xxxx;"
"Trusted_Connection=yes;")
query = 'select * from orders'
df = pd.read_sql_query(query)
This returns the following dataframe:
id ordernumber
0 b'xxxx-xxxx-xxxx-xxxx' 123456789
1 b'xxxx-xxxx-xxxx-xxxx' 123456789
2 b'xxxx-xxxx-xxxx-xxxx' 123456789
The columns are both of type object
. The id
column is supposed to be a string, but it is a byte string in this output. How do I transform this column to a regular string object? I tried the following (from here):
df['id'].str.decode('utf-8')
... but that transforms the entire id
column into np.nan
values. How do I properly transform my DataFrame id
column into a regular string object?
Dict to recreate this DataFrame:
{'id': {0: "b'DE91EBA3-313D-463C-B948-00010AA26136'",
1: "b'316E587F-7FDD-4FBA-8778-0001E7783025'",
2: "b'F6E50A95-A3E8-45D3-8E79-000210CCA14A'",
3: "b'EE20A958-0CD6-4144-9743-00024D3E703F'",
4: "b'58AFA9B9-4B10-47D3-8840-000AAD9CBF00'"},
'ordernumber': {0: '500358152',
1: '500489603',
2: '500454759',
3: '500512969',
4: '500517135'}}
You don't get bytes
b'DE91EBA3-313D-463C-B948-00010AA26136'
but string
"b'DE91EBA3-313D-463C-B948-00010AA26136'"
and this makes difference.
You have to only remove b' '
from string
df['id'].str.replace("b'", "").str.replace("'", "")
or simpler
df['id'].str[2:-1]
Minimal working code:
import pandas as pd
df = pd.DataFrame({
'id': {
0: "b'DE91EBA3-313D-463C-B948-00010AA26136'",
1: "b'316E587F-7FDD-4FBA-8778-0001E7783025'",
2: "b'F6E50A95-A3E8-45D3-8E79-000210CCA14A'",
3: "b'EE20A958-0CD6-4144-9743-00024D3E703F'",
4: "b'58AFA9B9-4B10-47D3-8840-000AAD9CBF00'"
},
'ordernumber': {
0: '500358152',
1: '500489603',
2: '500454759',
3: '500512969',
4: '500517135'
}
})
#df['id'] = df['id'].str[2:-1]
df['id'] = df['id'].str.replace("b'", "").str.replace("'", "")
print(df)
Result:
id ordernumber
0 DE91EBA3-313D-463C-B948-00010AA26136 500358152
1 316E587F-7FDD-4FBA-8778-0001E7783025 500489603
2 F6E50A95-A3E8-45D3-8E79-000210CCA14A 500454759
3 EE20A958-0CD6-4144-9743-00024D3E703F 500512969
4 58AFA9B9-4B10-47D3-8840-000AAD9CBF00 500517135