I have a df that looks like this:
email id
{'email': ['test@test.com']} {'id': ['123abc_d456_789_fgh']}
When I drop non-alphanumeric characters like so:
df.email = df.email.str.replace('[^a-zA-Z0-9]', '', regex=True)
df.email = df.email.str.replace('email', '')
df.id = df.id.str.replace('[^a-zA-Z0-9]', '', regex=True)
df.id = df.id.str.replace('id', '')
The columns look like this:
email id
testtestcom 123abcd456789fgh
How do I keep everything inside the square brackets unchanged and drop all non-alphanumeric characters outside the brackets?
The new df should look like this:
email id
test@test.com 123abc_d456_789_fgh
This is hardcoded, but works:
df.email = df.email.str.replace(r".+\['|'\].+", '', regex=True)
df.id = df.id.str.replace(r".+\['|'\].+", '', regex=True)
>>> 'test@test.com'
>>> '123abc_d456_789_fgh'
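A possibly less brittle variant is to capture the text between `['` and `']` instead of deleting everything around it. The sketch below uses only the standard `re` module and assumes each cell is a string shaped like the ones in the question; `extract_bracketed` and `rows` are hypothetical names used for illustration.

```python
import re

# Capture whatever sits between [' and '] (non-greedy, so it stops at the first ']).
BRACKETED = re.compile(r"\['(.*?)'\]")

def extract_bracketed(cell):
    """Return the text inside ['...'], or the cell unchanged if nothing matches."""
    match = BRACKETED.search(cell)
    return match.group(1) if match else cell

# Sample cells copied from the question (assumed to be plain strings).
rows = {
    "email": "{'email': ['test@test.com']}",
    "id": "{'id': ['123abc_d456_789_fgh']}",
}
cleaned = {name: extract_bracketed(value) for name, value in rows.items()}
print(cleaned)
```

On a real df the same pattern should work with `df.email.str.extract(r"\['(.*?)'\]", expand=False)`, again assuming the column holds strings rather than dicts.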