I am using the Gmail API to read some HTML emails in Python. I decoded their bodies using:
base64.urlsafe_b64decode
When I print the resulting HTML email, "\r\n" and "3D" are scattered throughout the HTML. I can't remove the "\r\n" because the \, r, \, and n seem to register as separate characters, and I'm not sure where the "3D" comes from.
Is there something wrong with how I'm decoding it?
Here is the code:
results = service.users().messages().list(userId='me', q = 'is: unread').execute()
for index in range(len(results['messages'])):
    message = service.users().messages().get(userId='me', id=results['messages'][index]['id'], format='raw').execute()
    msg_str = base64.urlsafe_b64decode(message['raw'].encode('UTF-8'))
    mime_msg = email.message_from_string(str(msg_str))
    print(mime_msg)
    service.users().messages().modify(userId='me', id=results['messages'][index]['id'], body = {'removeLabelIds': ['UNREAD']}).execute() # mark message as read
I found the solution: I stopped using Python's email library and cast msg_str to a string (it is of type bytes). From there, I simply deleted '\r\n' from the string and replaced '=3D' with '='.
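For anyone who would rather not strip these sequences by hand, here is a minimal sketch of an alternative. It rests on two facts. Calling str() on a bytes object returns its repr, so real line breaks turn into the literal characters \ r \ n. And "=3D" is how quoted-printable encoding writes "=". Parsing the bytes directly lets the email library undo the transfer encoding. The helper name html_body_from_raw and the sample message are made up for illustration and are not part of the Gmail API; I have only tried this on a simple single-part message.

```python
import base64
import email
from email import policy


def html_body_from_raw(raw_b64url: str) -> str:
    """Hypothetical helper: decode a base64url 'raw' payload and return its body text."""
    # The 'raw' field holds the whole RFC 2822 message, base64url-encoded.
    msg_bytes = base64.urlsafe_b64decode(raw_b64url)
    # Parse the bytes as-is. str(msg_bytes) would give "b'...'" with
    # literal backslash sequences instead of real line breaks.
    msg = email.message_from_bytes(msg_bytes, policy=policy.default)
    # Prefer the HTML part, fall back to plain text.
    body = msg.get_body(preferencelist=("html", "plain"))
    # get_content() reverses quoted-printable, so "=3D" becomes "=".
    return body.get_content() if body is not None else ""


# Made-up sample standing in for message['raw'] from the API response.
sample = (
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b'<a href=3D"https://example.com">link</a>\r\n'
)
raw = base64.urlsafe_b64encode(sample).decode("ascii")
html = html_body_from_raw(raw)
print(html)
```

In the loop from the question, this would mean passing message['raw'] to the helper instead of calling email.message_from_string(str(msg_str)).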