I'm reading the Python 3 docs here and I must be blind or something... Where does it say how to get the body of a message?
What I want to do is open a message and loop over its text-based bodies, skipping binary attachments. Pseudocode:
def read_all_bodies(local_email_file):
    email = Parser().parse(open(local_email_file, 'r'))
    for pseudo_body in email.pseudo_bodies:
        if pseudo_body.pseudo_is_binary():
            continue
        # Pseudo-parse the body here
How do I do that? Is the Message class even the right class for this? Isn't it only for headers?
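This is best done using two methods of the Message class:
get_payload
returns the message body as a string. If the message is multipart, it returns a list of sub-messages instead.
is_multipart
returns True if the message is multipart, that is, if its payload is a list of sub-messages.
This is how it can be done: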
from email.parser import Parser


def parse_file_bodies(filename):
    # Open the file and parse it into a Message object
    with open(filename, 'r') as fp:
        email = Parser().parse(fp)
    # For multipart emails, each sub-message is handled in a loop
    if email.is_multipart():
        for msg in email.get_payload():
            parse_single_body(msg)
    else:
        # A single-part message is passed directly
        parse_single_body(email)


def parse_single_body(email):
    # decode=True undoes the transfer encoding (base64, quoted-printable)
    # and returns bytes, or None if this part is itself multipart
    payload = email.get_payload(decode=True)
    if payload is None:
        return
    # The payload is binary. It must be converted to a
    # Python string according to the input charset,
    # which may vary from message to message
    try:
        text = payload.decode("utf-8")
        # Now you can work with text as with any other string:
        ...
    except UnicodeDecodeError:
        print("Error: cannot parse message as UTF-8")
        return
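The code above only looks one level deep and assumes UTF-8. If you also want to skip binary attachments and respect the charset each part declares, something along these lines might work. It is only a sketch: the names iter_text_bodies and read_text_bodies are made up for illustration, the fallback to UTF-8 when a part declares no charset is an assumption, and an unknown charset name would still raise LookupError. It relies on the standard Message methods walk, get_content_maintype and get_content_charset.

```python
from email.parser import Parser


def iter_text_bodies(msg):
    """Yield (content_type, text) for every text/* part of a parsed message."""
    # walk() visits the message itself and all nested sub-parts, depth-first
    for part in msg.walk():
        # Multipart containers carry no body of their own
        if part.is_multipart():
            continue
        # Skip images, archives and other non-text attachments
        if part.get_content_maintype() != "text":
            continue
        # Bytes with the transfer encoding (base64, quoted-printable) undone
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Use the charset the part declares; UTF-8 fallback is an assumption
        charset = part.get_content_charset() or "utf-8"
        yield part.get_content_type(), payload.decode(charset, errors="replace")


def read_text_bodies(filename):
    # Hypothetical helper: parse a local file and collect its text bodies
    with open(filename, "r") as fp:
        msg = Parser().parse(fp)
    return list(iter_text_bodies(msg))
```

Note that text/html parts count as text here too, so check the content type in the loop if you only want text/plain.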