
How to distinguish between an email's HTML body and an HTML attachment in Python?


I use an email.message.Message object to work with an email message and need to extract only the HTML attachments. The message can also have an HTML body, so get_content_type() alone does not help here: both the body and the attachment report text/html.

Is there a simple way in Python to determine whether a part is the body or an attachment?

UPD:

A simplified version of my original function looks like this:

def get_attachments(mail):
    for part in mail.walk():
        # Matches by MIME type only, so it cannot tell a body part from an attachment
        if part.get_content_type() in ('application/pdf', 'image/png', 'image/jpeg'):
            yield part
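
To make the ambiguity concrete, here is a minimal sketch. The sample message, its contents, the filename report.html and the helper describe_parts are all made up for illustration, and it builds the message with email.message.EmailMessage rather than the legacy Message class (walk(), get_content_type() and get_content_disposition() behave the same way on both, the last one requiring Python 3.5+):

```python
from email.message import EmailMessage

# Hypothetical sample message: an HTML body plus an HTML file attachment.
msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("<p>Hello</p>", subtype="html")
msg.add_attachment(
    "<h1>Report</h1>",
    subtype="html",
    filename="report.html",
)


def describe_parts(mail):
    """Return (content type, disposition) for every non-multipart part."""
    return [
        (part.get_content_type(), part.get_content_disposition())
        for part in mail.walk()
        if not part.is_multipart()
    ]


print(describe_parts(msg))
# → [('text/html', None), ('text/html', 'attachment')]
```

Both parts report text/html; only the Content-Disposition differs.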

Solution

  • It turned out to be much easier than I expected; checking each part's Content-Disposition header is enough:

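    def get_attachments(mail):
        for part in mail.walk():
            # Attachments are normally marked with "Content-Disposition: attachment";
            # a plain body part usually has no such header
            disposition = part.get('Content-Disposition')
            if disposition and 'attachment' in disposition:
                yield part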
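
A slightly stricter variant, offered as a sketch rather than the definitive approach: the substring test above also matches an inline part whose filename happens to contain the word "attachment", and it yields attachments of every type, not just HTML. Assuming Python 3.5 or later, get_content_disposition() returns only the disposition value ('attachment', 'inline' or None), which can be combined with the content type check. The name get_html_attachments is hypothetical:

```python
def get_html_attachments(mail):
    """Yield parts that are HTML and explicitly marked as attachments."""
    for part in mail.walk():
        # get_content_disposition() returns 'attachment', 'inline' or None,
        # without the filename or other header parameters.
        if (part.get_content_disposition() == "attachment"
                and part.get_content_type() == "text/html"):
            yield part
```

Note that this relies on the sender setting Content-Disposition correctly; a part sent without that header will be skipped even if the mail client displays it as an attachment.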