
Python - eml file edit


I can download the eml file using mime-content. I need to edit this eml file and delete its attachments. I can already look up the attachment names. If I understand correctly, the file starts with the email headers, followed by the body, and then the attachments. I need advice on how to delete the attachments from the email.

import email
from email import policy
from email.parser import BytesParser
with open('messag.eml', 'rb') as fp:  # select a specific email file
    msg = BytesParser(policy=policy.default).parse(fp)
    text = msg.get_body(preferencelist=('plain')).get_content()
    print(text)  # print the email content
    for attachment in attachments:
        fnam=attachment.get_filename()
        print(fnam) #print attachment name

Solution

  • The term "eml" is not strictly well-defined but it looks like you want to process standard RFC5322 (née 822) messages.

    The Python email library went through an overhaul whose new API became official in Python 3.6; you'll want to make sure you use the modern API, like you already do (the one which uses a policy argument). The way to zap an attachment is simply to use its clear() method, which removes the part's headers and payload (the emptied part itself stays in the message), though your code doesn't correctly fetch the attachments in the first place. Try this:

    from email import policy
    from email.parser import BytesParser
    
    with open('messag.eml', 'rb') as fp:  # select a specific email file
        msg = BytesParser(policy=policy.default).parse(fp)
    
    # get_body() returns None if the message has no text/plain body
    body = msg.get_body(preferencelist=('plain',))
    if body is not None:
        print(body.get_content())
    
    # Notice the iter_attachments() method
    for attachment in msg.iter_attachments():
        fnam = attachment.get_filename()
        print(fnam)
        # Empty this attachment (removes its headers and payload)
        attachment.clear()
    
    with open('updated.eml', 'wb') as wp:
        wp.write(msg.as_bytes())
    

    The updated message in updated.eml might have some headers rewritten, as Python doesn't preserve precisely the same spacing, folding, etc. in all headers.
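
Note that clear() only empties a part; the now-empty part is still listed in the multipart container. If you would rather drop the attachment parts entirely, one possible approach is to rebuild the container's payload without them. The following is a minimal sketch under that assumption, not the only way to do it; strip_attachments is a hypothetical helper name, and the demo message is built in memory purely for illustration.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def strip_attachments(msg):
    """Drop every attachment part from msg in place.

    Hypothetical helper: returns the filenames of the removed parts
    (None for parts that carry no filename).
    """
    if not msg.is_multipart():
        return []  # nothing to remove from a single-part message

    attachments = list(msg.iter_attachments())
    attachment_ids = {id(part) for part in attachments}

    # Keep only the parts that iter_attachments() did not report.
    kept = [part for part in msg.get_payload() if id(part) not in attachment_ids]
    msg.set_payload(kept)

    return [part.get_filename() for part in attachments]


# Demo with an in-memory message (illustrative data, no file needed).
demo = EmailMessage()
demo['Subject'] = 'demo'
demo.set_content('Hello, this is the body.')
demo.add_attachment(b'\x00\x01\x02', maintype='application',
                    subtype='octet-stream', filename='report.bin')

removed = strip_attachments(demo)
print(removed)

# The result still serializes and parses like any other message.
reparsed = BytesParser(policy=policy.default).parsebytes(demo.as_bytes())
print(len(list(reparsed.iter_attachments())))
```

To use this on a file, parse it with BytesParser as in the code above, call strip_attachments(msg), and write msg.as_bytes() to the output file. The message keeps its multipart structure, now containing only the body part(s).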