python email rfc822

How do you extract multiple email addresses from an RFC 2822 mail header in Python?


Python's email module is great for parsing headers. However, a To: header can list multiple recipients, and a message may contain more than one To: header. How do I split out each email address? I can't simply split on commas, since a comma can appear inside a quoted display name. Is there a way to do this?

Demo code:

msg="""To: user1@company1.com, "User Two" <user2@company2.com", "Three, User <user3@company3.com>
From: anotheruser@user.com
Subject: This is a subject

This is the message.
"""

import email

msg822 = email.message_from_string(msg)
for to in msg822.get_all("To"):
    print("To:",to)

Current output:

$ python x.py
To: user1@company1.com, "User Two" <user2@company2.com", "Three, User <user3@company3.com>
$ 

Solution

  • Pass all of the To lines through email.utils.getaddresses():

    msg="""To: user1@company1.com, John Doe <user2@example.com>, "Public, John Q." <user3@example.com>
    From: anotheruser@user.com
    Subject: This is a subject
    
    This is the message.
    """
    
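    import email
    import email.utils  # imported explicitly rather than relying on "import email" to load it
    
    msg822 = email.message_from_string(msg)
    # get_all() returns every To header; the [] default covers a message with none.
    # getaddresses() yields one (display name, address) tuple per recipient.
    for to in email.utils.getaddresses(msg822.get_all("To", [])):
        print("To:",to)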
    

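    Note that I rewrote your To line. Your example had mismatched quotes and angle brackets (<user2@company2.com" and "Three, User <user3@company3.com>), so I believe it wasn't a valid format.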

    Reference: https://docs.python.org/3/library/email.utils.html#email.utils.getaddresses
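
If you only want the bare addresses, and want to cover several headers at once, something along these lines might help. It is a minimal sketch rather than a definitive implementation: the helper name recipient_addresses, its header_names parameter, and the sample message (including the second To: line and the Cc: line) are made up for illustration. It relies only on get_all() and email.utils.getaddresses() as used above.

```python
import email
import email.utils

# Hypothetical sample message: two To headers plus a Cc header.
raw = (
    "To: user1@company1.com, John Doe <user2@example.com>\n"
    'To: "Public, John Q." <user3@example.com>\n'
    "Cc: fourth@example.com\n"
    "Subject: demo\n"
    "\n"
    "Body.\n"
)


def recipient_addresses(message, header_names=("To", "Cc")):
    """Return the bare addresses found in the given headers (illustrative helper)."""
    values = []
    for name in header_names:
        # get_all() returns one string per occurrence of the header.
        values.extend(message.get_all(name, []))
    # getaddresses() yields (display name, address) tuples; keep the address
    # and skip entries where no address could be parsed.
    return [addr for _name, addr in email.utils.getaddresses(values) if addr]


parsed = email.message_from_string(raw)
addresses = recipient_addresses(parsed)
print(addresses)
# ['user1@company1.com', 'user2@example.com', 'user3@example.com', 'fourth@example.com']
```

The comma inside "Public, John Q." stays part of the display name, so it does not produce an extra recipient.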