For each email message, I am trying to build a list of the names of all its attachments and save that list in a JSON file.
I have been instructed to use imaplib only.
This is the function I am using to extract the mail's data, but part.get_filename() only returns one attachment even when I have sent multiple attachments.
The output I want is a list of attachment names, like [attach1.xlss, attach2.xml, attch.csv].
Again, I can only use the imaplib library.
I also don't want to have to download any attachment, so please don't share that code. I tried several websites but couldn't find anything that I could use.
def get_body_and_attachments(msg):
    email_body = None
    filename = None
    html_part = None
    # if the email message is multipart
    if msg.is_multipart():
        # iterate over email parts
        for part in msg.walk():
            # extract content type of email
            content_type = part.get_content_type()
            content_disposition = str(part.get("Content-Disposition"))
            try:
                # get the email body
                body = part.get_payload(decode=True).decode()
            except:
                pass
            if content_type == "text/plain" and "attachment" not in content_disposition:
                # print text/plain emails and skip attachments
                email_body = body
            elif "attachment" in content_disposition:
                # download attachment
                print(part.get_filename(), "helloooo")
                filename = part.get_filename()
                filename = filename
    else:
        # extract content type of email
        content_type = msg.get_content_type()
        # get the email body
        body = msg.get_payload(decode=True).decode()
        if content_type == "text/plain":
            email_body = body
    if content_type == "text/html":
        html_part = body
    return email_body, filename, html_part
It was easy; I just had to do this.
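import re

# Fetch only the BODYSTRUCTURE of the message; no attachment data is downloaded
data = mailbox.uid('fetch', num, '(BODYSTRUCTURE)')[1][0]
# Each attachment shows up as a ("name" "<filename>") pair in the response
matches = re.findall(r'\("name".*?\)', str(data))
# Keep the text after '" "' and drop the trailing '")'
filenames = [match.split('" "')[1][:-2] for match in matches]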
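Explanation: mailbox.uid('fetch', num, '(BODYSTRUCTURE)') asks the server for the structure of the message with UID num, not for its content. The call returns a (status, data) pair, and data[0] is typically a byte string describing every part of the message, including the attachment names, so no attachment is downloaded.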
Now I use re.findall to find all the attachment names, then I clean up the matches and save them as a list.
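To make the parsing step easier to try without a mail server, here is a minimal sketch that runs the same idea on a made-up response. SAMPLE is a hypothetical BODYSTRUCTURE byte string, not output from a real server, and attachment_names is a helper name introduced only for this illustration. It assumes the server reports each attachment as a ("name" "<filename>") pair, as in the answer above; the capturing group and the case-insensitive flag are small variations on the original regex, and real responses may be laid out differently.

```python
import json
import re

# Hypothetical BODYSTRUCTURE response, shaped like data[0] from
# mailbox.uid('fetch', num, '(BODYSTRUCTURE)'). Real servers vary in
# letter case and layout, so treat this purely as sample input.
SAMPLE = (
    b'1 (UID 42 BODYSTRUCTURE (("text" "plain" ("charset" "utf-8") '
    b'NIL NIL "7bit" 12 1 NIL NIL NIL)'
    b'("application" "octet-stream" ("name" "attach1.xlsx") NIL NIL '
    b'"base64" 1234 NIL ("attachment" ("filename" "attach1.xlsx")) NIL)'
    b'("text" "xml" ("name" "attach2.xml") NIL NIL "base64" 200 5 NIL '
    b'("attachment" ("filename" "attach2.xml")) NIL) '
    b'"mixed" ("boundary" "abc") NIL NIL))'
)


def attachment_names(bodystructure):
    """Return every value found in a ("name" "<value>") pair."""
    # Decode the bytes instead of calling str(), which would add a b'...' wrapper
    text = bodystructure.decode("utf-8", errors="replace")
    # The capturing group returns only the filename, so no slicing is needed
    return re.findall(r'\("name" "(.*?)"\)', text, flags=re.IGNORECASE)


names = attachment_names(SAMPLE)
# Serialize the list; json.dump(..., file_object) would write it to a file instead
print(json.dumps({"uid": 42, "attachments": names}))
```

The ("filename" ...) pairs in the sample are not matched because the pattern requires "name" directly after the opening parenthesis, which avoids listing each attachment twice.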