The question is twofold: getting MESSAGE-ID, and using imap_tools. For a handmade email client in Python I need to reduce the amount of data read from the server (at present it takes 2 minutes to read a whole Yahoo mailbox folder of ~170 messages), and I believe that having MESSAGE-ID will help.
imap_tools supports the IDLE command, which is essential to keep the Yahoo server connection alive, and it has other features which I believe will simplify the code.
To learn about MESSAGE-ID I started with the following code (file fetch_ssl.py):
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import imaplib
import email
import os
import ssl
import conf
# Why UID==1 has no MESSAGE-ID ?
if __name__ == '__main__':
    args = conf.parser.parse_args()
    host, port, env_var = conf.config[args.host]
    if 0 < args.verbose:
        print(host, port, env_var)
    with imaplib.IMAP4_SSL(host, port,
                           ssl_context=ssl.create_default_context()) as mbox:
        user, pass_ = os.getenv('USER_NAME_EMAIL'), os.getenv(env_var)
        mbox.login(user, pass_)
        mbox.select()
        typ, data = mbox.search(None, 'ALL')
        for num in data[0].split():
            typ, data = mbox.fetch(num, '(RFC822)')
            msg = email.message_from_bytes(data[0][1])
            print(f'num={int(num)}, MESSAGE-ID={msg["MESSAGE-ID"]}')
            ans = input('Continue[Y/n]? ')
            if ans.upper() in ('', 'Y'):
                continue
            else:
                break
Where conf.py is:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import argparse
HOST = 'imap.mail.yahoo.com'
PORT = 993
config = {'gmail': ('imap.gmail.com', PORT, 'GMAIL_APP_PWD'),
          'yahoo': ('imap.mail.yahoo.com', PORT, 'YAHOO_APP_PWD')}
parser = argparse.ArgumentParser(description="""\
Fetch MESSAGE-ID from imap server""")
parser.add_argument('host', choices=config)
parser.add_argument('-verbose', '-v', action='count', default=0)
fetch_ssl.py outputs:
$ python fetch_ssl.py yahoo
num=1, MESSAGE-ID=None
Continue[Y/n]?
num=2, MESSAGE-ID=<83895140.288751@communications.yahoo.com>
Continue[Y/n]? n
I'd like to understand why the message with UID == 1 has no MESSAGE-ID. Does that happen from time to time (that is, are there messages with no MESSAGE-ID)? How should I handle these cases? I haven't found such cases for Gmail.
Then I attempted to do the same with imap_tools (version 0.56.0), in file fetch_tools.py:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
import ssl
from imap_tools import MailBoxTls
import conf
# https://github.com/ikvk/imap_tools/blob/master/examples/tls.py
# advises
# ctx.load_cert_chain(certfile="./one.crt", keyfile="./one.key")
if __name__ == '__main__':
    args = conf.parser.parse_args()
    host, port, env_var = conf.config[args.host]
    if 0 < args.verbose:
        print(host, port, env_var)
    user, pass_ = os.getenv('USER_NAME_EMAIL'), os.getenv(env_var)
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.options &= ~ssl.OP_NO_SSLv3
    # imaplib.abort: socket error: EOF
    with MailBoxTls(host=host, port=port, ssl_context=ctx) as mbox:
        mbox.login(user, pass_, 'INBOX')
        for msg in mbox.fetch():
            print(msg.subject, msg.date_str)
Command
$ python fetch_tools.py yahoo
outputs:
Traceback (most recent call last):
  File "/home/vlz/Documents/python-scripts/programming_python/Internet/Email/ymail/imap_tools_lab/fetch_tools.py", line 20, in <module>
    with MailBoxTls(host=host, port=port, ssl_context=ctx) as mbox:
  File "/home/vlz/Documents/.venv39/lib/python3.9/site-packages/imap_tools/mailbox.py", line 322, in __init__
    super().__init__()
  File "/home/vlz/Documents/.venv39/lib/python3.9/site-packages/imap_tools/mailbox.py", line 35, in __init__
    self.client = self._get_mailbox_client()
  File "/home/vlz/Documents/.venv39/lib/python3.9/site-packages/imap_tools/mailbox.py", line 328, in _get_mailbox_client
    client = imaplib.IMAP4(self._host, self._port, self._timeout) # noqa
  File "/usr/lib/python3.9/imaplib.py", line 205, in __init__
    self._connect()
  File "/usr/lib/python3.9/imaplib.py", line 247, in _connect
    self.welcome = self._get_response()
  File "/usr/lib/python3.9/imaplib.py", line 1075, in _get_response
    resp = self._get_line()
  File "/usr/lib/python3.9/imaplib.py", line 1185, in _get_line
    raise self.abort('socket error: EOF')
imaplib.abort: socket error: EOF
Command
$ python fetch_tools.py gmail
Produces identical results. What are my mistakes?
Using Python 3.9.2, Debian GNU/Linux 11 (bullseye), imap_tools (Version: 0.56.0)
EDIT
Headers from the message with no MESSAGE-ID
X-Apparently-To: vladimir.zolotykh@yahoo.com; Sun, 25 Oct 2015 20:54:21 +0000
Return-Path: <mail@product.communications.yahoo.com>
Received-SPF: fail (domain of product.communications.yahoo.com does not designate 216.39.62.96 as permitted sender)
...
X-Originating-IP: [216.39.62.96]
Authentication-Results: mta1029.mail.bf1.yahoo.com from=product.communications.yahoo.com; domainkeys=neutral (no sig); from=product.communications.yahoo.com; dkim=pass (ok)
Received: from 127.0.0.1 (EHLO n3-vm4.bullet.mail.gq1.yahoo.com) (216.39.62.96)
by mta1029.mail.bf1.yahoo.com with SMTPS; Sun, 25 Oct 2015 20:54:21 +0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=product.communications.yahoo.com; s=201402-std-mrk-prd; t=1445806460; bh=5PTgF8Jghm92xeMD5mSHp6A3eRVV70PWo1oQ15K7Tfk=; h=Date:From:Reply-To:To:Subject:From:Subject; b=D7ItgOiuLbiexJGHvORgbpRi22X+sYso6gwZKDXVca79DxMMy2R1dUtZTIg7tcft1lovVJUDw/7fC51orDltRidlfnpayeY8lT+94DRlSBwopuxgOqqR9oTTjTBZ0oEvdxUcXl/q54N2GxuBFvmg8UO0OZoCnFPpUVYo9x4arMjt/0TOW1Q5d/yjdmO7iwiued/rliP/Bsq0TaZYcb0oCAT7Q50tb1fB7wcXLYNSC1OCQ1l1LajbUqmU1LWWNse36mUUTBieO2sZT0ERFrHaCTaTNQSXKQG2AxYF7Dd/8i0Iq3xqdcS0bDpjmWE25uoKvCdtXtUbylsuQSChuLFMTw==
Received: from [216.39.60.185] by n3.bullet.mail.gq1.yahoo.com with NNFMP; 25 Oct 2015 20:54:20 -0000
Received: from [98.137.101.84] by t1.bullet.mail.gq1.yahoo.com with NNFMP; 25 Oct 2015 20:54:20 -0000
Date: 25 Oct 2015 20:54:20 +0000
Received: from [127.0.0.1] by nu-repl01.direct.gq1.yahoo.com with NNFMP; 25 Oct 2015 20:54:20 -0000
X-yahoo-newman-expires: 1445810060
From: "Yahoo Mail" <mail@product.communications.yahoo.com>
Reply-To: replies@communications.yahoo.com
To: <ME>@yahoo.com
Subject: Welcome to Yahoo! Vladimir
X-Yahoo-Newman-Property: ydirect
Content-Type: text/html
Content-Length: 25180
I skipped only X-YMailISG.
EDIT II
Of 167 messages 21 have no MESSAGE-ID header.
fetch_ssl.py takes 4m12.342s, and fetch_tools.py -- 3m41.965s
It looks like the email without a Message-ID simply does not have one: the welcome email Yahoo sent you actually lacks the header. Since it's a system-generated email, that's not that unexpected. You'd just have to skip over it.
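Skipping such messages could look roughly like the sketch below. It also asks the server for the Message-ID header alone instead of the full RFC822 body, which may help with the amount of data transferred. The helper names (message_id_or_none, collect_ids) and the sample replies are made up for illustration, and the reply shape is an assumption about what data[0][1] holds after a header-only fetch; servers can differ.

```python
import email

# IMAP fetch item requesting a single header; PEEK avoids setting \Seen.
# Intended use (not executed here): mbox.fetch(num, HEADER_ONLY)
HEADER_ONLY = '(BODY.PEEK[HEADER.FIELDS (MESSAGE-ID)])'


def message_id_or_none(raw_headers: bytes):
    """Return the Message-ID value from raw header bytes, or None if absent."""
    msg = email.message_from_bytes(raw_headers)
    value = msg.get('Message-ID')  # header lookup is case-insensitive
    return value.strip() if value else None


def collect_ids(fetched):
    """Split {num: raw_headers} into ids found and numbers lacking an id."""
    found, missing = {}, []
    for num, raw in fetched.items():
        mid = message_id_or_none(raw)
        if mid is None:
            missing.append(num)  # caller decides: skip, or key on something else
        else:
            found[num] = mid
    return found, missing


# Hypothetical replies, shaped like data[0][1] from a header-only fetch:
# a lone blank line when the header is missing, the header line otherwise.
sample = {
    1: b'\r\n',
    2: b'Message-ID: <83895140.288751@communications.yahoo.com>\r\n\r\n',
}
found, missing = collect_ids(sample)
print(found, missing)
```

Messages that end up in missing could simply be ignored, or tracked by some other key of your choosing.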
The second problem is that you need to use imap_tools.MailBox.
Looking at the documentation and source in the repo, it appears that the relevant classes are:
- MailBox: for a normal encrypted connection. This is what most email servers use these days, aka IMAPS (IMAP with SSL/TLS).
- MailBoxTls: for a STARTTLS connection. This creates a plaintext connection, then upgrades it later by using a STARTTLS command in the protocol. The internet has mostly gone to the "always encrypted" rather than the "upgrade" paradigm, so this is not the class to use.
- MailBoxUnencrypted: standard IMAP without SSL/TLS. You should not use this on the public internet.
The naming is a bit confusing. MailBox corresponds to imaplib.IMAP4_SSL; MailBoxTls corresponds to imaplib.IMAP4, then using starttls() on the resulting connection; and MailBoxUnencrypted corresponds to imaplib.IMAP4 with no security applied. I imagine it's this way so the most common one (MailBox) is a safe default.
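To make the correspondence concrete, here is a rough sketch of the two encrypted connection styles using only imaplib. The function names and the port-to-class mapping are my own illustration, based on the conventional ports (993 for implicit TLS, 143 for STARTTLS), not something imap_tools provides. On port 993 the server expects a TLS handshake right away, so a client that waits for a plaintext greeting there would likely see the connection dropped, which is consistent with the EOF in your traceback.

```python
import imaplib
import ssl

IMAPS_PORT = 993  # implicit TLS: encrypted from the first byte
IMAP_PORT = 143   # plaintext greeting, optionally upgraded via STARTTLS


def imap_tools_class_for(port: int) -> str:
    """Name the imap_tools class conventionally matching a port (assumption)."""
    return 'MailBox' if port == IMAPS_PORT else 'MailBoxTls'


def connect(host: str, port: int = IMAPS_PORT) -> imaplib.IMAP4:
    """Open an encrypted IMAP connection in the style the port expects."""
    ctx = ssl.create_default_context()
    if port == IMAPS_PORT:
        # MailBox-style: TLS handshake first, IMAP greeting afterwards.
        return imaplib.IMAP4_SSL(host, port, ssl_context=ctx)
    # MailBoxTls-style: plaintext greeting first, then upgrade the socket.
    client = imaplib.IMAP4(host, port)
    client.starttls(ssl_context=ctx)
    return client


print(imap_tools_class_for(993))
```

Both of your configured hosts use port 993, so in fetch_tools.py the change would be to construct MailBox where MailBoxTls is used now.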