python email character-encoding amazon-ses quoted-printable

Email quoted printable encoding confusion


I'm constructing MIME-encoded emails with Python, and the result differs from the same email as MIME-encoded by Amazon SES.

I'm encoding using utf-8 and quoted-printable.

For the character "å" (that's the letter "a" with a little circle on top), my encoding produces

=E5

and the other encoding produces

=C3=A5

They both look ok in my gmail, but I find it weird that the encoding is different. Is one of these right and the other wrong in any way?

Below is my Python code in case that helps.

====

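import email.mime.text  # loads the email.mime.text submodule
from email import charset, mime

# payload (the message text) and subtype (e.g. "plain") are defined elsewhere.
cs = charset.Charset('utf-8')
cs.header_encoding = charset.QP
cs.body_encoding = charset.QP

# See https://stackoverflow.com/a/16792713/136598
mt = mime.text.MIMEText(None, subtype)
mt.set_charset(cs)
mt.replace_header("content-transfer-encoding", "quoted-printable")
mt.set_payload(mt._charset.body_encode(payload))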

Solution

  • Ok, I was able to figure this out, thanks to Artur's comment.

    The UTF-8 encoding of the character is two bytes (0xC3 0xA5), not one, so you should expect to see two quoted-printable escapes. The AWS SES encoding is therefore the correct one (not surprisingly).

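    I was sending unicode text rather than UTF-8 bytes, which is why only one quoted-printable escape was produced: =E5 is the character's code point (U+00E5), which is also its Latin-1 byte, and is not valid UTF-8 on its own. It still displayed correctly, presumably because Gmail is lenient about this.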

    For the Python code in my question, I need to encode the text as UTF-8 myself before passing it to body_encode. I had assumed MIMEText would do that for me, but it does not.
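
To make the difference concrete, here is a minimal sketch, assuming Python 3 (the code in the question may well have been written for an older version). It uses the standard-library quopri module rather than the Charset/MIMEText calls from the question, and shows that quoted-printable only escapes bytes, so the output depends on which bytes the text was turned into first. Treating the single =E5 as Latin-1 is my reading of what happened, not something confirmed by the original code. The second half is an alternative, not the approach from the question: it lets EmailMessage.set_content do the UTF-8 encoding and the quoted-printable step together.

```python
import quopri
from email.message import EmailMessage

text = "å"

# Quoted-printable escapes bytes, so the result depends on the byte encoding.
as_utf8 = quopri.encodestring(text.encode("utf-8"))      # bytes C3 A5
as_latin1 = quopri.encodestring(text.encode("latin-1"))  # byte E5

print(as_utf8.decode("ascii"))    # =C3=A5
print(as_latin1.decode("ascii"))  # =E5

# Alternative sketch with the Python 3 EmailMessage API: the charset and the
# transfer encoding are passed explicitly, so no manual encode step is needed.
msg = EmailMessage()
msg.set_content(text, subtype="plain", charset="utf-8", cte="quoted-printable")

print(msg["Content-Transfer-Encoding"])
print(msg.get_payload())
```

The first two prints reproduce the two encodings from the question; the message built at the end carries the two-byte form, matching what SES produced.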