I'm trying to save an .eml file from Outlook Express, but in the saved file some lines end with the character "=" (equals sign).
When I edit the message in source mode the HTML looks normal; the "=" only appears once the message is saved.
This is a problem for me because I need to edit the .eml in my application before sending it. I have to find the </BODY> tag and insert some text. An example:
</DIV></DIV></DIV></DIV></DIV><FONT=20
style=3D"FONT-STYLE: normal; FONT-FAMILY: calibri; COLOR: rgb(0,0,0); =
FONT-SIZE: small; FONT-WEIGHT: normal"=20
face=3DCalibri><A=20
target=3D_blank></A></FONT></DIV></DIV></DIV></DIV></DIV></DIV></DIV></BO=
DY></HTML>
In this case I can't find the closing body tag, because it is written as BO= at the end of one line and DY at the start of the next.
I have tried saving it with various encodings, but the result is the same. Why is Outlook saving it this way?
Outlook is using = as an escape symbol. If X and Y are hex digits, =XY must be replaced with the character whose ASCII code is XY (for example, =3D is "=" and =20 is a space). If = is followed by a newline, that newline must be removed and the lines joined.
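As a rough illustration of those two rules, here is a minimal Python sketch that uses the standard library's quopri module on a shortened version of the fragment from the question. The variable names are only for illustration, and the fragment is trimmed, not the full message.

```python
import quopri

# Shortened fragment from the question, as it appears in the saved .eml.
# "=20" is an escaped space, "=3D" an escaped "=", and a "=" at the very
# end of a line is a soft line break that must be removed when decoding.
encoded = (
    b'<FONT=20\n'
    b'style=3D"COLOR: rgb(0,0,0); =\n'
    b'FONT-SIZE: small"></FONT></DIV></BO=\n'
    b'DY></HTML>'
)

decoded = quopri.decodestring(encoded)
print(decoded.decode("ascii"))
```

After decoding, the soft line breaks are gone, so the closing tag is a contiguous </BODY> again and a plain search for it works.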
Outlook does this because only a limited range of byte values can be transferred safely via SMTP, and bytes outside this range must be quoted. There are also limits on line length, so by default Outlook splits lines that are longer than about 75 bytes. This is called Quoted-Printable encoding (defined in RFC 2045).
Check the Content-Transfer-Encoding: header in the .eml file and run the file through a decoder before applying your filter, then encode it again after filtering.
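The decode, edit, re-encode round trip could look roughly like the following Python sketch, which relies on the standard email package. The function name insert_before_body_end and its parameters are hypothetical, and the sketch assumes the message has an HTML part; it is one possible approach, not the only way to do it.

```python
import re
from email import message_from_bytes, policy


def insert_before_body_end(raw_eml: bytes, snippet: str) -> bytes:
    """Insert `snippet` before the last </BODY> of the HTML part (hypothetical helper)."""
    msg = message_from_bytes(raw_eml, policy=policy.default)

    # Locate the HTML part; for a single-part message this is the message itself.
    part = msg.get_body(preferencelist=("html",))
    if part is None:
        raise ValueError("no HTML part found")

    # get_content() undoes the Content-Transfer-Encoding and the charset,
    # so the HTML no longer contains =XY escapes or soft line breaks.
    html = part.get_content()

    matches = list(re.finditer(r"</body\s*>", html, flags=re.IGNORECASE))
    if not matches:
        raise ValueError("no </BODY> tag found")
    pos = matches[-1].start()
    html = html[:pos] + snippet + html[pos:]

    # Store the edited HTML again as Quoted-Printable. Note that this
    # replaces the part's Content-* headers and re-encodes the text as UTF-8.
    part.set_content(html, subtype="html", cte="quoted-printable")
    return msg.as_bytes()
```

Usage would be along the lines of reading the .eml file in binary mode, calling insert_before_body_end(data, "<P>my text</P>"), and writing the returned bytes back out. Because the part is rewritten as UTF-8, a message that must keep its original charset would need extra handling.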