
Convert HTML to Plain Text?


I can read emails from Microsoft Exchange using the IMAP client from LumiSoft. I have configured the Exchange server to convert all mail to plain text. However, the message bodies I read still seem to contain HTML/CSS.

What is the best way to remove HTML/CSS from the body of an email? Or is there a setting on the Exchange server that I have missed?


Solution

  • I usually take one of these approaches...

    1. Using regular expressions. This can be difficult to get right if the solution also has to cope with all kinds of invalid markup, but I bet someone else has done it before you (hint: Google it or search SO).

    2. Using an HTML parser library. You can find one for any popular programming language. For .NET, I recommend the Html Agility Pack.
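
As a rough illustration of the parser approach (option 2), here is a minimal sketch in Python using only the standard library's `html.parser`. It is a stand-in for the idea, not the Html Agility Pack API. The names `TextExtractor`, `html_to_text`, `SKIPPED_TAGS` and `BLOCK_TAGS` are made up for this example, and the choice of which tags to skip or treat as line breaks is an assumption you would tune for your mail.

```python
from html.parser import HTMLParser

# Assumption: text inside these tags is never meant for display.
SKIPPED_TAGS = {"style", "script", "title"}
# Assumption: these tags should start a new line in the plain-text output.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3"}


class TextExtractor(HTMLParser):
    """Collects visible text and drops tags, CSS and scripts."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &amp; etc. for us
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace and drop empty lines.
        raw_lines = "".join(self._chunks).splitlines()
        lines = (" ".join(line.split()) for line in raw_lines)
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = (
        "<style>p{color:red}</style>"
        "<p>Hello <b>world</b></p><div>Price &amp; terms</div>"
    )
    print(html_to_text(sample))
```

The main reason to prefer this over a regular expression is that the parser already knows that the contents of `<style>` and `<script>` are not text, and it decodes entities such as `&amp;`, so the stylesheet rules do not leak into the result. The same structure should carry over to other parser libraries: walk the document, ignore non-visible elements, and collect the text nodes.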