I'm using bleach to sanitize user input. The input is Markdown, though, so I need the blockquote symbol > to pass through without being escaped as &gt;
so that I can hand the text to misaka for rendering.
The documentation says bleach escapes HTML markup by default, but it doesn't say how to turn that off for the > symbol.
I would still like it to escape actual HTML tags.
http://bleach.readthedocs.org/en/latest/clean.html
Any other ideas for sanitizing input while maintaining the ability to use Markdown would be appreciated.
Do you need to strip all tags but leave > as it is?
Simple way for step 2:
output = output.replace('&gt;', '>')
A more general way, which unescapes every entity (including &lt;), not only &gt;:
import html  # Python 3.4+; on Python 2 this was HTMLParser.HTMLParser().unescape()
# sanitized_user_input is the string that bleach returned
s = html.unescape(sanitized_user_input)
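If only blockquote markers need to survive, a narrower option is to restore &gt; only where it starts a line and leave every other entity escaped. The sketch below is one possible approach, not something bleach or misaka provides: the helper name restore_blockquote_markers is hypothetical, and it assumes its input is the already-sanitized string (for example, the return value of bleach.clean). It also assumes blockquote markers are indented by at most three spaces.

```python
import re

# One or more "&gt;" entities at the start of a line, after at most three
# spaces of indentation, each optionally followed by a single space.
_QUOTE_PREFIX = re.compile(r"^( {0,3})((?:&gt; ?)+)", re.MULTILINE)


def restore_blockquote_markers(sanitized: str) -> str:
    """Turn line-leading '&gt;' back into '>'; leave all other entities escaped."""
    return _QUOTE_PREFIX.sub(
        lambda m: m.group(1) + m.group(2).replace("&gt;", ">"),
        sanitized,
    )


if __name__ == "__main__":
    # Stand-in for sanitizer output; in practice this would come from bleach.
    escaped = "&gt; quoted line\n&gt; &gt; nested\n1 &gt; 0 and &lt;b&gt;tag&lt;/b&gt;"
    print(restore_blockquote_markers(escaped))
```

Because &lt; is never unescaped, tags typed by the user stay inert, while lines that begin with > reach the Markdown renderer as blockquotes.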