I want to send an HTML page to the web browser encoded as UTF-8. However, the following example fails:
from wsgiref.simple_server import make_server

def app(environ, start_response):
    output = "<html><body><p>Räksmörgås</p></body></html>".encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/html'),
        ('Content-Length', str(len(output))),
    ])
    return output

port = 8000
httpd = make_server('', port, app)
print("Serving on", port)
httpd.serve_forever()
Here's the traceback:
Serving on 8000
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/wsgiref/handlers.py", line 75, in run
    self.finish_response()
  File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/wsgiref/handlers.py", line 116, in finish_response
    self.write(data)
  File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/wsgiref/handlers.py", line 202, in write
    "write() argument must be a string or bytes"
If I remove the encoding and simply return the Python 3 Unicode string, the wsgiref server seems to encode it in whatever charset the browser specifies in the request header. However, I'd like to control this myself, as I doubt I can expect all WSGI servers to do the same. What should I do to return a UTF-8 encoded HTML page?
Thanks!
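You need to return the page as a list, because a WSGI application must return an iterable that yields byte strings, and iterating over a bare bytes object in Python 3 yields integers instead. It also helps to declare the charset in the Content-Type header: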
def app(environ, start_response):
    output = "<html><body><p>Räksmörgås</p></body></html>".encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/html; charset=utf-8'),
        ('Content-Length', str(len(output)))
    ])
    return [output]
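To see why the bare bytes object fails, here is a small sketch of what a server gets when it iterates over each kind of return value. The variable names are only for illustration, and this is presumably what trips the check in write() shown in the traceback:

```python
output = "<p>Räksmörgås</p>".encode('utf-8')

# Iterating over bytes directly yields integers (one per byte) in Python 3.
items_from_bytes = list(output)

# Wrapping the bytes in a list yields the single bytes object instead,
# which is something the server can write to the socket.
items_from_list = list([output])

print(type(items_from_bytes[0]))  # <class 'int'>
print(type(items_from_list[0]))   # <class 'bytes'>
```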
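WSGI expects the application to return an iterable of byte strings rather than a single string, so you can also write the application as a generator and just yield the HTML (either complete or in parts).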
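As a rough sketch of the generator variant, the app below yields the page in three encoded parts. The split into chunks and the call_app helper are my own assumptions for illustration (call_app is a hypothetical in-process harness, not part of wsgiref); only setup_testing_defaults comes from the standard library:

```python
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    # Encode every chunk up front so Content-Length counts bytes, not characters.
    chunks = [
        "<html><body>".encode('utf-8'),
        "<p>Räksmörgås</p>".encode('utf-8'),
        "</body></html>".encode('utf-8'),
    ]
    start_response('200 OK', [
        ('Content-Type', 'text/html; charset=utf-8'),
        ('Content-Length', str(sum(len(c) for c in chunks))),
    ])
    # Each yielded item must be a bytes object.
    for chunk in chunks:
        yield chunk

def call_app(application):
    """Hypothetical helper: call a WSGI app in-process, without a server."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    # Joining consumes the generator, which also triggers start_response.
    body = b"".join(application(environ, start_response))
    return captured['status'], dict(captured['headers']), body

status, headers, body = call_app(app)
print(status, headers['Content-Length'], len(body))
```

Passing the same app to make_server should behave like the list version; the only difference is that the server receives the body in several writes.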