pythonpython-3.xwsgiwsgiref

How do I encode WSGI output in UTF-8?


I want to send an HTML page to the web browser encoded as UTF-8. However the following example fails:

from wsgiref.simple_server import make_server

def app(environ, start_response):
    output = "<html><body><p>Räksmörgås</p></body></html>".encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/html'),
        ('Content-Length', str(len(output))),
    ])
    return output

port = 8000
httpd = make_server('', port, app)
print("Serving on", port)
httpd.serve_forever()

Here's the traceback:

Serving on 8000
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/wsgiref/handlers.py", line 75, in run
    self.finish_response()
  File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/wsgiref/handlers.py", line 116, in finish_response
    self.write(data)
  File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/wsgiref/handlers.py", line 202, in write
    "write() argument must be a string or bytes"

If I remove the encoding and simply return the python 3 unicode string, the wsgiref server seems to encode in whatever charset the browser specifies in the request header. However I'd like to have this control myself as I doubt I can expect all WSGI servers to do the same. What should I do to return a UTF-8 encoded HTML page?

Thanks!


Solution

  • You need to return the page as a list:

    def app(environ, start_response):
        output = "<html><body><p>Räksmörgås</p></body></html>".encode('utf-8')
        start_response('200 OK', [
            ('Content-Type', 'text/html; charset=utf-8'),
            ('Content-Length', str(len(output)))
        ])
    
        return [output]
    

    WSGI is designed that way so that you could just yield the HTML (either complete or in parts).