WSGI 2.0 Round 2

Cory Benfield has started a thread on the Python Web-SIG list: WSGI 2.0 Round 2: requirements and call for interest. This is a lightly edited version of my response.

TL;DR: What do you believe WSGI 2.0 should and should not do? Should we do it at all?

TL;DR: WSGI itself should have some light cleanups and bug fixes and have de-facto behaviors formalized and then be blessed as the treasure that it is. A new something-else-that-is-not-WSGI should be prepared that addresses modern protocols and programming paradigms.

I should disclaim myself by saying that I entered into web programming prior to the existence of Perl's, so I have a biased mental model about how the web™ ought to work that is not well aligned with current practices. The experience greatly informs my thinking on this stuff, not necessarily in a good way.

To me the best thing about WSGI, at least in the early days, was how it created an accessible environment for doing web stuff. It was so successful that many other languages felt left out and made their own similar interfaces. Like most successful technologies, what made it a success was not the capabilities it provided but the constraints it imposed. To the application developer, WSGI is relatively simple.

For old school web stuff this works great: get some headers, get some request body, do some work, send some headers, send a response body. That's nice. If you're really feeling fancy you can nest another app in the "do some work" part. That's nice too.
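As a rough illustration of that flow, here is a minimal sketch using nothing but the standard library. The names hello_app, dispatch_app and call_app are made up for this example, and call_app is only a stand-in for a real server so the apps can be exercised directly.

```python
import io


def hello_app(environ, start_response):
    """Get headers, get a body, do some work, send headers, send a body."""
    # Get some headers and some request body from the environ.
    name = environ.get("HTTP_X_NAME", "world")
    length = int(environ.get("CONTENT_LENGTH") or 0)
    request_body = environ["wsgi.input"].read(length)

    # Do some work.
    text = f"hello {name}, you sent {len(request_body)} bytes\n"
    body = text.encode("utf-8")

    # Send some headers, then the response body.
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def dispatch_app(environ, start_response):
    """Nest another app in the 'do some work' part."""
    if environ.get("PATH_INFO", "").startswith("/hello"):
        return hello_app(environ, start_response)
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found\n"]


def call_app(app, path="/", headers=None, body=b""):
    """Hypothetical helper: call a WSGI app without a server."""
    environ = {
        "REQUEST_METHOD": "POST" if body else "GET",
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    environ.update(headers or {})
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)
```

The nesting works because the inner and outer apps share the same two-argument signature; the outer one just decides whether to hand over environ and start_response untouched.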

There's a temptation with the advent of new technologies and practices to forget the value of constraints and want to provide knobs for all the cool stuff. While I think we need to address the new use cases, we need to be careful to keep it tidy.

Keeping things tidy is going to be hard. The request-then-response handling in most of WSGI is tidy because it models itself as a gateway; there's a clear boundary. If we want the benefits of HTTP/2, WebSockets and async programming models, the gateway model isn't going to apply.

So what to do? I think we need to do two separate things. One is to fix WSGI in some lightweight fashion. The other is to create something new that supports the new stuff but learns from what WSGI has done well (perhaps more in terms of constraints and social context than specific techniques).

As someone who writes their WSGI applications as functions that take environ and start_response and doesn't bother with much framework, the things I would like to see in a minor revision to WSGI are:

  • A consistent way to access the raw un-decoded request URI. This is so I can reconstruct a realistic PATH_INFO that has not been subjected to destructive handling by the server (e.g. Apache messing with %2F) before continuing on to a route dispatcher.

  • More consistent guidelines on string handling in headers. I'm aware of the discussion around ISO-8859-1 and I think it is just wrong. At the level of my simple application the header values should either be strings (unicode) or UTF-8 encoded bytes. Nothing else. If somebody sends the wrong thing it is their fault, not mine, they should get the pain not me. I realize that this is not entirely a WSGI issue, but it is something I'd like WSGI to help me with.
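To make those two wishes concrete, here is a sketch of the workarounds an application ends up writing today. It assumes the server exposes the raw URI under a non-standard environ key such as RAW_URI or REQUEST_URI; which key exists, if any, varies by server, and that inconsistency is the complaint. The function names are invented for the example.

```python
from urllib.parse import unquote

# Assumption: these keys are server-specific extensions, not part of WSGI.
# Some servers set one of them, some set neither.
RAW_URI_KEYS = ("RAW_URI", "REQUEST_URI")


def raw_path(environ):
    """Return the still-encoded request path, or None if unavailable."""
    for key in RAW_URI_KEYS:
        raw = environ.get(key)
        if raw:
            # Drop the query string; keep escapes such as %2F intact.
            return raw.split("?", 1)[0]
    return None


def path_segments(environ):
    """Split on literal slashes first, then decode each segment."""
    path = raw_path(environ)
    if path is None:
        # Fall back to the already-decoded PATH_INFO; any %2F is lost.
        path_info = environ.get("PATH_INFO", "")
        return [s for s in path_info.split("/") if s]
    return [unquote(s) for s in path.split("/") if s]


def header_as_utf8(environ, key):
    """Reinterpret a header value as UTF-8.

    Under PEP 3333 on Python 3, header values are str objects produced by
    decoding the wire bytes as ISO-8859-1, so encoding with that codec
    recovers the original bytes. Invalid UTF-8 raises UnicodeDecodeError.
    """
    value = environ.get(key)
    if value is None:
        return None
    return value.encode("iso-8859-1").decode("utf-8")
```

With the raw path, a segment like a%2Fb survives as the single segment "a/b" instead of being split in two. The UnicodeDecodeError from header_as_utf8 is where the sender of a badly encoded header would get the pain, for example as a 400 response.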

For WSGI, that's probably enough (for me). It has been and will continue to be a very useful tool for many years to come. For "simple" web apps it is great and we're still going to want and need those for a long time.

For applications that want to take advantage of asynchrony, we should make something new, probably coroutine-based. I'm insufficiently up on the state of the art these days to go into too much detail but some things I think worth considering are:

  • Cory gave a presentation at PyCon UK about layering tools like an onion. Requests is the canonical example: requests over urllib3 over http libraries over socket libraries. I suspect a modern web services interface is going to need to provide ways to escape down to lower layers to satisfy all use cases but will need to have a fairly simple top layer to have any adoption. This is something that will need to be built into the specification, otherwise each implementation will come up with its own way and the portability and server independence that was key to WSGI's success will not be there.

  • Avoid complexity with regard to byte and string handling by being Python 3 only. This also avoids complexity with regard to asyncio. It's basically a statement of "if you want to do the new stuff, you got to use the new stuff". I think that's a fair requirement.

  • I like, very much, that the interface to WSGI is a callable with a very simple signature and response. I'd be disappointed if the signature of the new thing was something other than primitives.
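Purely as a thought experiment, a coroutine-based callable that sticks to primitives might look something like the sketch below. Everything here is hypothetical: the (request, read_body) signature, the shape of the request dict and the returned tuple are invented for illustration and are not any existing or proposed specification. fake_server is only a stand-in so the example runs.

```python
import asyncio


async def app(request, read_body):
    """Hypothetical coroutine app: primitives in, primitives out.

    `request` is a plain dict, `read_body` is an awaitable callable that
    returns bytes, and the result is a (status, headers, body) tuple.
    """
    body = await read_body()
    text = f"{request['method']} {request['path']} sent {len(body)} bytes\n"
    payload = text.encode("utf-8")
    headers = [
        ("content-type", "text/plain; charset=utf-8"),
        ("content-length", str(len(payload))),
    ]
    return 200, headers, payload


async def fake_server(app, method, path, body=b""):
    """Stand-in for a server: build the request and await the app."""
    async def read_body():
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return body

    request = {"method": method, "path": path, "headers": []}
    return await app(request, read_body)


status, headers, payload = asyncio.run(
    fake_server(app, "POST", "/hello", b"abc")
)
```

The point is only that dicts, tuples, bytes and strings are enough at the top layer; how to escape down to lower layers is the harder design question and is not addressed here.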

Thanks for getting this ball rolling, again. It can be challenging when it has been visited so many times. I think making a clean break to create something new, while still acknowledging the continued usefulness of the old, is the way to go.

© Chris Dent.