by k0s

Looking at our code (that’s how it always starts) I was again bitten by that disgust for DRY code. Some of this is inevitable in a web framework, I think. Web frameworks are by their nature complex programs. Should a web framework handle authentication and permissions? Almost definitely. Should a web framework handle unicode and i18n and localization issues? One would hope so.

Python has a bunch of these frameworks and I think this is a good thing. What I do question is how much functionality lives in the framework that could be abstracted outside of it. What is a framework but a toolkit that you want to apply to HTTP requests and responses? Thinking that way, issues like handling unicode, HTML escaping, authentication, etc. are really library-type functions. If Bob doesn’t like pylons but likes how they do auth, that part should be pretty libraried-out (well, there’s authkit, which is actually a good example of what I consider *bad* library code).

This isn’t a magic bullet, nor do I encourage programmers to prematurely make their code library-like. I’ve been bitten by that too. But once one figures out the process and what one *really* wants to do, it’s easy to figure out the pattern: which parts are actually *part* of the web framework and which parts are better *consumed* by the web framework.

Filed May 2nd, 2008 under Uncategorized
  1. Irony alert: Speaking of i18n issues, in sage your post title reads as “Looking at our code (that&#82…”

    Apparently the html of this page includes it as a utf8 character, but somewhere on the way to spitting out the RSS feed, we turn that into an entity, which Sage (my feed reader) displays literally, maybe because it’s wrapped in a CDATA section. Entity-izing it seems unnecessary to me, because the feed looks to be proper XML with an explicit UTF-8 character encoding.

    On to your main point - I don’t disagree with anything you said but I’m having a hard time thinking of examples for some reason. I wonder what in particular you thought should be moved into a library.

    Comment by slinkp on May 2, 2008 at 7:36 pm

  2. Yeah, I too would like to hear more of your thoughts on this. As a specific question, why do you think authkit is bad library code? What about it do you disapprove of, and what do you envision a better authentication and/or authorization library would look like? What would it handle and what would it not handle? (Well, WSGI/CGI already provides a standard *place* for authentication information to go, in environ['REMOTE_USER']; and HTTP already provides a standard set of *actions* pertaining to authentication and authorization (401 and 403 responses, the authentication header) … should an auth library wrap these somehow? Does it do any more or leave it up to the user to fill in the gaps?)
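To make the question concrete, here is a minimal sketch of what "wrapping" those standard pieces could look like as plain WSGI middleware, with no framework involved. The names `require_user`, `hello`, and `protected` and the realm string are hypothetical, chosen only for illustration; the sketch assumes something earlier in the stack has already set `REMOTE_USER` for authenticated requests.

```python
def require_user(app, realm="example"):
    """Wrap a WSGI app; answer 401 unless REMOTE_USER is already set."""
    def middleware(environ, start_response):
        if environ.get("REMOTE_USER"):
            # Someone upstream authenticated the request; pass it through.
            return app(environ, start_response)
        body = b"Authentication required\n"
        start_response("401 Unauthorized", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
            # Standard HTTP challenge header asking the client to authenticate.
            ("WWW-Authenticate", 'Basic realm="%s"' % realm),
        ])
        return [body]
    return middleware


def hello(environ, start_response):
    """Hypothetical application that relies on REMOTE_USER being present."""
    body = ("Hello, %s\n" % environ["REMOTE_USER"]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


protected = require_user(hello)
```

The wrapper only reads the standard `environ` key and emits the standard response, so in principle it could sit in front of any WSGI application regardless of framework.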

    At any rate, I certainly agree with you. For a (kind of trivial) example, as you say, every web framework has to deal with HTTP requests and responses. Well, I find WebOb’s request, response and “http exception” objects to be very easy and intuitive to use. So I’d love it if all the web frameworks decided to build on WebOb as their standard provider for defining how to interact with requests and responses.

    Comment by ejucovy on May 3, 2008 at 4:38 pm

  3. Hmm, I’d love to see i18n & l10n handled in a uniform and standard way across all the python web frameworks…

    Comment by ejucovy on May 3, 2008 at 4:40 pm

  4. Heh, apologies. I had the idea for this blog post early in the week but didn’t have a chance to actually write it until Friday, and I lost track of much of what I intended to say between point A and point B. I’ll try to remedy that here.

    slinkp: yeah, the i18n issue for blog post titles is known. Perhaps more ironically, it’s the bug I was looking at when I was posting this and will start off with Monday. This is actually a good example of what I was trying to get at. Because escaping/unescaping of these characters is largely done in an ad hoc way in our software, these characters end up double-escaped here. Ideally, the format a “field” should be in on display would be defined programmatically with some sort of markers, with library code used to get it into the specified format. Does it need to be characters legal in a URI? Is it text or html with escape characters that need to be marked up? Is it safe HTML? Should it be cleaned? Are there other transforms that need to be applied?
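One way to picture "markers plus library code" is a small registry that maps a declared field format to the function that produces it. This is only a sketch under assumptions: the format names (`text`, `uri`, `safe_html`) and the `render` helper are hypothetical, and the transforms are just the standard library's `html.escape` and `urllib.parse.quote`.

```python
import html
from urllib.parse import quote

# Hypothetical format markers: each names how a field must look on output.
RENDERERS = {
    "text": html.escape,                 # plain text displayed inside HTML
    "uri": lambda s: quote(s, safe=""),  # text used as a single URI segment
    "safe_html": lambda s: s,            # markup assumed to be cleaned upstream
}


def render(value, fmt):
    """Return value transformed for the declared output format."""
    try:
        renderer = RENDERERS[fmt]
    except KeyError:
        raise ValueError("unknown field format: %r" % fmt)
    return renderer(value)
```

For example, `render("Tom & Jerry", "text")` gives `Tom &amp; Jerry`, while `render("a b/c", "uri")` gives `a%20b%2Fc`. The point is that the decision lives in one declared place rather than at each call site.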

    We largely do this on a per-case basis. In a sense, this is good, as it keeps the logic close to the data it is being applied to. But how we have done this has proved to be problematic in a few cases. As best I can tell, wicked expects unescaped link text, so we unescape text that is already escaped just so we can escape it again. Not only does this require special knowledge of wicked’s data format, it also introduces an added layer of complexity that cost a fair amount of time to fix. I would like to think that a good library for this sorta thing could have been better than rolling our own for special instances in this case.
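The double-escaping problem can be shown in a few lines, along with one possible library-style guard against it. This is a sketch, not how wicked or the blog software actually works: `Escaped` and `escape_once` are hypothetical names for a marker type that records that a string has already been escaped.

```python
import html


class Escaped(str):
    """Marker type: text that has already been HTML-escaped."""


def escape_once(text):
    """Escape text for HTML unless it is already marked as escaped."""
    if isinstance(text, Escaped):
        return text
    return Escaped(html.escape(text))


raw = "Q&A <notes>"
once = escape_once(raw)    # escaped and marked
twice = escape_once(once)  # unchanged, because the marker is honored

# Without a marker, a second pass escapes the ampersands of the first pass.
naive_twice = html.escape(html.escape(raw))
```

Here `once` and `twice` are both `Q&amp;A &lt;notes&gt;`, whereas `naive_twice` is `Q&amp;amp;A &amp;lt;notes&amp;gt;`, which is the kind of output a feed reader then displays literally.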

    Note that none of this depends on the existence of a web-framework: you just have input sources (these being form input and stored data), rules and filters that the text has to be run through, and output (being, respectively, the data store and display). A web framework probably should do this work, but IMHO this is best done by establishing the rules and filters and delegating the work to library code.

    I’m going to pick a little on authkit. Maybe this is a bit unfair, because it does fill the needs of a large section of users. Evidently, anyway. authkit provides two coupled things: user authentication, and roles and permissions. While the latter usually sits adjacent to the former, for me they are two different problems and shouldn’t really live together. Yes, it’s possible to use the authentication part or the permissions part without the other, but in practice I have found that they’re more coupled than I would like to see in code. I tried to use authkit in bitsyblog and eventually figured out how to make it work, but it would have been almost as many lines of code and more configuration than what I ended up doing, which is writing my own auth layer built on paste.auth. While my two typos resulting in bitsyblog being “hacked” (script-kiddied?) at pycon show the disadvantage here, paste.auth gave me nice flexible tools to build on in a library sorta way so I could have a nice minimalist auth layer. While I couple user management with auth, if I were to release bitsyauth separately, I would keep these apart as they’re two different functions. That way, if anyone decided to build on them, they could take one and trash the other.
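As an illustration of keeping the two concerns apart, the sketch below puts authentication and permissions in separate functions that share nothing but a username. It is not how authkit, paste.auth, or bitsyblog are implemented; every name here (`hash_password`, `authenticate`, `is_allowed`, `USERS`, `PERMISSIONS`) is hypothetical, and the fixed salt is for demonstration only.

```python
import hashlib
import hmac


def hash_password(password, salt):
    """Derive a password hash with PBKDF2 from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def authenticate(username, password, users):
    """Authentication only: return the username if the password matches."""
    record = users.get(username)
    if record is None:
        return None
    salt, expected = record
    # Constant-time comparison avoids leaking how many bytes matched.
    if hmac.compare_digest(hash_password(password, salt), expected):
        return username
    return None


def is_allowed(username, action, permissions):
    """Authorization only: knows nothing about passwords or sessions."""
    return action in permissions.get(username, frozenset())


salt = b"demo-salt"  # illustration only; real code would use a random salt per user
USERS = {"bob": (salt, hash_password("s3cret", salt))}
PERMISSIONS = {"bob": frozenset({"post", "edit"})}
```

Because neither function imports or calls the other, someone could take `authenticate` and discard `is_allowed`, or the reverse.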

    I too have enjoyed working with webob and would also like to see it as a potential provider of request/response objects for various python web frameworks.

    I guess all I am really saying doesn’t amount to much except the same refrain that we all know but that is much easier to talk about than to do, until you actually start doing it. Your program has some need. Probably it’s easiest to just solve the problem in your code initially, assuming that some well-known module doesn’t do what you want. But once this works, it’s usually easy to move this functionality out into your own library, even if it’s only used internally. At this point, if you see someone else’s library that does what you want and better, you can swap out your library for theirs; assuming the API is sane, the interface will probably look much the same (after all, saying e.g. unescape(html) has basically the same interface no matter how you write it). Likewise for other people’s libraries: if they have isolated functionality, it should be easy both to swap in and out as well as contribute improvements to the code because the intention of what the software should do is quite clear. If what the software does is complex or ill-defined, then this is more difficult to do.
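The unescape(html) case can be sketched directly. Below, `homegrown_unescape` is a hypothetical in-house helper that handles only a few named entities and decimal character references; the swap target is the standard library's `html.unescape`, which takes the same single string argument.

```python
import html
import re

# The handful of named entities this home-grown helper knows about.
_ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}


def homegrown_unescape(text):
    """First-pass helper written inside the application."""
    def replace(match):
        name = match.group(1)
        if name.startswith("#"):
            # Decimal character reference such as &#8217;
            return chr(int(name[1:]))
        # Leave unknown entities untouched.
        return _ENTITIES.get(name, match.group(0))
    return re.sub(r"&(#\d+|\w+);", replace, text)


# Callers depend only on this name, so swapping implementations is one line.
unescape = homegrown_unescape
# unescape = html.unescape  # drop-in replacement with the same interface
```

For an input like `that&#8217;s &amp; more`, both implementations return `that’s & more`, so callers would not need to change when the home-grown version is retired.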

    Comment by k0s on May 4, 2008 at 3:48 pm

  5. I guess another way of looking at this is whether you’d build a piece of software from components, or utilize and configure one piece of software that does everything. Personally, I’d rather have my software be the wires that connect various software components rather than having my software be an extension that depends on a certain framework. This is a pretty black and white statement, and shouldn’t be taken too literally, but I find it easier and more understandable to build something than to take an existing structure and force it to do what I want. Realities make one or the other more attractive for certain problems, but to me the “building” (and implied “componentizing”) is the sort of thing that makes python look strong. I’d be so bold as to say that complex monoliths that solve complex problems are unpythonic.

    Comment by k0s on May 4, 2008 at 4:53 pm
