by novalis

I did a bit more work on the Python optimizer last week. This time, the problem was tuple assignment. Consider the code:

a, (b, c) = d, e = 1, (2, 3)

This would get translated to:

LOAD_CONST 1
LOAD_CONST (2, 3)
BUILD_TUPLE 2 # builds the tuple (1, (2, 3))
DUP_TOP       # duplicates the top item on the stack, since
              # it's going to be assigned to two targets
UNPACK_SEQUENCE 2 #now, push each element of the tuple we just
                  #built onto the stack
STORE_FAST a
UNPACK_SEQUENCE 2 #unpack (2, 3)
STORE_FAST b
STORE_FAST c
UNPACK_SEQUENCE 2 #unpack the duplicate
STORE_FAST d
STORE_FAST e

This is stupid, because it does a bunch of packing and unpacking of tuples. Tom Lee’s patch improves the situation by recognizing (1, (2, 3)) as a constant, and storing it in the constants table so that it does not need to be created on the fly. In other words, it replaces the first three operations of the above with:

LOAD_CONST (1, (2, 3))

This is much faster, but it’s still not optimally fast. The above Python code is equivalent to:

a = 1
d = 1
b, c = e = 2, 3

This produces the following bytecode:

LOAD_CONST 1
STORE_FAST a
LOAD_CONST 1
STORE_FAST d
LOAD_CONST (2, 3)
DUP_TOP
UNPACK_SEQUENCE 2
STORE_FAST b
STORE_FAST c
STORE_FAST e


My latest patch to the optimizer does this conversion. At a slight space cost, the sequence could be further reduced by replacing the DUP_TOP UNPACK_SEQUENCE with a LOAD_CONST for each element. I decided not to do this, because not everyone will want to make time-space tradeoffs, and I’m not 100% sure it would be faster.
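As a sanity check, the equivalence claimed above can be confirmed from Python itself. This is only a sketch and not part of the patch: the function names chained and split are made up for illustration, and the opcodes that dis prints depend on the interpreter version, so they will not match the 2008 listings exactly.

```python
import dis


def chained():
    # The original one-line form.
    a, (b, c) = d, e = 1, (2, 3)
    return a, b, c, d, e


def split():
    # The hand-expanded form the optimizer aims for.
    a = 1
    d = 1
    b, c = e = 2, 3
    return a, b, c, d, e


# Both forms bind the same values to the same names.
print(chained() == split())

# Print whatever bytecode the running interpreter generates for each form.
dis.dis(chained)
dis.dis(split)
```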

 

Filed November 18th, 2008 under Uncategorized

I spent some time over the past few weeks looking into the internals of the Python compiler and bytecode interpreter.

First, some general impressions. The code is very clean C code. It actually uses Python objects internally for things that are annoying to safely represent in C, such as strings, vectors (Python lists), and hash tables (Python dicts). This means that readers can piggyback on their existing understanding of Python. The bytecode interpreter is also straightforward and readable.

The Python virtual machine is stack-based. Python bytecode is simply a list of operations with either zero or one sixteen-bit operand, depending on the operation. For example, a common section of code is:

...
LOAD_FAST 0
GET_ITER
FOR_ITER
STORE_FAST 1
...

LOAD_FAST and STORE_FAST give access to local variables. Every function has an array of locals, and the operand of LOAD_FAST/STORE_FAST is an index into that array. LOAD_FAST puts the value of the local onto the stack, and STORE_FAST pops a value off the stack and puts it into the local. GET_ITER gets an iterator for an iterable — it replaces the top of the stack with the iterator. FOR_ITER takes the iterator on top of the stack and pushes its next value onto the stack. You’re expected to pop it — the next operation is nearly always STORE_FAST. In fact, the Python bytecode interpreter has a predictor which quickly checks if the next operation is STORE_FAST, and skips some tests if so. So that snippet of code is the preamble of a for loop. Local variable 0 is the iterable, and local variable 1 is the iteration variable.

If you want to see the bytecode for a function, the dis module will print it. Unfortunately, that’s all it will do — it won’t give it to you in a manipulable form. If you want to write your own functions in Python bytecode, you need peak.util.assembler, which does not in any way interoperate with dis. To check how Python interprets a given opcode, check in the Python source code, Python/ceval.c.
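On newer interpreters the situation is a little better: since Python 3.4 the dis module also provides dis.get_instructions, which yields instruction objects instead of printing text. A small sketch, where total is a made-up example function; opcode names vary between versions, so treat the printed list as illustrative only.

```python
import dis


def total(items):
    # A plain for loop, so the loop preamble shows up in the bytecode.
    result = 0
    for item in items:
        result += item
    return result


# Each Instruction carries the opcode name, its argument, and its offset.
instructions = list(dis.get_instructions(total))
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
```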

The Python compiler works in a fairly standard fashion — the source code is tokenized, the tokens become a concrete syntax tree, the CST becomes an abstract syntax tree, and the AST is compiled to bytecode. I started looking at the peephole optimizer, which operates on compiled bytecode (that is, an array of chars). I implemented a minor optimization for the case:

r = ...
return r

The bytecode looked like:

STORE_FAST n
LOAD_FAST n
RETURN_VALUE

My patch replaced it with just:

RETURN_VALUE

My patch was rejected, because apparently the additional six lines of code didn’t buy enough of a speed improvement for an uncommon case. In retrospect, however, it also didn’t work for all bytecode (although it may have worked for all bytecode produced by the Python compiler). Imagine if there were a jump to the LOAD_FAST from somewhere else in the code — for example, in the code

if x:
  r = 1
else:
  r = 2
return r

The last seven bytes of the bytecode would still initially look like:

...
STORE_FAST n
LOAD_FAST n
RETURN_VALUE

But if the store and load were removed, then the jump would lead directly to the return, which would result in a stack underflow.
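The failure is easy to reproduce on a toy stack machine. The interpreter below is purely illustrative: it borrows CPython's opcode names but is not CPython, the JUMP opcode and instruction-index operands are simplifications, and the two programs are hand-written to mirror the if/else example.

```python
def run(program, x):
    """Interpret a toy stack program whose opcode names mimic CPython's."""
    stack, local_vars, pc = [], {"x": x}, 0
    while True:
        op, arg = program[pc]
        pc += 1
        if op == "LOAD_CONST":
            stack.append(arg)
        elif op == "LOAD_FAST":
            stack.append(local_vars[arg])
        elif op == "STORE_FAST":
            local_vars[arg] = stack.pop()
        elif op == "POP_JUMP_IF_FALSE":
            if not stack.pop():
                pc = arg
        elif op == "JUMP":
            pc = arg
        elif op == "RETURN_VALUE":
            return stack.pop()  # IndexError on an empty stack = underflow


original = [
    ("LOAD_FAST", "x"),        # 0
    ("POP_JUMP_IF_FALSE", 5),  # 1
    ("LOAD_CONST", 1),         # 2
    ("STORE_FAST", "r"),       # 3
    ("JUMP", 7),               # 4: lands directly on the LOAD_FAST
    ("LOAD_CONST", 2),         # 5
    ("STORE_FAST", "r"),       # 6
    ("LOAD_FAST", "r"),        # 7
    ("RETURN_VALUE", None),    # 8
]

# Naive peephole result: drop instructions 6 and 7, retarget the jump.
broken = original[:4] + [
    ("JUMP", 6),
    ("LOAD_CONST", 2),
    ("RETURN_VALUE", None),
]

print(run(original, True), run(original, False))
print(run(broken, False))
try:
    run(broken, True)
except IndexError:
    print("stack underflow on the 'if' branch")
```

The else branch still works after the rewrite because it falls through with a value on the stack; only the branch that jumps in arrives with an empty stack.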

This neatly illustrates why doing anything on the bytecode level is a huge hassle. I went through a number of schemes to try to eliminate unnecessary store/load pairs at the bytecode level, but trying to figure out variable lifespans in the presence of forward and backwards jumps is not easy.

I decided instead to take a suggestion made when I submitted my (broken) patch, and work at the AST level. Thomas Lee has been working on an AST-level optimizer. He wrote a set of functions that walk the AST and perform optimizations at each level. I also changed my focus, because I was a bit bored of thinking about local variables, and I wanted something easy for my first look at high-level C code in a long time.

Python bytecode includes at least one operation which is not directly accessible from Python code: LIST_APPEND, which pops a value and a list off the stack and appends the value to the list. This is used in generating code for list comprehensions:

[x for x in somelist]

generates the following code for the body of the loop:

...
LOAD_FAST 1 #a local is generated to hold the new list
LOAD_FAST 2 #the iteration variable, x
LIST_APPEND
...

Whereas if you were to try to write the comprehension with a for loop, you would get:

LOAD_FAST 1
LOAD_ATTR 0 #an index into the names table (co_names): "append"
LOAD_FAST 2
CALL_FUNCTION 1 #remove x and the append function from the top of the stack and push the result of the call
POP_TOP

This is much slower. In part, it’s the cost of doing the attribute lookup each time, which is why it’s common to see code like

a = []
ap = a.append
for x in .....:
  ap(x + ...)
return a

But the function call is naturally slower than the simpler LIST_APPEND operation. My new patch tries to determine statically if a local in all circumstances holds a list, and if so, generates LIST_APPEND whenever user code would call local.append. It’s not perfect — in particular, I could track local types on a more fine-grained level than per-function. But it’s a start. I submitted it today.
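To compare the three styles on your own interpreter, something like the following works. It is a rough sketch: the function names and the x + 1 body are invented for the example, and the relative timings depend heavily on the Python version and the machine, so no particular result is implied.

```python
import timeit


def with_comprehension(items):
    return [x + 1 for x in items]


def with_append(items):
    a = []
    for x in items:
        a.append(x + 1)  # attribute lookup plus a call on every iteration
    return a


def with_cached_append(items):
    a = []
    ap = a.append  # look the bound method up once, outside the loop
    for x in items:
        ap(x + 1)
    return a


data = list(range(1000))

# All three build the same list; only the generated bytecode differs.
print(with_comprehension(data) == with_append(data) == with_cached_append(data))

for fn in (with_comprehension, with_append, with_cached_append):
    seconds = timeit.timeit(lambda: fn(data), number=200)
    print(fn.__name__, round(seconds, 4))
```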

Next, I think I’ll consider local elimination again — but at the AST level.

Filed November 3rd, 2008 under Uncategorized

The other day I saw that Rollie had tagged http://www.kottke.org/08/09/tech-conference-panels-suck for inclusion in the PlanetDev feed.

I found that seeing it there and reading it really upset me.  Jackie saw me get upset and helped me realize that there were actually some broadly useful points to share about why it did and what this means in a greater context, so at her suggestion I’m going to try to unpack them here.

The referenced post — and the post that *it* references — is basically contentless.  Some popular blogger is linking to someone’s writeup of their personal impressions of a single panel at a tech conference.  He’s not adding any particular commentary of his own so presumably it’s an approving reference.  So for the essential content we go to the  original reference itself.  That post is equally contentless: as Kottke correctly observes with his summary headline, the post is meant to put forth a general theory about technical conferences and the “geeks” who organize and attend them  via a proof by anecdote.  It says: “I attended this one event, I felt this way about it, therefore I can offer general observations on a whole class of events and people.”  Well, we all know how rigorous that line of argument is.

Just for fun, though, let’s take it to its logical conclusion.  What, actually, is being put forth here?  That panels at tech conferences suck — because some guy on the internet went to one conference panel with an absurdly broad topic and didn’t get anything out of it except a lousy blog post — and so, I suppose, we should conclude that we shouldn’t be spending our time, energy and money on tech conferences?  Is this really a conversation worth having?  Of course there are good conferences and less valuable conferences in every field and discipline, from geology to nursing to quilting.  And at each conference, the individual sessions and panels will have a whole range of quality and outcomes.  To draw any conclusions about the broad category of tech conferences from a single failed panel at a single event is ignoring this reality and generalizing to what should be an obvious point of absurdity.  Come on.

So, essentially, it feels like this tagged item simply doesn’t contribute anything to a conversation; it’s a dead-end post with no information to impart and no worthwhile lessons to be drawn.  Its appearance on PlanetDev is an invasion of our communal space, and we all individually waste our time discovering its lack of value.

The world is full of those little distractions, though, and while I don’t like them, I don’t usually get too emotional when I see a Google Sponsored Link.  But of course this post’s content is not just unproductive; it’s unabashedly, gleefully insulting, playing to offensive stereotypes.  Har, har, tech “geeks” have no social skills, have no grounding in reality, get excited about techno-fantastical topics, aren’t good at explaining themselves.  Oh, and let’s mock their various supposed developmental disorders and drug indulgences!

This type of content sort of bothers me personally, but it points to something broader than just that.  While I’m sure this was totally unintentional, by putting this on PlanetDev Rollie effectively just pushed this type of stereotyping and mockery out into our group.  And, honestly, I really hope that we’re better than that, that we can create a culture of respect and collaboration here at TOPP, a safe space where no one will be mocked by a coworker for his interests or through implied association with a stereotype.  I can’t imagine anyone here is interested in descending into petty warring tribes based on our job descriptions.  So, please, let’s not get into the business of trading insults between designers and engineers, or any other “subgroups” at TOPP; let’s be respectful of each other and of everyone’s individual skills and interests, let’s work together without rivalry, and let’s respect our public spaces.

Filed September 30th, 2008 under Uncategorized

Note: since this was written I have renamed pyinstall to pip.

Have you ever been frustrated by easy_install? Yeah, me neither. But I have heard whispers of discontent.

So I’m introducing a new tool for the installation of Python packages: pyinstall.

pyinstall is mostly easy_install compatible. That means it finds distributions in the same way as easy_install and it installs packages via setuptools. If you are familiar with easy_install you’ll know how to use pyinstall right away. This is not a repudiation of the mechanisms of easy_install, but a refinement. What does it refine?

First, it is a more user friendly easy_install. It collects everything it needs before it installs anything. The console output is intended to be concise and helpful. It says *why* it is doing things, tracking the set of dependencies that led to the installation of each package, and some of how it found the packages. It knows a little more about Subversion than easy_install, and I plan to add native support for other version control systems directly to it (this is easier now than it would have been a few years ago, since it seems there’s a clear and finite set of viable version control systems).

It also has some added features that I think are important for version management. Specifically the features of its predecessor PoachEggs, which you may not have heard of - but pyinstall at this point is at least a third-generation implementation of these ideas, if not more. You can install a set of packages from a “requirements file”, which really is little more than a list of packages to install. This is a seemingly small improvement, but the idea is to move requirements out of setup.py and into something that is managed separately from any library or package. By separating out this management you can control the application environments without having to touch the applications or libraries themselves. In addition you can generate new requirements files from a working installation: just run pyinstall --freeze=requirements.txt and it will write out a file that can be used to install the exact version of everything that is installed. One of the most common complaints about easy_install once you get second-order dependencies is that you can’t easily reproduce working environments, and this fixes that.

Another feature is a “pybundle”, which is just an archive of a set of libraries. Here’s something you can try:

$ virtualenv my-app
$ cd my-app
$ source bin/activate
... set up my-app environment just how you want it ...
$ pyinstall.py --freeze=my-app-req.txt
$ pyinstall.py --bundle=my-app.pybundle -r my-app-req.txt
... then on another machine ...
$ pyinstall.py my-app.pybundle

The pybundle format is pretty simple: it’s a lot of source files in one zip file. It isn’t a binary package format at all, it’s just source, and all it saves you is the finding and downloading of packages. But I think that simple process is a big part of why no-dependency libraries and frameworks are sought after (not to say that’s the only reason).

Despite the many rewrites preceding this, pyinstall is still very young — but for the most part if you get something installed, then it worked, and if it didn’t get something installed then you can always fall back on easy_install (and submit a bug report).  If you have suggestions, also submit a ticket or ask about it on the virtualenv list.

To get started: easy_install pyinstall or just grab the single-file executable.

Filed September 24th, 2008 under Uncategorized

I’m about to head off on vacation, so this seemed like as good a time as any to kick this out of my drafts folder… 

As some of you know, I’ve been doing the brainpower project as a Django admin application.

The reasons for this decision were:

  • Django admin is touted as a very quick way to build CRUD applications, since it generates a UI from your model that in many cases is good enough for end users. No forms to write, maybe just a little tweaking. Brainpower is nothing other than a simple CRUD application, so this sounded like a perfect match.
  • Good excuse to learn a little about Django.
  • Get me to do something other than Zope for once in my freakin’ life.


So how did that pan out?

Well, in one month since our first requirements-gathering meeting with our “customer” (Liz and Robin from Streetfilms), in addition to all the other stuff I’ve been doing, I built something they said was good enough to start using. I did it in really just a week or so of work, alone, from scratch, with almost zero advance knowledge of Django. I even spent some time testing and tuning performance (just enough that I feel confident we won’t ever have a problem with it). This also includes a full suite of Flunc tests; a random content generation script that I used for the performance tests; and a build script for development & deployment using Fassembler.

The core code is tiny: the bulk of it is in a 250-line models.py module that is reasonably clean.

As usual, writing the core code is only a small part of the story. A large portion of my time went to things like figuring out how to conveniently run external functional tests against Django with a scratch database, writing and fixing the build script, and troubleshooting my initial attempts at deployment (tripping on a Django bug).

I do have some general early impressions of Django.

Things I liked

  • The admin interface is, indeed, pretty slick (with a few minor oddnesses like a pretty useless Time widget).
  • If I had another little CRUD app to build that seemed well-matched for Django, I could probably do it ridiculously fast.
  • The Django docs are pretty decent for the most part, much better than the current state of, say, Zope 2 docs, and more extensive and thorough than the Pylons docs. (Too bad I had to quickly un-learn a bunch of stuff when I had to switch to developing against more recent django checkouts.)
  • The stable release’s way to create an admin UI — by writing an inner class named Admin inside your Model — smelled really bad to me. Thankfully, this is gone in the newforms-admin branch. Similarly, you used to wire up custom validators inside your Model class and do cleanup in its save() method; now you can do custom validation in a ModelForm subclass, and you can do data cleanup in the same place. Newforms-admin is pretty nice!
  • Got a multiple choice field that you need users to be able to extend with new choices?  Just add a foreign key field to your model referencing another model, and it just plain works in exactly the way you’d hope.  Slick!

Gripes

  • I wish they had used an existing template language; I don’t see anything compellingly different about Django’s.
  • I wish Django’s setup.py had a “develop” command.
  • I wish the tools for syncing your models to your database were more developed. You seem to be entirely on your own for migrations. For example, if you modify the type of a field in a Model, you’ll likely need to either drop your tables and recreate them, or if you have production data to preserve, either do a dump-modify-restore cycle or hand-hack the database in place to keep your app working. At one point I ended up needing to do a dump-modify-restore using xml exports and lxml transforms; it took longer than I would have liked and I might try another strategy next time. For other model modifications such as adding a field, you might be able to get away with re-running manage.py syncdb; unfortunately I don’t know how to predict when this will just work and when it won’t.
  • I wish all Django manage.py operations could be performed non-interactively. For example, for my flunc tests I wanted to reliably create and destroy a scratch database with a test admin user. Django provides a way to do most of this, but the commands that create an admin user must be run interactively. I tried using a database fixture, but that didn’t work reliably — I’m wildly guessing there’s some salt that gets reset and I don’t know when? I ended up having my test script use pexpect to drive the interactive commands.
  • It seems I arrived at an inconvenient time. Django 0.96.2 may be the “latest stable release”, but it’s aging. The developers are all focused on a “newforms-admin” branch, which is cleaner and more extensible, but this work hasn’t landed on trunk yet. I don’t know when that will happen, or what else will change before another stable release. I opted to develop against the old stable release. I thought I was being smartly conservative :-p Unfortunately I soon hit a really irritating bug. While trying to understand the admin code enough to find a workaround (and feeling like I was on a familiar learning curve), I was advised to check if trunk still had the same problem, which meant an hour or two adjusting my code to work against trunk. Trunk turned out to fix only half of the problem. Then a week later the bug got fully fixed… on the newforms-admin branch. So much for trying to stick to a stable release! Maybe I was just unlucky. If I didn’t need to search fields from a join, I wouldn’t have hit this bug. (But isn’t that a common need?)
  • Customizing and extending the admin UI is commonly done in a way that also seems vaguely familiar: You just make a copy of the thing you want to modify and hack away on it. I’ve already had a hint of pain with this — my two modified template copies broke on both Django upgrades. That’s how things typically go in the Plone world: Every single skin override you did on a Plone site would add a drop of future pain to every Plone upgrade you ever tried to do.

    This is a hard problem to solve. Just as Plone did, Django admin seems to be gradually adding more plug-in extension points so you don’t have to override the core templates as often; instead you just flip some configuration switch and/or add another template that magically gets slurped into the right place. Which has its own headaches, as every one of those bits of flexibility adds to the learning curve.  Let’s hope Greenspun’s Tenth doesn’t grow a corollary substituting Plone for Lisp :-]

Filed August 12th, 2008 under Uncategorized

Last week I attended the Chicago Google App Engine Hack-A-Thon, which was a small day-long event to hang out at the Chicago Google offices and work on things related to App Engine.

I was hoping for something structured more like a coding sprint, where people would work together on small projects. I think some other people were also expecting something like this, but the day kind of wandered by without any clear “let’s start doing stuff together” moment, and the opportunity was lost. Some people followed a tutorial given by Marzia Niccolai, some worked on their own projects, and there was a bit of chatting.

I started working on a simple project I’d started to make a very (very) simple CMS/Wiki. After playing with Javascript for a little while I was chatting about wikis and MoinMoin came up. After a quick look at the code I realized it was very tightly bound to the filesystem, and lacked even the most basic abstraction layer for storage. MoinMoin should have an abstraction layer, but I wasn’t really inclined to work on that. Instead I thought: why not try a fake writable filesystem?

I knew from the start that the idea was a bit absurd. The App Engine data store isn’t much like a filesystem. Files are quick to read and write, okay for scanning, hard to query. The data store is okay to read and write, very slow to scan, and easy to query. But I figured there was some hack value to it.

I don’t know that I really accomplished my goal, but I did get some code in appengine-monkey. It kind of worked, but my strategy was probably wrong: I simulated an entire filesystem, except those places that were mingled with code (where templates would typically be stored). Instead I should have required explicit locations that would be handled by the data store (e.g., /wiki-data). Python doesn’t expect file operations to lead to a lot of Python routines, and there were some circular situations deep in the Python code as a result. I seemed to mostly work those out, but didn’t actually get MoinMoin to render anything, I only got it to work very hard on the slow process of setting up its files.
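The "explicit locations" alternative might look roughly like this. It is a hypothetical sketch, not the appengine-monkey code: a plain dict stands in for the data store, the names STORE, PREFIX, StoreFile and routed_open are invented, and only text modes "r", "w" and "a" are handled.

```python
import io

STORE = {}               # stand-in for the App Engine data store (assumption)
PREFIX = "/wiki-data/"   # only paths under this prefix are virtualized


class StoreFile(io.StringIO):
    """In-memory file whose contents are saved to STORE when it is closed."""

    def __init__(self, path, mode):
        initial = STORE.get(path, "") if mode in ("r", "a") else ""
        super().__init__(initial)
        self._path = path
        self._writable = mode != "r"
        if mode == "a":
            self.seek(0, io.SEEK_END)

    def close(self):
        if self._writable and not self.closed:
            STORE[self._path] = self.getvalue()
        super().close()


def routed_open(path, mode="r"):
    """Send paths under PREFIX to the store; everything else to the real open()."""
    if path.startswith(PREFIX):
        if mode == "r" and path not in STORE:
            raise FileNotFoundError(path)
        return StoreFile(path, mode)
    return open(path, mode)


with routed_open("/wiki-data/FrontPage", "w") as f:
    f.write("hello wiki")

with routed_open("/wiki-data/FrontPage") as f:
    print(f.read())
```

Because only one prefix is intercepted, reads of code and templates elsewhere never pass through the Python-level shim, which is what would avoid the circular situations described above.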

I’ve been a little annoyed with the App Engine environment from the start, because it left out lots of routines that are present in every other environment (e.g., os.utime). You can’t use these routines in App Engine because there is no file writing. But it would all be more sensible if every call to these routines just raised a permission error. This is the kind of error that existing code understands. Instead you can’t even import the routines, making App Engine incompatible with lots of existing code. That would be fine except the incompatibility is so trivial. appengine-monkey seeks to relieve some of this, but it would be much simpler if it was just there in the platform to start with.
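The behaviour I'd prefer is simple to express. The helper below is a hypothetical illustration rather than what appengine-monkey actually does: install_stub is an invented name, and a throwaway module object stands in for a restricted os module.

```python
import errno
import os
import types


def install_stub(module, name):
    """If module lacks name, add a stub that raises a permission error."""
    if hasattr(module, name):
        return False

    def stub(*args, **kwargs):
        raise OSError(errno.EACCES, "%s is not permitted here" % name)

    stub.__name__ = name
    setattr(module, name, stub)
    return True


# Demonstrate on a fake module so the real os module is left untouched.
fake_os = types.ModuleType("fake_os")
install_stub(fake_os, "utime")

try:
    fake_os.utime("/some/path", None)
except OSError as exc:
    # Existing code already knows how to handle this kind of failure.
    print("caught:", exc.errno == errno.EACCES)
```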

Conclusion? Porting code is hard, but porting old, organically developed code like MoinMoin is really hard. And App Engine still needs a good wiki.

Filed August 5th, 2008 under Uncategorized

The Promise

In preparation for a ramp-up in testing the OpenPlans Plone 3 upgrade efforts, I’ve been revisiting OSAF’s Windmill functional testing tool. The story is compelling; if you’re going to be launching a browser and clicking around to test a site anyway, why not turn on a recorder that will auto-generate test suites that can reproduce your actions? Even if it ends up making the original testing take a little while longer, that pays for itself the first time you re-run the tests.

With something concrete to accomplish, then, I sat down to see if Windmill delivers on this promise.

Bootstrapping

Getting started went pretty smoothly. Installing was easy, on Ubuntu Gutsy, anyway; just regular setuptools stuff. I’m working from the Windmill trunk, so I created and activated a Python 2.5 virtualenv, checked the code out from http://svn.osafoundation.org/windmill/trunk/, and ran ‘python setup.py develop’ to install it into the environment.

Windmill is then launchable with ‘windmill firefox URL’. This will open firefox, albeit without any of the customizations that you might have set up in your profile. It also opens a smaller Windmill IDE browser window, and starts a Windmill “controller service”, which exposes the API for manipulating the browser window.

Capturing Tests

The IDE is pretty simple. It’s implemented in HTML and Javascript, and it provides 4 primary features: a test recorder, a test runner, a DOM explorer, and an “Assert” explorer. The test recorder is what I’m first interested in. I click the record button, and the browser window jumps to the foreground. So I start clicking around, entering text and submitting forms, and lo and behold, I can see what I’m doing being captured in the IDE. Good start.

To start, I make an incorrect login attempt, and, upon failure, try to create a new account, getting all the way to the “check your email” confirmation screen. I turn off the recorder, and click on the ‘save’ link in the IDE window. Windmill can export the test suites as either python or JSON. I’ve chosen python, so I get another window with the following text:

# Generated by the windmill services transformer
from windmill.authoring import WindmillTestClient

def test():
    client = WindmillTestClient(__name__)

    client.click(link=u'Sign in')
    client.type(text=u'bogus', id=u'__ac_name')
    client.type(text=u'bobobobo', id=u'__ac_password')
    client.click(name=u'login')
    client.click(link=u'Create account')
    client.type(text=u'testuser1', id=u'id')
    client.type(text=u'Test UserOne', id=u'fullname')
    client.type(text=u'test1@example.com', id=u'email')
    client.type(text=u'testy', id=u'password')
    client.type(text=u'testy', id=u'confirm_password')
    client.click(name=u'task|join')

I paste this into an emacs buffer and save it as windmill_tests.py. Then I shut down the windmill process (I have to ‘kill’ it to make sure everything terminates correctly :P) and pass in the test I just created with ‘windmill firefox URL test=windmill_tests.py’. Sure enough, the browser launches, hits the site, and starts following links and entering text! Initial success makes much happy.

Controller API

The ‘client’ variable in the python code above is a handle to a WindmillTestClient object, which exposes the Windmill Controller API. The controller is what is driving the browser. One fun trick is that you can put a pdb in your test code and you’ll get an interactive prompt which you can use to control the live browser window. Now enjoy the power of a python command line interface to teh interwebs, while still being able to access all the rich multimedia that today’s users demand! It’s definitely more fun than it should be to fill out web forms and click on links by typing in ‘client.type’ and ‘client.click’ commands.

Assertions

Astute readers will have noticed that the second run of the user creation test would have a different result than the first. Indeed, when I used Windmill to run the test steps I’d captured, it ended with an error message informing me that the username was already taken. Real tests of course will want to make assertions, to ensure that the behaviour is really what you want. Enter the Assertion Explorer. At any point, you can click on the Assertion Explorer button, and then click somewhere on the page, and the IDE will generate a best-guess assertion for you for the element that you selected.

The generated assertions are sometimes spot-on. Whenever I clicked on a portal status message, for instance, it asserts (using XPath) that the PSM element exists on the page, and that it contains the specified text. When I click on regular page text, however, it only checks for the existence of the clicked node, when I want it to check the text as well.

The asserts are a part of the Controller API, and are pretty easy to understand. Here’s an example of a couple that I generated:

    client.asserts.assertNode(xpath=u'/html/body/div/div/div/div/div')
    client.asserts.assertText(xpath=u'/html/body/div/div/div/div/div', validator=u'Welcome! You have signed in.')

    client.asserts.assertNode(xpath=u'/html/body/div/div/div/div/div')
    client.asserts.assertText(xpath=u'/html/body/div/div/div/div/div', validator=u'Your changes have been saved.')
    client.asserts.assertNode(link=u'something else+')

Nothing particularly mysterious there.

The Warts

Despite the early successes, it wasn’t long before I hit a couple bumps. Here’s an overview of the issues that came up for me:

  • Hard to extract info from the page

    The Controller API works pretty well for controlling the page and making assertions, but there don’t seem to be very good hooks for extracting information from the page itself into the python environment. For instance, in order to complete the user registration, the test will need to log in as an admin and hit a special page which will return the user’s confirmation key. This will need to be extracted from the response and then used as part of the URL of a subsequent request. Even a simpler case, such as simply extracting the URL of the page that the browser is currently visiting, doesn’t come clear to me from a thorough comb of the documentation.
  • The ‘waits’ stuff seems to be broken ATM

    The Controller API has a whole selection of ‘waits’ methods, which tell the test runner not to move on until some criterion is met: a specific amount of time has elapsed, the page has loaded, an element has shown up on the page, etc. For me, running FF2.0.0.14 on Ubuntu Gutsy, these were completely broken. Any time I try to use one of these calls, the test suite just stops right there. This is a show-stopper, since I very quickly hit false failures due to the tests running more quickly than the browser was responding.
  • Xinha typing didn’t get picked up by the recorder

    Most of the actions I performed in the browser were dutifully recorded by the IDE. The only exception to this is any text I would enter into the Xinha editor. Whenever I would edit a wiki page, the recorder would skip directly from clicking the ‘edit’ link to clicking on the ‘save’ button. I think we can work around this by just adding these commands by hand to the generated python. I’m not 100% on this, though.
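For readers unfamiliar with the idea behind the ‘waits’ methods in the second item above, the general pattern is a polling loop with a timeout. The sketch below is generic Python and not Windmill's implementation; wait_for, WaitTimeout and page_loaded are invented names, and the fake "page" simply reports ready on the third poll.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition does not become true in time."""


def wait_for(condition, timeout=10.0, interval=0.1):
    """Poll condition() until it returns a truthy value or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %.1f seconds" % timeout)
        time.sleep(interval)


# Example: a fake page that finishes loading on the third poll.
polls = []


def page_loaded():
    polls.append(1)
    return len(polls) >= 3


wait_for(page_loaded, timeout=1.0, interval=0.01)
print("polled", len(polls), "times")
```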

Python is Nice

As I’ve mentioned, it’s possible to export the generated tests and assertions as either JSON or python code. Having such a straightforward way to drive the tests from pure python is a big plus for those of us who like that sort of thing. We can use try:finally to make sure clean-up code gets hit, and httplib2 to talk to the server to actually perform the clean-up. Control is easy in python. A quick peek at the code indicates that it does pass unrecognized options in to the test suites themselves, which means that we can code up test suites that understand additional control parameters as we need.
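The try/finally shape I have in mind looks something like this. To keep the sketch runnable without a browser, RecordingClient is a hypothetical stand-in for WindmillTestClient that just records calls, and delete_test_user is a placeholder for whatever HTTP request would really remove the account; neither is part of Windmill.

```python
class RecordingClient:
    """Hypothetical stand-in for WindmillTestClient; records calls only."""

    def __init__(self):
        self.calls = []

    def click(self, **locator):
        self.calls.append(("click", locator))

    def type(self, **fields):
        self.calls.append(("type", fields))


def delete_test_user(username, log):
    # Placeholder: a real suite would talk to the server here.
    log.append(("cleanup", username))


def run_signup_test(client, log, fail=False):
    try:
        client.click(link="Create account")
        client.type(text="testuser1", id="id")
        if fail:
            raise AssertionError("simulated failed assertion")
    finally:
        # Runs whether the steps above passed or raised.
        delete_test_user("testuser1", log)


log = []
run_signup_test(RecordingClient(), log)
print(log)
```

With the clean-up in place, a second run no longer trips over the "username already taken" error from the first.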

The Good News

While I did stumble on a couple of issues, I am very happy to report that the folks in the #windmill channel on freenode are very helpful. Even better than that, they’re very happy to get my bug reports, and are very responsive about fixing them. I spent about a day playing with Windmill a few months back. In that time I uncovered one bug and one usability weirdness. They were both fixed within days. The issues I’ve raised this time are already recorded as issues in their tracker, and I’m told that they should be resolved by the end of the week.

Conclusions

Takeaways? I’d say “cautiously wildly enthusiastic” best describes how I feel. Windmill is very close to delivering on its promise, making it ridiculously easy to generate robust, JavaScript-supporting test suites that can be trivially run on IE, Safari, and Firefox. If the pattern of developer responsiveness continues, and the issues that come up are either easily worked around or are resolved within Windmill itself, then I think this really hits the sweet spot. It didn’t take long for me to hit a couple of pretty big issues, however. If it turns out that there are a lot of similar bombs in there, that the problems stack up faster than the developers can deal with them, then it would probably end up being a headache for us.

I don’t really expect this, however. I know OSAF is using Windmill internally, and the developers that I’ve had contact with seem very enthusiastic about getting my reports and getting the problems resolved. I’m going to continue my exploration, and I hope to generate a thorough set of Windmill tests for the Plone 3-based OpenPlans stack over the next week.

Filed June 25th, 2008 under Uncategorized