Seems to be a theme in my reading this week: Dumb ideas are often the best ideas.
Friday, Rolando posted a link to Paul Graham’s recent “Six Principles of New Things”.
“… this is practically a recipe for generating a contemptuous initial reaction. Though simple solutions are better, they don’t seem as impressive as complex ones. Overlooked problems are by definition problems that most people think don’t matter. Delivering solutions in an informal way means that instead of judging something by the way it’s presented, people have to actually understand it, which is more work. And starting with a crude version 1 means your initial effort is always small and incomplete.”
Joel Spolsky probably read it too, because he just posted an article about some notable successes that he initially greeted with “That will never work”.
I’m also reminded of Farnsworth’s breakthrough with the invention of television. Several competing labs were trying to invent a working television; Farnsworth had a small shop with minimal funding. One of the critical hurdles was a seemingly impossible bit of glasswork for the camera tube. Farnsworth’s self-taught brother-in-law eventually succeeded at this task in part because he didn’t know any better and tried something that shouldn’t have worked. (At least that’s how I remember the version of the story told in Sorkin’s The Farnsworth Invention.)
All of which raises the semi-kidding question: are we doing something impossible enough?
I’ve been looking at externalator from the perspective of a tough-but-fair parent who hasn’t seen his child in some time. I realized that if we had asked of externalator what we should have, it would have been much more robust and easier to maintain. Not that it’s horrible, but it’s suffered from… well, not quite scope creep, but up-front design creep.
When externalator was conceived, we decided that one of its jobs was to create a bundle. What is a bundle? Now we’d almost definitely say “a bundle is an svn location with svn:externals set”. This isn’t what we said at the time. We said it had to have svn:externals as well as a description file and maybe some other tag that said it was an externalator bundle. This seemed to make sense at the time, as we were having trouble keeping track of bundles. Now, I say WTF? We just limited the places that externalator can operate and increased the initial work to get it running. “Are you sure you want to move left?”
So ‘externalator import’ can go. description files … a good idea, but not externalator’s job.
At the time, we had some cvs externals and .tgz externals. These *seemed* to live with the bundle. We had some annoying scripts that fetched them at bundle-level, after all. So… externalator has CVS and TGZ externals. Which is kinda cool. But it wasn’t done in a unified way. And when all was said and done, no one used the different kinds of externals. So we’ve paid for a lot of additional complexity for a cool but half-fleshed-out feature that isn’t used and therefore isn’t maintained.
externalator could have been a very simple program. It still can. But I guess my warning is against trying to design software that does multiple things. Do one thing. Do it well. Do another thing. The Swiss Army knife approach went out in the 80s (if not before).
There have been complaints (valid complaints) about difficulties dealing with our stack, and debugging errors in the system. The response I would offer is that we need to debug the debuggability of these systems. We could try to change architecture and keep our code more monolithic so everything shows up in a single stack trace, but I see that as a dangerous kind of overreaction; a panicky response where we overestimate the problems we know about and underestimate the problems we don’t know about.
If it is difficult to deal with the stack, then we should fix those problems. If it’s too hard to debug, that is itself a bug.
Here are some general things I think we could do to improve our system:
- More ubiquitous use of logging (with logging).
- Aggregate log viewers, so we can see all the messages that are generated by any part of the system. Because there are sometimes side effects in other parts of our stack outside of the application you are interacting with (especially with the introduction of Cabochon) we need to see everything that is going on.
- Soft failures, well logged. Also more warnings. Everyone is tired of all the warnings that come out of Zope, but that’s just because it’s not our code that is causing the warnings.
- More assertions in code. Exceptions should be raised as soon as possible. Type assertions are okay, especially for things like strings and numbers where there aren’t generally other types you could usefully use to replace them. Generally immutable objects without substantial behavior, like strings, integers, in some contexts lists, tuples, and dicts, do not benefit from polymorphism and sometimes potential polymorphism is just a bug waiting to happen. (We shouldn’t put in checks everywhere, but if a bug occurs we should put in a check even if the actual bug is elsewhere in the system.)
- Error messages should be helpful and include sufficient information to understand the error. Bad or unhelpful error messages are a bug. So if you ever get an exception from some code like “assert x is None”, you should change it to “assert x is None, ('x should be None (not %r)' % (x,))”.
- Centralize error reporting, so that errors in any part of the system are consistently presented.
- Up-front checking of configuration parameters for general sanity. Any configuration error that we make is something we should check for. It doesn’t matter how stupid or misinformed the configuration mistake seems, if someone made a mistake we should add a check for that mistake.
- Greater use of code analysis tools (like PyChecker, pyflakes, jslint) and putting together the scripts to make this checking easy (perhaps part of the fassembler builds). These can’t find lots of bugs, but they will usually find some bugs. And of course every valid bug found is one less bug.
- Make more use of documentation tools, and document the core parts of our system that people need to understand. Apydia looks promising, though Epydoc would be fine too (and it’s older and more mature).
- When a bug is hard to debug, make a ticket for it. For example, does fassembler fail in a weird way? Even if you figure out why and fix it (e.g., some weird system dependency), ticket it up anyway. It will probably always fail in that situation, but it should fail in an unweird way.
- When you are debugging code, try to put in debugging code that you can leave in. For instance, use well-structured logging messages instead of print statements or pdb.
- Make sure all objects have useful __repr__ methods. Include a little extra code in __repr__ to show the interesting structure of the object, so that the data doesn’t become overwhelming (e.g., sometimes leaving out attribute values if the value is None).
- If you encounter an error like an attribute being set to the wrong kind of value, consider replacing it with a property with assertions. Automated testing is nice, but runtime checks with simple tests that exercise functionality are better. Runtime checks happen during simple exercise tests, and also in actual production code, and potentially during other tests that you might not have realized exercise that code.
- When writing new code avoid being needlessly permissive. For instance, if you are retrieving a value from a dictionary you can use a_dict[key] or a_dict.get(key). If you aren’t sure which one to use, use a_dict[key] (that’s the more attractive syntax for a reason). Of course if there’s actually a reason to use a_dict.get(key) then use that.
- If you ever put in a hack, always always always put in a comment describing why it’s there and what its purpose is. Later people should know that there’s something fishy there, so they can think harder while reading that particular code.
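A few of the items above (helpful assertion messages, a property with an assertion, a useful __repr__, and strict dictionary lookups) can be sketched together. This is only a rough illustration, not code from our stack: the Bundle class, its attributes, and lookup_url are all made-up names, and the sketch assumes Python 3.

```python
class Bundle(object):
    """Hypothetical example object; every name here is illustrative only."""

    def __init__(self, name, url, description=None):
        self.name = name  # goes through the property setter below
        self.url = url
        self.description = description

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # Runtime check: fail at the point of the bad assignment,
        # and say what the bad value actually was.
        assert isinstance(value, str), (
            'name should be a string (not %r)' % (value,))
        self._name = value

    def __repr__(self):
        # Show the interesting attributes, leaving out ones that are None
        # so the output doesn't become overwhelming.
        parts = ['name=%r' % (self.name,), 'url=%r' % (self.url,)]
        if self.description is not None:
            parts.append('description=%r' % (self.description,))
        return '<%s %s>' % (self.__class__.__name__, ' '.join(parts))


def lookup_url(bundles, name):
    # Strict lookup: a missing key raises KeyError right here, instead of
    # bundles.get(name) returning None and failing somewhere else later.
    return bundles[name].url
```

With this in place, Bundle(42, 'x') fails immediately with a message that names the offending value, rather than leaving a bad attribute around to cause trouble elsewhere.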
If we make these improvements in the stack and adhere to these rules while writing and maintaining code I think we’ll all have a much more fulfilling development experience.
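As one more rough sketch, here is what the up-front configuration checking and the "soft failures, well logged" items from the list might look like using the standard logging module. The setting names (base_url, port), the logger name, and check_config itself are assumptions made up for illustration, not anything that exists in our stack.

```python
import logging

logger = logging.getLogger('stack.config')  # hypothetical logger name

REQUIRED_KEYS = ('base_url', 'port')  # hypothetical settings


def check_config(config):
    """Return config unchanged, or raise ValueError listing every problem."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in config:
            problems.append('missing required setting %r' % (key,))
    # .get() is deliberate here: a missing key was already reported above.
    port = config.get('port')
    if port is not None and not isinstance(port, int):
        problems.append('port should be an integer (not %r)' % (port,))
    base_url = config.get('base_url')
    if isinstance(base_url, str) and base_url.endswith('/'):
        # Soft failure: suspicious but workable, so warn and carry on.
        logger.warning('base_url %r has a trailing slash', base_url)
    if problems:
        # Hard failure: report all problems at once, with the bad values.
        raise ValueError('bad configuration: ' + '; '.join(problems))
    return config
```

The idea is that every configuration mistake someone actually makes earns its own check here, so the next person gets a clear message at startup instead of a strange failure later.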